charm AT lists.siebelschool.illinois.edu
Subject: Charm++ parallel programming system
- From: Phil Miller <mille121 AT illinois.edu>
- To: Scott Field <sfield AT astro.cornell.edu>
- Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
- Subject: Re: [charm] memory management errors after ckexit called
- Date: Tue, 16 Jun 2015 15:25:45 -0500
- List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
- List-id: CHARM parallel programming system <charm.cs.uiuc.edu>
Hi Scott,
This list is definitely an appropriate place to post about potential bugs. Thanks for bringing it up.

On Tue, Jun 16, 2015 at 3:14 PM, Scott Field <sfield AT astro.cornell.edu> wrote:
Hi,

Recently, after pulling a bleeding-edge version of the charm++ code, all of our regression tests now fail with either a segmentation fault or "double free or corruption (!prev): 0x0000000001c4de20 ***". The error appears to occur after ckexit is called. Charm++ was built on my laptop with

>>> ./build charm++ multicore-linux32 gcc --with-production -j3 -std=c++11

Using git's bisect utility, I was able to track down the first commit where things go wrong. The git hash and commit message are c96750026bbc7a9190f1381e7ac9ea56ae86f80e and "Bug #695: disable comm thread in multicore builds". More specifically, if I edit line 200 of the file src/arch/util/machine-common-core.c from "#define CMK_SMP_NO_COMMTHD CMK_MULTICORE" to "#define CMK_SMP_NO_COMMTHD 0" the error message goes away and all tests pass again.

Honestly, I don't really know why this change fixed the problem -- it's pretty far under the hood.

A few questions:

1) Is this list an appropriate place to post information about potential bugs?

2) Does this seem to be a charm++ bug introduced by that commit? Or a fix which has simply broken our code?

I had a hard time tracking down the source of the error. Oddly enough, I could not reproduce the same error when using valgrind (although it did report an "Uninitialised value was created by a stack allocation", which it tracked to one of the declaration files created by charmc).

With MALLOC_CHECK_ set to 3 I get the following:

*** Error in `./Evolve1DScalarWave': free(): invalid pointer: 0x000000000203c920 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x7338f)[0x7f4cebc2e38f]
/lib/x86_64-linux-gnu/libc.so.6(+0x81fb6)[0x7f4cebc3cfb6]
/lib/x86_64-linux-gnu/libc.so.6(+0x3c280)[0x7f4cebbf7280]
/lib/x86_64-linux-gnu/libc.so.6(+0x3c2a5)[0x7f4cebbf72a5]
./Evolve1DScalarWave[0x670b4a]
./Evolve1DScalarWave[0x5e39ed]
./Evolve1DScalarWave(CsdScheduleForever+0x48)[0x673e88]
./Evolve1DScalarWave(CsdScheduler+0x2d)[0x67413d]
./Evolve1DScalarWave(_ZN12ElementChareI16ScalarWaveSystemILi1EEE11endTimeStepEv+0x448)[0x580d3c]
./Evolve1DScalarWave(_ZN12ElementChareI16ScalarWaveSystemILi1EEE13endComputeRhsEv+0x5331DScalarWave': free(): invalid pointer: 0x000000000203c920 ***

Best,
Scott