charm AT lists.siebelschool.illinois.edu
Subject: Charm++ parallel programming system
- From: Eric Bohm <ebohm AT illinois.edu>
- To: <charm AT cs.uiuc.edu>
- Subject: Re: [charm] Debugging Race Conditions
- Date: Tue, 19 Aug 2014 09:46:30 -0500
- List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
- List-id: CHARM parallel programming system <charm.cs.uiuc.edu>
Race conditions are one of the most difficult kinds of bug to unravel.

The record/replay feature is designed to help under these conditions. A run with +record will create files which record the exact order of events. A run with +replay will replay execution from the event log created by the +record feature. That way one can issue multiple runs with +record until the sought-after condition occurs, and then +replay the target within a debugger.

Can you get ++debug to work for any charm program? When applicable, it does rely on ssh and X forwarding working correctly, so sometimes this issue can be resolved by adding this line to your .ssh/config:

ForwardX11 yes

or by adding -X to the ssh command line in the nodelist file.

On 08/18/2014 02:51 PM, Robert Bird wrote:

Hey all
I've got a (rare) race condition, whereby a charm element is inserted twice (according to the error in the stack trace when Charm aborts). I can only get this to happen in parallel, with random message queues, so I'm having a hard time tracking it down.

Is there an obvious way to debug race conditions such as this? I've tried to use ++debug in order to get reliable access to the trace in gdb, but it doesn't seem to launch quite as expected. I get debug prints about the threads at the start and the program runs, but no xterm window appears, nor does it wait for my input to start. As far as I can tell I meet all the requirements: I can spawn X windows, $DISPLAY is set, and xterm is in my path.

Any obvious pointers/hints? Especially about a general method for tracking down race conditions.

Thanks
Bob

NB: During a quick chat with Phil Miller he mentioned +record. Does this allow me to record it in parallel, then replay on a serial gdb?
--
Robert Bird
http://go.warwick.ac.uk/robertbird
+44 (0)24 7652 2863
CS202, High Performance Lab
Department of Computer Science
University of Warwick

_______________________________________________
charm mailing list
charm AT cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/charm
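
The "issue multiple runs with +record until the sought-after condition occurs" step from the reply can be scripted. The following is only a sketch under stated assumptions: the program name ./myprog, the +p4 processor count, and the function name record_until_failure are placeholders that do not come from the thread, and it assumes that the abort shows up as a non-zero exit status and that each +record run overwrites the logs of the previous run.

```shell
#!/bin/sh
# Sketch: repeat a +record run until one fails, so that the event logs left
# on disk belong to the failing run.
# ASSUMPTIONS (not from the thread): ./myprog and +p4 are placeholders, the
# abort is visible as a non-zero exit status, and each +record run overwrites
# the previous run's logs.
RUN_CMD=${RUN_CMD:-"./charmrun +p4 ./myprog +record"}
MAX_TRIES=${MAX_TRIES:-100}

record_until_failure() {
    i=1
    while [ "$i" -le "$MAX_TRIES" ]; do
        # RUN_CMD is left unquoted on purpose so it splits into command + args.
        if ! $RUN_CMD; then
            echo "run $i failed; record logs on disk are from this run"
            return 0
        fi
        i=$((i + 1))
    done
    echo "no failure in $MAX_TRIES runs"
    return 1
}
```

After sourcing the script, calling record_until_failure starts the loop. Once it reports a failing run, the same command line with +replay in place of +record (optionally with ++debug, if that works on your system) would replay that run under a debugger. Note that a missing charmrun binary also counts as a failure here, so it is worth checking the first run by hand.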