- From: Phil Miller <mille121 AT illinois.edu>
- To: Robert Steinke <rsteinke AT uwyo.edu>
- Cc: Charm Mailing List <charm AT cs.illinois.edu>
- Subject: Re: [charm] Program hang when using load balancing and lots of PEs
- Date: Tue, 27 Jan 2015 16:14:34 -0600
- List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
- List-id: CHARM parallel programming system <charm.cs.uiuc.edu>
The first thing to try would be running with the option "+LBDebug 3" to get some visibility into what's happening in the LB infrastructure. Could you send us output from such a run?
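For illustration, a run with that option might be launched as sketched below. This assumes the job is started through charmrun; the binary name ./my_app and the log file name are placeholders, not taken from the original report. Only "+LBDebug 3" is added, and any arguments already in use stay as they are. With a different launcher, the option would simply be appended to the program's own arguments.

```shell
#!/bin/sh
# Placeholder values: adjust to the actual job.
APP=./my_app        # hypothetical binary name
PES=1024            # PE count at which the hang was reported
LOG=lbdebug.log     # hypothetical file for the captured output

# Same launch line as before, with the LB debug option appended.
CMD="./charmrun +p$PES $APP +LBDebug 3"
echo "Would run: $CMD"

# Launch only if charmrun is present here, saving stdout and stderr
# so the output can be sent to the list.
if [ -x ./charmrun ]; then
    $CMD > "$LOG" 2>&1
fi
```

Capturing both stdout and stderr in one file keeps the load balancer's debug messages in order with the application's own output, which makes it easier to see the last LB step reached before the hang.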
Also, how many objects are you running with across the whole job?
On Tue, Jan 27, 2015 at 3:51 PM, Robert Steinke <rsteinke AT uwyo.edu> wrote:
I have a program that hangs when I run on lots of PEs and use the load balancer (I'm using MetisLB). If I run on 512 or fewer processors it is fine. If I try to run on 1024 processors it hangs shortly after I call CkStartLB (I'm using TurnManualLBOn). Also, if I don't call CkStartLB(); it runs fine on 1024 processors.
Is this a problem that someone else has encountered before?
Is this something that I should try to dig into, or is there someone more familiar with the load balancer than I am who is willing to look into it? In that case, I will apply my effort to creating a minimal test case that reproduces the problem.
Thanks
Bob Steinke