- From: Eric Bohm <ebohm AT illinois.edu>
- To: Alexander Frolov <alexndr.frolov AT gmail.com>
- Cc: charm AT cs.uiuc.edu
- Subject: Re: [charm] Profiling and tuning charm++ applications
- Date: Thu, 23 Jul 2015 13:34:55 -0500
- List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
- List-id: CHARM parallel programming system <charm.cs.uiuc.edu>
Hi Alex,

On 07/23/2015 05:29 AM, Alexander Frolov wrote:
> Hello Eric,
>
> On Wed, Jul 22, 2015 at 7:50 PM, Eric Bohm <ebohm AT illinois.edu> wrote:
>> Hello Alex,
>> Charm++ applications can easily reach peak utilization. However, there are a number of factors which may be affecting your performance. The MPI target for Charm++ is one of the simplest to build, but it is unlikely to be the one that gives the best performance. For single-node scalability you will probably experience better performance using a different target. Try multicore-linux64.
>
> I am targeting scaling on an InfiniBand cluster; single-node SMP performance is not that interesting, which is why I am building with mpicc. By the way, does the Charm++ runtime support a combination of multicore and MPI?
Yes. If you add smp to the build line, you will have a version which allows multiple worker threads in a process along with a distinct communication thread. Typical usage would be to indicate the number of worker threads via the +ppn parameter. That number should be chosen to be one fewer than the number of execution threads available to the process, so that the communication thread can use the remaining resource. FYI: best performance on InfiniBand is usually obtained with the verbs-linux-x86_64-smp build.

>> It is difficult to diagnose your specific problem in the abstract; however, the most common cause of poor single-core utilization is overly fine granularity in the simulation decomposition. Experiencing a substantial drop from 1 to 2 cores suggests a load imbalance issue may also be present, but I recommend you examine compute granularity first. A modest increase in work per chare is likely to help. The Projections tool can be used to evaluate the current situation.
>
> Thank you for your suggestion. It is true that my application is very fine-grained. Unfortunately, even a modest increase in granularity requires reimplementing, and even rethinking, the algorithm. But I will try it anyway.
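For concreteness, the builds and a launch line would look roughly like this (a sketch only; the binary name "myapp" and the core counts are hypothetical, so adapt them to your application and node size):

    # single-node shared-memory build
    ./build charm++ multicore-linux64 --with-production

    # InfiniBand verbs build in SMP mode
    ./build charm++ verbs-linux-x86_64 smp --with-production

    # e.g. on a hypothetical 16-core node: 15 worker threads per process,
    # leaving one core free for the communication thread
    ./charmrun +p15 ./myapp +ppn 15 +setcpuaffinity

Linking the application with -tracemode projections will produce the event logs that the Projections tool reads.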
> What I do not understand is the low utilization of the CPU cores, which, as I see it, should not be connected to the Charm++ application (or even the runtime), but should depend only on the time the MPI processes have been running on the CPUs.
If you examine the time profile graph of your performance, it will distinguish between time spent in your entry methods (various colors) and time spent handling message packing/unpacking (black at top). The combination of the two is the overall utilization. If the messaging overhead is a substantial fraction of the total, then you have a granularity problem. If refactoring for coarser granularity is very difficult, you may wish to look into using the TRAM library (see the manual appendices), as it aggregates messages in a way that helps reduce the overhead of processing many tiny messages with tiny execution granularity.

Regarding process switching, you can force affinity by appending the +setcpuaffinity flag, and specifically choose bindings by using the +pemap L[-U[:S[.R]+O]] arguments. See section C.2.2 of the manual (http://charm.cs.illinois.edu/manuals/html/charm++/manual.html) for details.

> I am using mpirun (which is actually a script of the task manager). The custom task manager on the system I use does not support other ways of running applications.
>
> Thank you!
The + arguments are parsed by your application itself, as a consequence of building it with the Charm++ library; they are not parsed by mpirun. So CPU affinity can be set that way even when launching through mpirun.

On 07/22/2015 11:24 AM, Alexander Frolov wrote:
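For example, a launch through the site's mpirun wrapper might look like the following (a sketch only; the binary name, process count, and core numbering are hypothetical and should be adapted to your nodes):

    # the charm++ runtime flags ride along after the binary name;
    # mpirun ignores them and the application parses them at startup
    mpirun -np 4 ./myapp +ppn 15 +setcpuaffinity +pemap 0-14 +commap 15

Here +pemap 0-14 pins the worker threads to cores 0 through 14, and +commap 15 pins the communication thread to core 15.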
- [charm] Profiling and tuning charm++ applications, Alexander Frolov, 07/22/2015
- Re: [charm] Profiling and tuning charm++ applications, Eric Bohm, 07/22/2015
- Re: [charm] Profiling and tuning charm++ applications, Alexander Frolov, 07/23/2015
- Re: [charm] Profiling and tuning charm++ applications, Eric Bohm, 07/23/2015
- Re: [charm] Profiling and tuning charm++ applications, Alexander Frolov, 07/23/2015
- Re: [charm] Profiling and tuning charm++ applications, Kale, Laxmikant V, 07/23/2015
- Re: [charm] Profiling and tuning charm++ applications, Alexander Frolov, 07/23/2015
- Re: [charm] Profiling and tuning charm++ applications, Eric Bohm, 07/23/2015
- Re: [charm] Profiling and tuning charm++ applications, Alexander Frolov, 07/23/2015
- Re: [charm] Profiling and tuning charm++ applications, Eric Bohm, 07/22/2015