- From: Edgar Solomonik <solomon AT eecs.berkeley.edu>
- To: PPL <ppl AT cs.uiuc.edu>, charm AT cs.illinois.edu
- Subject: [charm] exploiting multi-link bandwidth on Blue Gene
- Date: Thu, 29 Dec 2011 14:38:41 -0800
- List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
- List-id: CHARM parallel programming system <charm.cs.uiuc.edu>
Hello,
I've been working on an algorithm that tries to exploit the bandwidth of every link on a torus network. Basically, each node sends data to neighbors in each dimension of the torus, rather than a single dimension. The target is to achieve injection bandwidth rather than link bandwidth on torus networks (e.g. on BG/P injection bandwidth is 6x link bandwidth, and on BG/Q injection bandwidth is 10x link bandwidth).
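As a minimal sketch of the idea above (not the actual implementation), the following C++ computes the six nearest neighbors of a node on a 3D torus and splits one payload evenly across them, so that each link carries roughly one sixth of the data. The names `Coord`, `torusNeighbors`, and `splitEvenly` are hypothetical helpers made up for illustration; they are not part of Charm++ or MPI, and the 3D shape is assumed to match BG/P.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: (x, y, z) coordinates of a node on a 3D torus.
using Coord = std::array<int, 3>;

// Return the nearest neighbors of `me` in the +/- direction of each
// dimension, with wraparound. Order: +X, -X, +Y, -Y, +Z, -Z.
// Note: if an extent is 2, the + and - neighbors coincide; if it is 1,
// the "neighbor" is the node itself.
std::vector<Coord> torusNeighbors(const Coord& me, const Coord& dims) {
    std::vector<Coord> out;
    for (int d = 0; d < 3; ++d) {
        assert(dims[d] > 0);
        for (int step : {+1, -1}) {
            Coord n = me;
            n[d] = (me[d] + step + dims[d]) % dims[d];  // wrap around the torus
            out.push_back(n);
        }
    }
    return out;
}

// Split `totalBytes` into `parts` chunk sizes that differ by at most one
// byte, so each outgoing link is loaded about equally.
std::vector<std::size_t> splitEvenly(std::size_t totalBytes, std::size_t parts) {
    assert(parts > 0);
    std::vector<std::size_t> sizes(parts, totalBytes / parts);
    for (std::size_t i = 0; i < totalBytes % parts; ++i) {
        ++sizes[i];  // hand out the remainder one byte at a time
    }
    return sizes;
}
```

A caller would pair `torusNeighbors(me, dims)[i]` with `splitEvenly(bytes, 6)[i]` and issue one send per pair; whether those sends actually proceed concurrently on separate links depends on the messaging layer.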
I have been able to employ this idea to get a significant performance improvement on BG/P for an implementation of matrix multiplication that uses MPI_Put. I also wanted virtualization, so I implemented the same algorithm in Charm++. The topology-aware mapping works and the Charm++ version performs almost as well as the original MPI version. However, I've been unable to get the Charm++ version to saturate multiple links on BG/P. I even tried running in virtual node mode on BG/P and having chares on different processes within the same node send messages along different torus directions.
To summarise, I want a chare to send multiple simultaneous messages to chares located on torus neighbors in different directions. Can I use CkDirect or some other technique to achieve this in Charm++? Basically, I need an asynchronous (one-sided) send implementation on BG/P.
Thanks,
Edgar