charm AT lists.siebelschool.illinois.edu
Subject: Charm++ parallel programming system
List archive
- From: "Sangamesh B" <forum.san AT gmail.com>
- To: "Charm ML" <charm AT cs.uiuc.edu>
- Cc: Charm++ M L <ppl AT cs.uiuc.edu>
- Subject: [charm] charm++ + namd: fail to run 128 core job
- Date: Wed, 13 Aug 2008 12:33:31 +0530
- List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
- List-id: CHARM parallel programming system <charm.cs.uiuc.edu>
Hi all,
I've built Charm++ version 6.0 with the MVAPICH2 library and Intel compilers
on a Rocks 4.3 cluster of 33 nodes (dual-processor, quad-core Intel Xeon:
264 cores total).
NAMD is also built as Linux-mvapich2.
The scaling is good from 8 to 16, 16 to 32, and 32 to 64 cores. But when a
128-core job is submitted, it fails.
# mpirun -machinefile ./machfile -np 128 \
    /data/apps/namd26_mvapich2/Linux-mvapich2/namd2 ./apoa1.namd | tee namd_128cores
Charm++> Running on MPI version: 2.0 multi-thread support: 0/0
rank 65 in job 4 master_host_name_50238 caused collective abort of all
ranks
exit status of rank 65: killed by signal 9
The input file is the standard benchmark available on the NAMD website,
i.e. apoa1.tar.gz. According to the benchmark results posted there, it
runs and scales up to 256 processors. But in my case it does not even run
on 128 cores.
Other applications such as Amber 9 and GROMACS do run on up to 256
processors, so there is no problem with MVAPICH2 itself.
So, what went wrong?
Thanks,
Sangamesh
- [charm] charm++ + namd: fail to run 128 core job, Sangamesh B, 08/13/2008