charm AT lists.siebelschool.illinois.edu
Subject: Charm++ parallel programming system
[charm] Running NAMD 2.9 with CUDA stalls when built on CharmArch gemini_gni-crayxe-persistent-smp with largepages
- From: "Lai, Jonathan" <jlai7 AT illinois.edu>
- To: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
- Subject: [charm] Running NAMD 2.9 with CUDA stalls when built on CharmArch gemini_gni-crayxe-persistent-smp with largepages
- Date: Thu, 6 Sep 2012 17:22:24 +0000
- Accept-language: en-US
- List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
- List-id: CHARM parallel programming system <charm.cs.uiuc.edu>
Dear PPL,
I am trying to run NAMD 2.9 with CUDA on a Cray-XE machine (titan Dev). On 5 nodes the calculation completes. When I increase the job to 10 nodes, however, the calculation stalls partway through without any error message. I do not know whether this is a problem with NAMD or with charm++, which is why I am emailing.
I built charm++ using the following steps:
1) module load craype-hugepages8M
2) Change #define LARGEPAGE 0 to #define LARGEPAGE 1
3) ./build charm++ gemini_gni-crayxe smp persistent --no-build-shared --with-production
Running NAMD 2.9 without CUDA against this charm++ build does not produce the problem described above. The stall occurs only with the CUDA version of NAMD, and I have not encountered it with LARGEPAGE turned off. Any thoughts?
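For anyone trying to reproduce this, the three build steps above can be sketched as a small shell script. This is only a sketch under stated assumptions: the message does not say which file contains the LARGEPAGE define, so the script takes that path as an argument, and the function names `enable_largepage` and `build_with_largepages` are made up for illustration. It also assumes the `module` command is available in the shell that runs it and that it is started from the charm++ source root.

```shell
#!/bin/sh
# Sketch of the three build steps from the message.
# ASSUMPTION: the file holding "#define LARGEPAGE" is not named in the
# message, so it is passed in as an argument. Function names are hypothetical.

# Step 2: change "#define LARGEPAGE 0" to "#define LARGEPAGE 1" in a file.
enable_largepage() {
    lp_header="$1"
    if [ ! -f "$lp_header" ]; then
        echo "no such file: $lp_header" >&2
        return 1
    fi
    lp_tmp="$lp_header.tmp.$$"
    # Only touch lines that define LARGEPAGE and end in a bare 0.
    # Writing to a temp file avoids relying on the non-portable "sed -i".
    sed '/^#define[[:space:]][[:space:]]*LARGEPAGE[[:space:]]/s/[[:space:]]0$/ 1/' \
        "$lp_header" > "$lp_tmp" && mv "$lp_tmp" "$lp_header"
}

# Steps 1-3 in the order given in the message.
build_with_largepages() {
    # Step 1: load the 8M hugepages module.
    module load craype-hugepages8M || return 1
    # Step 2: flip the define in the caller-supplied file.
    enable_largepage "$1" || return 1
    # Step 3: build command exactly as quoted in the message.
    ./build charm++ gemini_gni-crayxe smp persistent --no-build-shared --with-production
}
```

After sourcing the script, the build would be started with `build_with_largepages <header>`, where `<header>` is whichever file in the source tree holds the LARGEPAGE define. A define with a trailing comment on the same line is left unchanged by the `sed` pattern and would need editing by hand.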
Cheers,
Jonathan Lai
jlai7 AT illinois.edu