charm AT lists.siebelschool.illinois.edu
Subject: Charm++ parallel programming system
- From: Abhishek Gupta <gupta59 AT illinois.edu>
- To: Michel Espinoza-Fonseca <mef AT ddt.biochem.umn.edu>
- Cc: "charm AT cs.illinois.edu" <charm AT cs.illinois.edu>
- Subject: Re: [charm] [ppl] Unable to run charm++ on infiniband interface
- Date: Wed, 8 Aug 2012 11:13:08 -0700
- List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
- List-id: CHARM parallel programming system <charm.cs.uiuc.edu>
Can you try running with ++scalable-start as an additional argument to charmrun?
Thanks,
Abhishek
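As a sketch of the suggestion above, the command line from the original message (quoted below) could be extended with ++scalable-start as follows. The position of the flag among the other ++ options is an assumption, and the snippet only assembles and prints the command as a dry run rather than launching anything; the variable names are illustrative.

```shell
#!/bin/sh
# Dry run: build the charmrun command line from the original message with
# ++scalable-start added, and print it instead of executing it.
# NPROCS and NODELIST are illustrative variable names, not charmrun settings.
NPROCS=1400
NODELIST=namd.hostfile

CMD="charmrun ++remote-shell ssh ++p $NPROCS ++verbose ++scalable-start ++nodelist $NODELIST namd2 my_job.in"

echo "$CMD"
```

Once the printed line looks right, it can be pasted into the job script in place of the original charmrun invocation.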
Hi,

Recently I tried to run NAMD using charm++ (charmrun) with infiniband support (ibverbs) on our HP Linux cluster running CentOS. I tested both precompiled and my own compiled versions of charmrun. I normally submit the jobs using the following command line:

charmrun ++remote-shell ssh ++p 1400 ++verbose ++nodelist \
    namd.hostfile namd2 my_job.in

The problem appears shortly after the job starts and normally ends with charmrun terminating (i.e., NAMD does not even start). Most of the time I get the following error:

Charmrun> charmrun started...
Charmrun> using namd.hostfile as nodesfile
Charmrun> remote shell (node0004:0) started
Charmrun> remote shell (node0010:1) started
Charmrun> remote shell (node0020:2) started
...
ERROR> starting rsh: Resource temporarily unavailable
ssh_keysign: fork: Resource temporarily unavailable
ssh_keysign: fork: Resource temporarily unavailable
key_sign failed
ssh_keysign: fork: Resource temporarily unavailable
key_sign failed
ssh_keysign: fork: Resource temporarily unavailable
...
key_sign failed
Permission denied (publickey,keyboard-interactive,hostbased)
This is a recurring error which still appears after adding "CONV_RSH=ssh" to the PBS file or changing user limits (i.e., ulimit -u). I probably got it running only once or twice (out of tens of attempts). Interestingly, I also tried the SMP build which seems to work fine when "++ppn" is added to the command line, although NAMD scales poorly compared to the ibverbs build.
My question is whether the problem could be related to the configuration of our system, or whether I'm missing something that prevents charmrun from starting properly.

Thanks,
Michel
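The "fork: Resource temporarily unavailable" lines in the log are the standard error text for EAGAIN, which fork commonly returns when a per-user process limit is reached. The log also shows one remote shell per rank, so with ++p 1400 the launching node may need room for a large number of ssh processes. A minimal sketch for comparing the limit with the requested process count is given below. It assumes the shell's ulimit builtin accepts -u (bash does), and that it is run in the same environment as charmrun (for example inside the PBS job), since batch and login shells may have different limits.

```shell
#!/bin/sh
# Number of processes requested in the original command line (++p 1400).
requested=1400

# Per-user process limit seen by this shell. Assumption: the ulimit builtin
# accepts -u (bash does); if not, the value is left empty.
limit=$(ulimit -u 2>/dev/null || true)

case "$limit" in
  ''|*[!0-9]*)
    # "unlimited", empty, or any other non-numeric value
    verdict="cannot compare (limit is '${limit:-unknown}')"
    ;;
  *)
    if [ "$limit" -lt "$requested" ]; then
      verdict="limit $limit is below the $requested requested processes"
    else
      verdict="limit $limit is at or above the $requested requested processes"
    fi
    ;;
esac

echo "$verdict"
```

A limit below the requested count would be consistent with the fork failures, though it does not by itself prove that the limit is the cause.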