- From: Bilge Acun <acun2 AT illinois.edu>
- To: "Bohm, Eric J" <ebohm AT illinois.edu>
- Cc: "charm AT cs.uiuc.edu" <charm AT cs.uiuc.edu>
- Subject: Re: [charm] [ppl] Best way to run Charm++ apps on an Infiniband cluster
- Date: Tue, 17 Feb 2015 15:07:37 -0600
- List-archive: <http://lists.cs.uiuc.edu/pipermail/charm/>
- List-id: CHARM parallel programming system <charm.cs.uiuc.edu>
I don't think this is documented anywhere; I'll add it to the manual.
I can also extend the abort print statement to say that the QLOGIC macro needs to be enabled for Qlogic hardware, if that's helpful.
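For reference, a minimal sketch of that rebuild (assuming the AMPI target and the net-linux-x86_64 ibverbs layer used elsewhere in this thread; extra flags at the end of the ./build line are normally passed through to the compiler, so -DQLOGIC ends up defined for the Charm++ build):

$ # rebuild with the QLOGIC macro defined
$ ./build AMPI net-linux-x86_64 ibverbs -DQLOGIC
$ # rebuild and rerun the test program
$ cd net-linux-x86_64-ibverbs/tests/charm++/simplearrayhello
$ make
$ ./charmrun +p32 ./hello ++mpiexec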
On 17 February 2015 at 14:55, Bohm, Eric J <ebohm AT illinois.edu> wrote:
We have not found a reliable way to detect this at build time.
On 02/17/2015 02:40 PM, Jim Phillips wrote:
>
> Is this documented anywhere? Is there a way to detect this at runtime?
>
> Jim
>
>
> On Tue, 17 Feb 2015, Bilge Acun wrote:
>
>> Hi Jozef,
>>
>> For Qlogic hardware, the QLOGIC macro needs to be enabled when building
>> Charm++.
>> Can you try building Charm++ again with the -DQLOGIC option added?
>>
>> Thanks,
>>
>> --
>>
>> *Bilge Acun*
>> *PhD Candidate at University of Illinois at Urbana-Champaign*
>> *Computer Science Department*
>> On 17 February 2015 at 09:33, Jozsef Bakosi <jbakosi AT gmail.com> wrote:
>>
>>> Thanks, Jim and Abhinav, this helps. However, this is what I get after
>>> building Charm++ with "net-linux-x86_64 ibverbs" and trying to
>>> run simplearrayhello:
>>>
>>> $ ./charmrun +p32 ./hello ++mpiexec
>>> Charmrun> IBVERBS version of charmrun
>>> Charmrun> started all node programs in 2.129 seconds.
>>> ------------- Processor 0 Exiting: Called CmiAbort ------------
>>> Reason: Failed to change qp state to RTS: you may need some
>>> device-specific parameters in machine-ibevrbs
>>> ...
>>> (32x)
>>> ...
>>> [0] Stack Traceback:
>>> [0:0] CmiAbort+0x40 [0x54bac0]
>>> [0:1] initInfiOtherNodeData+0x168 [0x54bfd8]
>>> [0:2] ConverseInit+0xe8a [0x5569fa]
>>> [0:3] main+0x26 [0x4857e6]
>>> [0:4] __libc_start_main+0xfd [0x2abbb434cd5d]
>>> [0:5] [0x47ffd9]
>>> Fatal error on PE 0> Failed to change qp state to RTS: you may need
>>> some
>>> device-specific parameters in machine-ibevrbs
>>>
>>> And here is what I get after building with "net-linux-x86_64 ibverbs
>>> smp":
>>>
>>> $ ./charmrun +p32 ./hello ++mpiexec
>>> Charmrun> IBVERBS version of charmrun
>>> Charmrun> started all node programs in 0.856 seconds.
>>> Charmrun: error on request socket--
>>> Socket closed before recv.
>>>
>>> Any other clue as to what I'm still missing?
>>>
>>> Thanks,
>>> Jozsef
>>>
>>> On Mon, Feb 16, 2015 at 8:57 PM, Abhinav Bhatele <
>>> bhatele AT illinoisalumni.org> wrote:
>>>
>>>> Hi Jozsef,
>>>>
>>>> Please find some answers inline:
>>>>
>>>>
>>>> On Fri, Feb 13, 2015 at 8:19 AM, Jozsef Bakosi <jbakosi AT gmail.com>
>>>> wrote:
>>>>
>>>>> Hi folks,
>>>>>
>>>>> I'm wondering what is the best way to run Charm++ applications on
>>>>> clusters with Infiniband interconnects. So far I've been successfully
>>>>> building and running my app using Charm++, built by the following
>>>>> command,
>>>>> which uses MPI:
>>>>>
>>>>> ./build AMPI mpi-linux-x86_64 mpicxx
>>>>>
>>>>> But now I'm wondering if the "ibverbs" build option provides better
>>>>> performance on Infiniband clusters. We have Qlogic and Mellanox
>>>>> Infiniband
>>>>> Fat-Tree interconnects. To experiment with this, I have
>>>>> successfully built
>>>>> Charm++ using the following command:
>>>>>
>>>>> ./build AMPI net-linux-x86_64 ibverbs
>>>>>
>>>>> But when I try to
>>>>> run net-linux-x86_64-ibverbs/tests/charm++/simplearrayhello on two
>>>>> compute
>>>>> nodes, I get
>>>>>
>>>>> $ ./charmrun +p32 ./hello
>>>>> Charmrun> IBVERBS version of charmrun
>>>>> mcmd: connect failed: Connection refused (32x)
>>>>> Charmrun> Error 1 returned from rsh (localhost:0)
>>>>>
>>>>> So my questions are:
>>>>>
>>>>> 1. Can I expect better performance on Infiniband clusters using
>>>>> build
>>>>> options other than MPI?
>>>>>
>>>>
>>>> Yes, typically you would expect the ibverbs build to perform better
>>>> than the MPI build. You can try the four builds below:
>>>>
>>>> mpi-linux-x86_64 mpicxx
>>>> mpi-linux-x86_64 mpicxx smp
>>>>
>>>> net-linux-x86_64 ibverbs
>>>> net-linux-x86_64 ibverbs smp
>>>>
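Spelled out as full build commands, a sketch of the four combinations (the AMPI target is carried over from the command quoted earlier in this thread and is an assumption; substitute charm++ if AMPI is not needed):

$ ./build AMPI mpi-linux-x86_64 mpicxx
$ ./build AMPI mpi-linux-x86_64 mpicxx smp
$ ./build AMPI net-linux-x86_64 ibverbs
$ ./build AMPI net-linux-x86_64 ibverbs smp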
>>>>
>>>>> 2. Do I also have to contact our system admins to allow access to
>>>>> lower (than MPI) level software layers for the interconnect so
>>>>> Charm++ code
>>>>> (I assume ibverbs) can use it?
>>>>>
>>>>
>>>> No, as Jim pointed out, you can use ++mpiexec or manually specify
>>>> the
>>>> nodelist that has been allocated to you:
>>>> http://charm.cs.illinois.edu/manuals/html/charm++/C.html
>>>>
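As an illustration of the nodelist route (a sketch: the file name mynodes and the host names node01/node02 are placeholders; the group/host syntax follows the charmrun manual linked above):

$ cat mynodes
group main
  host node01
  host node02
$ # point charmrun at this file instead of the default ./nodelist
$ ./charmrun +p32 ./hello ++nodelist mynodes

Under a batch scheduler, ++mpiexec (as used later in this thread) instead has charmrun launch the node programs through the system's mpiexec rather than its own rsh/ssh mechanism.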
>>>>
>>>>> 3. Am I missing something else?
>>>>> 4. Are the best ways to build Charm++ for specific hardware
>>>>> documented
>>>>> somewhere?
>>>>>
>>>>
>>>> Hopefully, someone else will answer this but my guess is no.
>>>>
>>>>
>>>>>
>>>>> Thanks in advance, and please let me know if you need more
>>>>> information on the clusters.
>>>>> Jozsef
>>>>>
>>>>
>>>>
>>>> --
>>>> Abhinav Bhatele, people.llnl.gov/bhatele
>>>> Center for Applied Scientific Computing, Lawrence Livermore National
>>>> Laboratory
>>>>
>>>
>>>
>>
>>
>> --
>>
>> *Bilge Acun*
>> *PhD Candidate at University of Illinois at Urbana-Champaign*
>> *Computer Science Department*
>>
_______________________________________________
charm mailing list
charm AT cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/charm
Bilge Acun
PhD Candidate at University of Illinois at Urbana-Champaign
Computer Science Department
- [charm] Best way to run Charm++ apps on an Infiniband cluster, Jozsef Bakosi, 02/13/2015
- Re: [charm] [ppl] Best way to run Charm++ apps on an Infiniband cluster, Jim Phillips, 02/16/2015
- Re: [charm] [ppl] Best way to run Charm++ apps on an Infiniband cluster, Abhinav Bhatele, 02/16/2015
- Re: [charm] [ppl] Best way to run Charm++ apps on an Infiniband cluster, Jozsef Bakosi, 02/17/2015
- Re: [charm] [ppl] Best way to run Charm++ apps on an Infiniband cluster, Bilge Acun, 02/17/2015
- Re: [charm] [ppl] Best way to run Charm++ apps on an Infiniband cluster, Jim Phillips, 02/17/2015
- Re: [charm] [ppl] Best way to run Charm++ apps on an Infiniband cluster, Eric Bohm, 02/17/2015
- Re: [charm] [ppl] Best way to run Charm++ apps on an Infiniband cluster, Bilge Acun, 02/17/2015
- Re: [charm] [ppl] Best way to run Charm++ apps on an Infiniband cluster, Bilge Acun, 02/17/2015
- Re: [charm] [ppl] Best way to run Charm++ apps on an Infiniband cluster, Jozsef Bakosi, 02/17/2015
- Re: [charm] [ppl] Best way to run Charm++ apps on an Infiniband cluster, Bilge Acun, 02/17/2015
- Re: [charm] [ppl] Best way to run Charm++ apps on an Infiniband cluster, Jim Phillips, 02/17/2015
- Re: [charm] [ppl] Best way to run Charm++ apps on an Infiniband cluster, Bilge Acun, 02/17/2015