charm AT lists.siebelschool.illinois.edu
Subject: Charm++ parallel programming system
List archive
- From: "Ortega, Bob" <bobo AT mail.smu.edu>
- To: Ronak Buch <rabuch2 AT illinois.edu>
- Cc: "charm AT lists.cs.illinois.edu" <charm AT lists.cs.illinois.edu>
- Subject: Re: [charm] FW: Projections
- Date: Mon, 14 Dec 2020 13:56:24 +0000
Ronak,
I apologize. I've decided not to include or attach a screenshot from the Projections run because SYMPA keeps telling me it cannot distribute the message with the screenshot.
Sorry if you've received this message multiple times. Usually I get a confirmation from the mailing list server, but then I realized I had not copied the list server, so I just wanted to make sure you received this. Sometimes when I include a graphic it's too large; my earlier attempt was above 400 KB, so I resized it again to below 400 KB.
That worked! No warning messages.
We are attempting to confirm that NAMD/Charm++ is running in parallel, which has been a long-standing issue. A serial version runs fine, but since we do have the ability to run applications in parallel, that is what this latest round of testing is about. This is why I have been seeking tools/resources to better understand what is going on during these runs.
As I mentioned to Nitin, I would really like to better understand the output from NAMD (and where it indicates things are running in parallel), which starts off like this:
Charm++> Running on MPI version: 3.1
Charm++> level of thread support used: MPI_THREAD_SINGLE (desired: MPI_THREAD_SINGLE)
Charm++> Running in non-SMP mode: 36 processes (PEs)
Charm++> Using recursive bisection (scheme 3) for topology aware partitions
Converse/Charm++ Commit ID: v6.10.2-0-g7bf00fa-namd-charm-6.10.2-build-2020-Aug-05-556
Trace: logsize: 10000000
Charm++: Tracemode Projections enabled.
Trace: traceroot: /users/bobo/NAMD/NAMD_2.14_Source/Linux-x86_64-icc/./namd2.prj
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 2 hosts (2 sockets x 9 cores x 1 PUs = 18-way SMP)
Charm++> cpu topology info is gathered in 0.024 seconds.
Info: NAMD 2.14 for Linux-x86_64-MPI
Info:
Info: Please visit http://www.ks.uiuc.edu/Research/namd/
Info: for updates, documentation, and support information.
Info:
Info: Please cite Phillips et al., J. Chem. Phys. 153:044130 (2020) doi:10.1063/5.0014475
Info: in all publications reporting results obtained with NAMD.
Info:
Info: Based on Charm++/Converse 61002 for mpi-linux-x86_64-icc
Info: Built Wed Dec 9 22:01:36 CST 2020 by bobo on login04
Info: 1 NAMD 2.14 Linux-x86_64-MPI 36 v001 bobo
Info: Running on 36 processors, 36 nodes, 2 physical nodes.
Info: CPU topology information available.
Info: Charm++/Converse parallel runtime startup completed at 0.769882 s
Info: 2118.93 MB of memory in use based on /proc/self/stat
Info: Configuration file is stmv/stmv.namd
Info: Changed directory to stmv
TCL: Suspending until startup complete.
Info: SIMULATION PARAMETERS:
Info: TIMESTEP 1
Info: NUMBER OF STEPS 500
Info: STEPS PER CYCLE 20
Info: PERIODIC CELL BASIS 1 216.832 0 0
Info: PERIODIC CELL BASIS 2 0 216.832 0
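The lines above that most directly reflect the parallel configuration are "Running in non-SMP mode: 36 processes (PEs)", "Running on 2 hosts (2 sockets x 9 cores x 1 PUs = 18-way SMP)", and "Running on 36 processors, 36 nodes, 2 physical nodes." As a rough sketch (assuming the output is saved to the log file named in the srun command quoted below), those lines can be pulled out with:

# print the Charm++/NAMD startup lines that report process, host, and node counts
grep -E "Running (on|in)" namd2.prj.fp-gpgpu-3.6.log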
So, in addition to learning more about Projections, what other tools/apps/resources would you recommend that might help in monitoring/analyzing our attempts at parallelization?
Thanks! Bob
From: Ronak Buch <rabuch2 AT illinois.edu>
Hi Bob,
Your run command should look something like:
date;time srun -n 36 -N 2 -p fp-gpgpu-3 --mem=36GB ./namd2.prj stmv/stmv.namd +logsize 10000000 >namd2.prj.fp-gpgpu-3.6.log;date
Thanks, Ronak
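(For reference, a rough breakdown of that command, assuming SLURM's srun and the Charm++ tracing option shown in the startup output above; this is a sketch, not an authoritative description:)

# -n 36             : launch 36 MPI tasks, the 36 PEs reported by Charm++ at startup
# -N 2              : spread the tasks across 2 nodes ("2 hosts" / "2 physical nodes" above)
# -p fp-gpgpu-3     : SLURM partition to submit to
# --mem=36GB        : memory requested per node
# +logsize 10000000 : Charm++ trace buffer size in entries per PE ("Trace: logsize: 10000000")
date;time srun -n 36 -N 2 -p fp-gpgpu-3 --mem=36GB ./namd2.prj stmv/stmv.namd +logsize 10000000 >namd2.prj.fp-gpgpu-3.6.log;date

Because this namd2 binary reports "Tracemode Projections enabled", the run should also leave trace files next to the traceroot shown in the startup output (namd2.prj.sts plus one namd2.prj.<pe>.log, possibly gzipped, per PE); those are the files the Projections GUI loads.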
On Thu, Dec 10, 2020 at 3:31 PM Ortega, Bob <bobo AT mail.smu.edu> wrote:
- [charm] FW: Projections, Ortega, Bob, 12/10/2020
- Re: [charm] FW: Projections, Ronak Buch, 12/10/2020
- Re: [charm] FW: Projections, Ortega, Bob, 12/10/2020
- Re: [charm] FW: Projections, Ronak Buch, 12/11/2020
- Re: [charm] FW: Projections, Ortega, Bob, 12/14/2020
- Re: [charm] FW: Projections, Ronak Buch, 12/11/2020
- Re: [charm] FW: Projections, Ronak Buch, 12/14/2020
- Re: [charm] FW: Projections, Ortega, Bob, 12/14/2020
- [charm] FW: FW: Projections, Ortega, Bob, 12/15/2020