charm AT lists.siebelschool.illinois.edu
Subject: Charm++ parallel programming system
- From: "Papatheodore, Thomas L." <papatheodore AT ornl.gov>
- To: Phil Miller <mille121 AT illinois.edu>, Ted Packwood <malice AT cray.com>
- Cc: "charm AT lists.cs.illinois.edu" <charm AT lists.cs.illinois.edu>, "Choi, Jaemin" <jchoi157 AT illinois.edu>
- Subject: Re: [charm] CrayPat with Charm++
- Date: Mon, 10 Jul 2017 14:26:48 +0000
- Accept-language: en-US
Hey all-
I’m still having trouble profiling the Charm++ examples on Titan with GPUs. Here are two specific cases…
My attempt to run the hello program ----------------------
COMPILE:
[tpapathe@titan-ext3: /lustre/atlas2/csc198/proj-shared/tpapathe/charm/gni-crayxe-cuda-perftools/examples/charm++/cuda/hello]$ make OPTS="-save"
../../../../bin/charmc -save hello.ci
../../../../bin/charmc -save -c hello.C
/opt/nvidia/cudatoolkit7.5/7.5.18-1.0502.10743.2.1/bin/nvcc -c -use_fast_math -I/usr/local/cuda/include -I../../../../include helloCUDA.cu
../../../../bin/charmc -save -language charm++ -o hello hello.o helloCUDA.o -lcuda -lcudart
INSTRUMENT:
[tpapathe@titan-ext3: /lustre/atlas2/csc198/proj-shared/tpapathe/charm/gni-crayxe-cuda-perftools/examples/charm++/cuda/hello]$ pat_build -u -Dtrace-text-size=800 ./hello
WARNING: Tracing non-group functions was limited to those 803 - 9318 bytes in size.
INFO: A total of 130 selected non-group functions were traced.
RUN UN-INSTRUMENTED PROGRAM FROM INTERACTIVE COMPUTE NODE:
[tpapathe@titan-login5: /lustre/atlas2/csc198/proj-shared/tpapathe/charm/gni-crayxe-cuda-perftools/examples/charm++/cuda/hello]$ aprun -n1 ./hello
Charm++> Running on Gemini (GNI) with 1 processes
Charm++> static SMSG
Charm++> memory pool init block size: 8MB, total memory pool limit 0MB (0 means no limit)
Charm++> memory pool registered memory limit: 200000MB, send limit: 100000MB
Charm++> Cray TLB page size: 2048K
Charm++> Running in non-SMP mode: numPes 1
Converse/Charm++ Commit ID: v6.8.0-beta1-287-gd57c83d
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (16-way SMP).
Running Hello on 1 processors for 5 elements
Hello 0 created
Hello 1 created
Hello 2 created
Hello 3 created
Hello 4 created
Hi from element 0
calling kernel
Sending a Hi Message
Hi from element 1
calling kernel
Sending a Hi Message
Hi from element 2
calling kernel
Sending a Hi Message
Hi from element 3
calling kernel
Sending a Hi Message
Hi from element 4
All done
EXIT HYBRID API
[Partition 0][Node 0] End of program
Application 15025878 resources: utime ~0s, stime ~2s, Rss ~1142116, inblocks ~10482, outblocks ~31178
RUN INSTRUMENTED PROGRAM FROM INTERACTIVE COMPUTE NODE:
[tpapathe@titan-login5: /lustre/atlas2/csc198/proj-shared/tpapathe/charm/gni-crayxe-cuda-perftools/examples/charm++/cuda/hello]$ aprun -n1 ./hello+pat
CrayPat/X: Version 6.4.5 Revision 87dd5b8 01/23/17 15:37:24
Charm++> Running on Gemini (GNI) with 1 processes
Charm++> static SMSG
Charm++> memory pool init block size: 8MB, total memory pool limit 0MB (0 means no limit)
Charm++> memory pool registered memory limit: 200000MB, send limit: 100000MB
Charm++> Cray TLB page size: 2048K
Charm++> Running in non-SMP mode: numPes 1
Converse/Charm++ Commit ID: v6.8.0-beta1-287-gd57c83d
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (16-way SMP).
pat[WARNING][0]: abort process 11933 because of signal 11
Experiment data file written:
/lustre/atlas2/csc198/proj-shared/tpapathe/charm/gni-crayxe-cuda-perftools/examples/charm++/cuda/hello/hello+pat+11933-2351t.xf
_pmiu_daemon(SIGCHLD): [NID 02351] [c6-1c0s7n3] [Mon Jul 10 10:14:49 2017] PE RANK 0 exit signal Segmentation fault
Application 15025897 exit codes: 139
Application 15025897 resources: utime ~0s, stime ~1s, Rss ~137348, inblocks ~11356, outblocks ~32666
If I instead try to run the vectorAdd program, the un-instrumented code seg faults before I even get to the instrumented version:
My attempt to run the vectorAdd program ----------------------
COMPILE:
[tpapathe@titan-ext3: /lustre/atlas2/csc198/proj-shared/tpapathe/charm/gni-crayxe-cuda-perftools/examples/charm++/cuda/vectorAdd]$ make OPTS="-save"
../../../../bin/charmc -save vectorAdd.ci
../../../../bin/charmc -save -O3 -c vectorAdd.C
/opt/nvidia/cudatoolkit7.5/7.5.18-1.0502.10743.2.1/bin/nvcc -O3 -c -use_fast_math -DGPU_MEMPOOL -DCUDA_USE_CUDAMALLOCHOST -arch=compute_35 -code=sm_35 -I/usr/local/cuda/include -I../../../../../src/arch/cuda/hybridAPI -I../../../../include -o vectorAddCU.o vectorAdd.cu
../../../../bin/charmc -save -language charm++ -o vectorAdd vectorAdd.o vectorAddCU.o -lcublas
INSTRUMENT:
[tpapathe@titan-ext3: /lustre/atlas2/csc198/proj-shared/tpapathe/charm/gni-crayxe-cuda-perftools/examples/charm++/cuda/vectorAdd]$ pat_build -u -Dtrace-text-size=800 ./vectorAdd
WARNING: Tracing non-group functions was limited to those 803 - 9318 bytes in size.
INFO: A total of 130 selected non-group functions were traced.
RUN UN-INSTRUMENTED PROGRAM FROM INTERACTIVE COMPUTE NODE:
[tpapathe@titan-login5: /lustre/atlas2/csc198/proj-shared/tpapathe/charm/gni-crayxe-cuda-perftools/examples/charm++/cuda/vectorAdd]$ aprun -n1 ./vectorAdd+pat
CrayPat/X: Version 6.4.5 Revision 87dd5b8 01/23/17 15:37:24
Charm++> Running on Gemini (GNI) with 1 processes
Charm++> static SMSG
Charm++> memory pool init block size: 8MB, total memory pool limit 0MB (0 means no limit)
Charm++> memory pool registered memory limit: 200000MB, send limit: 100000MB
Charm++> Cray TLB page size: 2048K
Charm++> Running in non-SMP mode: numPes 1
Converse/Charm++ Commit ID: v6.8.0-beta1-287-gd57c83d
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (16-way SMP).
pat[WARNING][0]: abort process 11610 because of signal 11
Experiment data file written:
/lustre/atlas2/csc198/proj-shared/tpapathe/charm/gni-crayxe-cuda-perftools/examples/charm++/cuda/vectorAdd/vectorAdd+pat+11610-2351t.xf
_pmiu_daemon(SIGCHLD): [NID 02351] [c6-1c0s7n3] [Mon Jul 10 10:02:29 2017] PE RANK 0 exit signal Segmentation fault
Application 15025560 exit codes: 139
Application 15025560 resources: utime ~0s, stime ~1s, Rss ~138756, inblocks ~11390, outblocks ~32755
Is there something obvious here that I am doing incorrectly? The hello example program appears to work correctly, but the CrayPat profiling on it does not. The vectorAdd example program does not appear to work correctly even without profiling. If you have any further advice, I would greatly appreciate it. Thank you for your help.
-Tom
From: "Choi, Jaemin" <jchoi157 AT illinois.edu>
Hi Tom,
First of all, thank you for looking into the CrayPat issue in my stead.
To compile the CUDA hello example on Titan, you need to edit the NVCC variable in the Makefile to be $(CUDATOOLKIT_HOME)/bin/nvcc. This is a mistake on our part: we assumed nvcc would reside in /usr/local/cuda.
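A minimal sketch of what that edit might look like, assuming the example's Makefile assigns NVCC near the top and that the loaded cudatoolkit module exports CUDATOOLKIT_HOME (both details are assumptions here, not taken from the actual Makefile):

  # examples/charm++/cuda/hello/Makefile (hypothetical excerpt)
  # NVCC = /usr/local/cuda/bin/nvcc         # old hard-coded default location
  NVCC = $(CUDATOOLKIT_HOME)/bin/nvcc       # pick up nvcc from the Cray cudatoolkit module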
Thank you.
Jaemin Choi
Ph.D. Student, Research Assistant
Parallel Programming Laboratory
University of Illinois Urbana-Champaign

From: Papatheodore, Thomas L. [papatheodore AT ornl.gov]

Ok, it now instruments the code (correctly?):
[tpapathe@titan-ext2: /lustre/atlas2/csc198/proj-shared/tpapathe/charm/gni-crayxe-cuda-perftools/tests/charm++/simplearrayhello]$ pat_build -O apa ./hello
INFO: A maximum of 712 functions from group 'mpi' will be traced.
INFO: A maximum of 56 functions from group 'realtime' will be traced.
INFO: A maximum of 199 functions from group 'syscall' will be traced.
But when I try to run it, I get the following:
[tpapathe@titan-batch6: /lustre/atlas2/csc198/proj-shared/tpapathe/charm/gni-crayxe-cuda-perftools/tests/charm++/simplearrayhello]$ aprun -n1 ./hello+pat
CrayPat/X: Version 6.4.5 Revision 87dd5b8 01/23/17 15:37:24
pat[WARNING][0]: Collection of accelerator performance data for sampling experiments is not supported. To collect accelerator performance data perform a trace experiment. See the intro_craypat(1) man page on how to perform a trace experiment.
Charm++> Running on Gemini (GNI) with 1 processes
Charm++> static SMSG
Charm++> memory pool init block size: 8MB, total memory pool limit 0MB (0 means no limit)
Charm++> memory pool registered memory limit: 200000MB, send limit: 100000MB
Charm++> Cray TLB page size: 8192K
Charm++> Running in non-SMP mode: numPes 1
Converse/Charm++ Commit ID: v6.8.0-beta1-287-gd57c83d
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (16-way SMP).
libhugetlbfs [nid03789:29517]: WARNING: New heap segment map at 0x106bc800000 failed: Cannot allocate memory
libhugetlbfs [nid03789:29517]: WARNING: New heap segment map at 0x106bc800000 failed: Cannot allocate memory
libhugetlbfs [nid03789:29517]: WARNING: New heap segment map at 0x106bc800000 failed: Cannot allocate memory
…
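The accelerator warning above is tied to the sampling experiment that pat_build -O apa produces; per the message, accelerator data requires a trace experiment, as in the earlier pat_build -u build. A hedged sketch of a trace-style build (flags recalled from pat_build(1)/intro_craypat(1), not verified on this system):

  pat_build -w -u -Dtrace-text-size=800 ./hello   # -w makes tracing the default experiment
  aprun -n1 ./hello+pat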
If I instead compile the CUDA version of hello, make cannot find nvcc because it looks for it in the default path where a CUDA installation normally resides:
[tpapathe@titan-ext2: /lustre/atlas2/csc198/proj-shared/tpapathe/charm/gni-crayxe-cuda-perftools/examples/charm++/cuda/hello]$ make OPTS="-save"
../../../../bin/charmc -save hello.ci
../../../../bin/charmc -save -c hello.C
/usr/local/cuda/bin/nvcc -c -use_fast_math -I/usr/local/cuda/include -I../../../../include helloCUDA.cu
make: /usr/local/cuda/bin/nvcc: Command not found
make: *** [helloCUDA.o] Error 127
Is there a way to add "-L /opt/nvidia/cudatoolkit7.5/7.5.18-1.0502.10743.2.1/bin/"? I tried adding it to the OPTS="-save" line but that didn't work.
From: <unmobile AT gmail.com> on behalf of Phil Miller <mille121 AT illinois.edu>
You can also run
make OPTS="-save"
and not worry about editing makefiles, etc.
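A minimal sketch of that no-edit approach, under the assumption that the Makefile's NVCC assignment is a plain one that a command-line variable can override (the variable name comes from Jaemin's note, not from inspecting the Makefile):

  cd examples/charm++/cuda/hello
  make OPTS="-save" NVCC=$CUDATOOLKIT_HOME/bin/nvcc   # command-line variables override Makefile assignments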
On Thu, Jul 6, 2017 at 4:56 PM, Ted Packwood <malice AT cray.com> wrote: