charm AT lists.siebelschool.illinois.edu
Subject: Charm++ parallel programming system
- From: Xuehan Xu <xxhdx1985126 AT gmail.com>
- To: charm AT cs.uiuc.edu
- Subject: [charm] Fwd: Question about running BigNetSim with "+wth4"
- Date: Sun, 16 Oct 2011 15:11:48 +0800
- List-archive: <http://lists.cs.uiuc.edu/pipermail/charm>
- List-id: CHARM parallel programming system <charm.cs.uiuc.edu>
I made the following modification to the file "netconfig":
NUM_NODES 8
DIMENSIONS 2 2 2
and the output of BigNetSim became:
[couple@node70 BlueGene]$ ../tmp/charmrun +p1 ../tmp/bigsimulator 1 0 ++remote-shell ssh
Charmrun> started all node programs in 1.174 seconds.
Converse/Charm++ Commit ID: v6.3.0-626-gf074431
Charm++> scheduler running in netpoll mode.
Charm++> Running on 1 unique compute nodes (2-way SMP).
Charm++> cpu topology info is gathered in 0.000 seconds.
================= Simulation Configuration =================
Number of physical PEs: 1
POSE mode: Parallel
Network model: BlueGene
Command line: /home/couple/NewCharm/BigNetSim/trunk/BlueGene/../tmp/bigsimulator 1 0
Timing factor: 1.000000e+08 (i.e., 1 GVT tick = 10 ns)
cpufactor: 1.000000
bgTrace summary: totalBGProcs=32 X=2 Y=2 Z=2 #CommThreads=1 #WorkerThreads=4 #PEs=1 LogVersion=6
Simulation mode: trace driven
Simulation network mode: full contention
Initializing POSE...
POSE initialization complete.
Using Inactivity Detection for termination.
Network parameters:
Max packet size: 256
File window size: 0
Debug print level: 0
Window load threshold: 0
Intra node latency: 0.500000 us
Intra node bandwidth: 1.000000 GB/s
Number of buffers per port in each switch: 12
Switch buffer size: 1024
Channel bandwidth: 1.000000
Channel delay: 0
Link stats collection interval: 1000000 GVT ticks
Link stats on: no
Message stats on: no
Adaptive routing on: yes
Header size: 16 bytes
Processor send overhead: 0 GVT ticks
Processor receive overhead: 0 GVT ticks
Number of simulated nodes: 8
============================================================
Info> invoking startup task from proc 0 ...
Info> Starting at the beginning of the simulation
Info> Running to the end of the simulation
WARNING: TASK NOT FOUND src:0 msg:4 on:24
WARNING: TASK NOT FOUND src:0 msg:4 on:24
WARNING: TASK NOT FOUND src:0 msg:4 on:26
WARNING: TASK NOT FOUND src:0 msg:4 on:26
WARNING: TASK NOT FOUND src:0 msg:4 on:27
WARNING: TASK NOT FOUND src:0 msg:4 on:27
WARNING: TASK NOT FOUND src:0 msg:4 on:28
WARNING: TASK NOT FOUND src:0 msg:4 on:28
WARNING: TASK NOT FOUND src:0 msg:4 on:30
WARNING: TASK NOT FOUND src:0 msg:4 on:30
WARNING: TASK NOT FOUND src:0 msg:4 on:31
WARNING: TASK NOT FOUND src:0 msg:4 on:31
WARNING: TASK NOT FOUND src:0 msg:4 on:8
WARNING: TASK NOT FOUND src:0 msg:4 on:8
WARNING: TASK NOT FOUND src:0 msg:4 on:10
WARNING: TASK NOT FOUND src:0 msg:4 on:10
.........
273407 0 Something wrong src 1 dst 0 msgid 1
273407 0 message was not stored in advance273543 0 Something wrong src 1 dst 0 msgid 1
273543 0 message was not stored in advance273679 0 Something wrong src 1 dst 0 msgid 1
273679 0 message was not stored in advance272863 0 Something wrong src 2 dst 0 msgid 2
272863 0 message was not stored in advance273135 0 Something wrong src 2 dst 0 msgid 2
273135 0 message was not stored in advance273271 0 Something wrong src 2 dst 0 msgid 2
273271 0 message was not stored in advance273951 0 Something wrong src 3 dst 0 msgid 3
273951 0 message was not stored in advance274087 0 Something wrong src 3 dst 0 msgid 3
274087 0 message was not stored in advance274495 0 Something wrong src 3 dst 0 msgid 3
274495 0 message was not stored in advance274223 0 Something wrong src 4 dst 0 msgid 4
274223 0 message was not stored in advance274359 0 Something wrong src 4 dst 0 msgid 4
274359 0 message was not stored in advance274903 0 Something wrong src 4 dst 0 msgid 4
274903 0 message was not stored in advance274767 0 Something wrong src 5 dst 0 msgid 5
274767 0 message was not stored in advance275583 0 Something wrong src 5 dst 0 msgid 5
275583 0 message was not stored in advance275719 0 Something wrong src 5 dst 0 msgid 5
275719 0 message was not stored in advance275175 0 Something wrong src 6 dst 0 msgid 6
275175 0 message was not stored in advance275311 0 Something wrong src 6 dst 0 msgid 6
275311 0 message was not stored in advance276263 0 Something wrong src 6 dst 0 msgid 6
276263 0 message was not stored in advance275855 0 Something wrong src 7 dst 0 msgid 7
275855 0 message was not stored in advance275991 0 Something wrong src 7 dst 0 msgid 7
275991 0 message was not stored in advance276127 0 Something wrong src 7 dst 0 msgid 7
276127 0 message was not stored in advance281472 0 Something wrong src 4 dst 0 msgid 4
281472 0 message was not stored in advance282024 0 Something wrong src 4 dst 0 msgid 4
282024 0 message was not stored in advance282208 0 Something wrong src 5 dst 0 msgid 5
282208 0 message was not stored in advance282576 0 Something wrong src 5 dst 0 msgid 5
282576 0 message was not stored in advance283128 0 Something wrong src 6 dst 0 msgid 6
283128 0 message was not stored in advance283312 0 Something wrong src 6 dst 0 msgid 6
283312 0 message was not stored in advance283864 0 Something wrong src 7 dst 0 msgid 7
283864 0 message was not stored in advance284048 0 Something wrong src 7 dst 0 msgid 7
284048 0 message was not stored in advance280920 0 Something wrong src 3 dst 0 msgid 3
280920 0 message was not stored in advance281104 0 Something wrong src 3 dst 0 msgid 3
281104 0 message was not stored in advance281656 0 Something wrong src 3 dst 0 msgid 3
281656 0 message was not stored in advance279632 0 Something wrong src 1 dst 0 msgid 1
279632 0 message was not stored in advance280184 0 Something wrong src 1 dst 0 msgid 1
280184 0 message was not stored in advance280552 0 Something wrong src 1 dst 0 msgid 1
280552 0 message was not stored in advance279816 0 Something wrong src 2 dst 0 msgid 2
279816 0 message was not stored in advance280368 0 Something wrong src 2 dst 0 msgid 2
280368 0 message was not stored in advance280736 0 Something wrong src 2 dst 0 msgid 2
280736 0 message was not stored in advanceSimulation inactive at time: 2776774
Final GVT = 2776774
282392 0 Something wrong src 4 dst 0 msgid 4
282392 0 message was not stored in advance282760 0 Something wrong src 5 dst 0 msgid 5
282760 0 message was not stored in advance283680 0 Something wrong src 6 dst 0 msgid 6
283680 0 message was not stored in advance284232 0 Something wrong src 7 dst 0 msgid 7
284232 0 message was not stored in advanceFinal basic stats: Commits: 8685 Rollbacks: 3
Final basic stats: GVT iterations: 2237
1 PE Simulation finished at 1.101722.
Is this due to my misconfiguration of netconfig?
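As a quick arithmetic check on the values quoted above, here is a minimal Python sketch that compares the two netconfig lines with the "bgTrace summary" line from the log. The helper names (parse_netconfig, parse_bgtrace_summary) are hypothetical and are not part of BigNetSim. It assumes that NUM_NODES should equal the product of DIMENSIONS and that totalBGProcs is the node count times the number of worker threads. It only checks that the numbers agree and does not explain the warnings.

```python
import re
from math import prod

def parse_netconfig(text):
    """Parse 'KEY v1 v2 ...' lines into a dict of integer lists (hypothetical helper)."""
    cfg = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            cfg[parts[0]] = [int(p) for p in parts[1:]]
    return cfg

def parse_bgtrace_summary(line):
    """Extract the integer 'name=value' fields of the bgTrace summary line."""
    return {name: int(value) for name, value in re.findall(r"#?(\w+)=(\d+)", line)}

# Values copied from the post above.
netconfig = parse_netconfig("NUM_NODES 8\nDIMENSIONS 2 2 2")
summary = parse_bgtrace_summary(
    "bgTrace summary: totalBGProcs=32 X=2 Y=2 Z=2 "
    "#CommThreads=1 #WorkerThreads=4 #PEs=1 LogVersion=6"
)

nodes_from_dims = prod(netconfig["DIMENSIONS"])                   # 2 * 2 * 2
nodes_from_trace = summary["X"] * summary["Y"] * summary["Z"]     # 2 * 2 * 2
procs_from_trace = nodes_from_trace * summary["WorkerThreads"]    # nodes * worker threads

print("NUM_NODES matches DIMENSIONS:", netconfig["NUM_NODES"][0] == nodes_from_dims)
print("netconfig nodes match trace nodes:", nodes_from_dims == nodes_from_trace)
print("totalBGProcs matches nodes * worker threads:",
      procs_from_trace == summary["totalBGProcs"])
```

With the values from this post all three checks agree (8 nodes, 32 simulated processors), so the node count and dimensions are at least consistent with the trace.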
---------- Forwarded message ----------
From: Xuehan Xu <xxhdx1985126 AT gmail.com>
Date: 16 October 2011 14:45
Subject: Question about running BigNetSim with "+wth4"
To: charm AT cs.uiuc.edu
Dear Sirs:
I tried to simulate the Cjacobi3D program with the parameter "+wth4", but an assertion error occurred when running BigNetSim.
I ran the program in the emulator with "+wth4" as follows:
./charmrun +p1 ./jacobi 4 4 2 +x2 +y2 +z2 +wth4 ++remote-shell ssh +bglog
Then I moved the traces to BigNetSim/trunk/BlueGene/ and ran BigNetSim:
[couple@node70 BlueGene]$ ../tmp/charmrun +p1 ../tmp/bigsimulator 1 0 ++remote-shell ssh
Charmrun> started all node programs in 2.110 seconds.
Converse/Charm++ Commit ID: v6.3.0-626-gf074431
Charm++> scheduler running in netpoll mode.
Charm++> Running on 1 unique compute nodes (2-way SMP).
Charm++> cpu topology info is gathered in 0.001 seconds.
================= Simulation Configuration =================
Number of physical PEs: 1
POSE mode: Parallel
Network model: BlueGene
Command line: /home/couple/NewCharm/BigNetSim/trunk/BlueGene/../tmp/bigsimulator 1 0
Timing factor: 1.000000e+08 (i.e., 1 GVT tick = 10 ns)
cpufactor: 1.000000
bgTrace summary: totalBGProcs=32 X=2 Y=2 Z=2 #CommThreads=1 #WorkerThreads=4 #PEs=1 LogVersion=6
Simulation mode: trace driven
Simulation network mode: full contention
Initializing POSE...
POSE initialization complete.
Using Inactivity Detection for termination.
Network parameters:
Max packet size: 256
File window size: 0
Debug print level: 0
Window load threshold: 0
Intra node latency: 0.500000 us
Intra node bandwidth: 1.000000 GB/s
Number of buffers per port in each switch: 12
Switch buffer size: 1024
Channel bandwidth: 1.000000
Channel delay: 0
Link stats collection interval: 1000000 GVT ticks
Link stats on: no
Message stats on: no
Adaptive routing on: yes
Header size: 16 bytes
Processor send overhead: 0 GVT ticks
Processor receive overhead: 0 GVT ticks
Number of simulated nodes: 8
============================================================
Info> invoking startup task from proc 0 ...
Info> Starting at the beginning of the simulation
Info> Running to the end of the simulation
[0] Assertion "inPort == numP" failed in file modDirectionOrderedNDTorus.C line 29.
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason:
[0] Stack Traceback:
[0:0] CmiAbort+0x75 [0x82b7130]
[0:1] __cmi_assert+0x3c [0x82bfdb7]
[0:2] _ZN26modDirectionOrderedNDTorus11selectRouteEiiiP8TopologyP6PacketRSt3mapIiiSt4lessIiESaISt4pairIKiiEEESC_Pt+0x187 [0x818b56d]
[0:3] _ZN10SwitchBase10recvPacketEP6Packet+0x5d1 [0x8160ce9]
[0:4] _ZN12state_Switch10recvPacketEP6Packet+0x24 [0x8161134]
[0:5] _ZN6Switch9ResolveFnEiPv+0xf1 [0x815f2b9]
[0:6] _ZN6adapt44StepEv+0x34b [0x81dc453]
[0:7] _ZN3sim4StepEv+0xc0 [0x81d7f22]
[0:8] _ZN6Switch10recvPacketEP6Packet+0x1e3 [0x81610cb]
[0:9] _ZN14CkIndex_Switch23_call_recvPacket_PacketEPvP6Switch+0x18 [0x815d28c]
[0:10] CkDeliverMessageFree+0x44 [0x822bc6d]
[0:11] _ZN14CkLocRec_local11invokeEntryEP12CkMigratablePvib+0x13a [0x8243246]
[0:12] _ZN14CkLocRec_local7deliverEP14CkArrayMessage11CkDeliver_ti+0x1bc [0x82434da]
[0:13] _ZN8CkLocMgr7deliverEP9CkMessage11CkDeliver_ti+0x266 [0x8244980]
[0:14] _ZN8CkLocMgr13deliverInlineEP9CkMessage+0x28 [0x8230d30]
[0:15] [0x822d7ac]
[0:16] _Z15_processHandlerPvP11CkCoreState+0x1af [0x822d961]
[0:17] CmiHandleMessage+0x3c [0x82bd27e]
[0:18] CsdScheduleForever+0x6b [0x82bd463]
[0:19] CsdScheduler+0x11 [0x82bd3d5]
[0:20] [0x82bb8f5]
[0:21] ConverseInit+0x342 [0x82bbe04]
[0:22] main+0x44 [0x8234d5b]
[0:23] __libc_start_main+0xe6 [0x670cc6]
[0:24] [0x815afc1]
Fatal error on PE 0>
How should I deal with it? Thank you, sir:-)