- From: "Dokania, Harshit" <hdokani2 AT illinois.edu>
- To: "dmkunzman AT gmail.com" <dmkunzman AT gmail.com>
- Cc: "ppl-accel AT cs.uiuc.edu" <ppl-accel AT cs.uiuc.edu>
- Subject: [ppl-accel] [accel] entry methods | CUDA : using GPU Manager
- Date: Sat, 24 Jan 2015 17:12:49 +0000
- Accept-language: en-US
- List-archive: <http://lists.cs.uiuc.edu/pipermail/ppl-accel/>
- List-id: <ppl-accel.cs.uiuc.edu>
Hello Dave,
I am reading your PhD thesis and have a few questions regarding accel entry
methods (AEMs) for GPGPUs.
The thesis describes the CkIndex_xxxx class, which is used as a hook for the
runtime system to call into the application code for entry methods. In the
case of an AEM, this function is broken up into two stages: a general stage
and a device-specific stage.
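To make my question concrete, here is how I currently picture the two-stage split. All names here are my own illustration, not the actual generated CkIndex code:

```cpp
#include <cassert>
#include <functional>
#include <string>

// Sketch of the two-stage split as I understand it from the thesis:
// the runtime first runs a general stage (argument unpacking, bookkeeping),
// then a device-specific stage chosen at runtime (host fallback or offload).
// These types and names are hypothetical, for discussion only.
struct AccelEntry {
    std::function<void(std::string&)>       generalStage; // common to both paths
    std::function<void(std::string&, bool)> deviceStage;  // true = device, false = host
};

// The runtime hook would invoke both stages in order; the trace string
// just records which stages ran, so the control flow is visible.
std::string invokeAEM(const AccelEntry& e, bool runOnDevice) {
    std::string trace;
    e.generalStage(trace);           // stage 1: general
    e.deviceStage(trace, runOnDevice); // stage 2: device-specific
    return trace;
}
```

Is this roughly the structure, and where does the CUDA-specific half of stage 2 live in the generated code?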
I looked into the .def files but could not find the CUDA device-specific
functions; I only found the general functions for accel entry methods.
I have a CUDA build of Charm++, which currently performs offload operations
and kernel executions through the GPU Manager.
It would be helpful if you could point me to where to look.
Also, where can I find the code for the accel manager's runtime decision
making about whether an AEM executes on the host or the device?
I also have questions about batched AEMs: when AEMs are batched, how are
kernel thread indices mapped across the different kernel launches? I read
that the mapping is one-to-one for the AEMs in a batch, but in that case,
how does a CUDA kernel launch with a large number of threads
(<<<grid dim, block dim>>>) fit that scheme? And are all the AEMs in a
batched set executed concurrently on the device?
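For reference, here is the mapping I am picturing for the one-to-one case, with the grid dimension derived from the batch size and excess threads in the last block masked off. This is a host-side sketch of my assumption, not the real GPU Manager code:

```cpp
#include <cassert>
#include <vector>

// My assumed one-to-one scheme: one GPU thread per AEM instance in a batch.
// The launch config and simulated kernel below are illustrative only.
struct LaunchConfig { int gridDim; int blockDim; };

// Choose enough blocks that gridDim * blockDim >= numAEMs.
LaunchConfig configForBatch(int numAEMs, int blockDim = 128) {
    return { (numAEMs + blockDim - 1) / blockDim, blockDim };
}

// Host-side simulation of the per-thread index computation a batched kernel
// would do: tid = blockIdx.x * blockDim.x + threadIdx.x, with a bounds check
// so threads past the batch size do nothing.
std::vector<int> simulateBatchKernel(int numAEMs, LaunchConfig cfg) {
    std::vector<int> executed;
    for (int block = 0; block < cfg.gridDim; ++block)
        for (int thread = 0; thread < cfg.blockDim; ++thread) {
            int tid = block * cfg.blockDim + thread;
            if (tid < numAEMs) executed.push_back(tid); // one thread per AEM
        }
    return executed;
}
```

If each AEM instead needs many threads of its own, I do not see how this per-thread mapping generalizes, which is the core of my question.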
Note: currently the CUDA build of Charm++ only takes the GPU Manager into
account and calls the methods initHybridAPI(), gpuProgressFn(), and
ExitHybridAPI() from src/conv-core/convcore.c.
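The way I read those three calls is as a standard init/poll/teardown progress-engine pattern, where the scheduler periodically polls for completed offloaded work. A minimal sketch of that pattern as I understand it (again, my own names, not the hybridAPI implementation):

```cpp
#include <cassert>
#include <queue>

// Sketch of the init/progress/exit pattern I believe the three calls follow.
// Everything here is illustrative; the comments map each method to the
// corresponding convcore.c call site as I understand it.
struct GpuManagerSketch {
    std::queue<int> pending;   // handles of offloaded work items
    int completed = 0;

    void init() { completed = 0; }              // cf. initHybridAPI() at startup
    void offload(int handle) { pending.push(handle); }
    void progress() {                           // cf. gpuProgressFn() in the scheduler loop
        if (!pending.empty()) {                 // poll: retire one finished item per call
            pending.pop();
            ++completed;
        }
    }
    bool exitReady() const { return pending.empty(); } // cf. ExitHybridAPI() at shutdown
};
```

Is this polling model accurate, or does the real gpuProgressFn() also drive the host-versus-device decision for AEMs?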
Regards,
Harshit