ppl-accel AT lists.siebelschool.illinois.edu
Subject: Ppl-accel mailing list
List archive
- From: Lukasz Wesolowski <wesolwsk AT illinois.edu>
- To: Ronak Buch <rabuch2 AT illinois.edu>
- Cc: "ppl-accel AT cs.uiuc.edu" <ppl-accel AT cs.uiuc.edu>
- Subject: Re: [ppl-accel] 5/9 Accel Meeting Minutes
- Date: Mon, 19 May 2014 20:46:53 +0800
- List-archive: <http://lists.cs.uiuc.edu/pipermail/ppl-accel/>
- List-id: <ppl-accel.cs.uiuc.edu>
I believe we were planning to have a meeting/telecon today at 11 am. Let's have everyone send a quick update on progress since the last meeting. We will meet if there are significant new results or issues.
In particular, here are some of the action items from the last meeting:
1. OpenAtom GPU runs: profiling and experiments on larger data sets (Eric)
On my end, I looked at cuBLAS to see whether it can be supported in GPU Manager. As Ronak mentioned last time, cuBLAS now allows specifying a CUDA stream in which its operations execute. I also noticed that cuBLAS has its own custom functions for transferring data to and from the GPU, so support for those would have to be explicitly added in GPU Manager. Overall, adding cuBLAS support to GPU Manager looks doable.
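For concreteness, here is a hedged sketch of the two cuBLAS features mentioned above: binding a handle to a user-supplied CUDA stream, and using cuBLAS's own transfer helpers. All names (d_A, h_x, the function itself) are illustrative, and error checking is elided; this is not GPU Manager code.

```cuda
/* Sketch: enqueue a cuBLAS SGEMV and its transfers in one CUDA stream.
 * Device buffers d_A, d_x, d_y are assumed already allocated. */
#include <cublas_v2.h>
#include <cuda_runtime.h>

void sgemv_in_stream(cublasHandle_t handle, cudaStream_t stream,
                     const float *h_A, const float *h_x, float *h_y,
                     float *d_A, float *d_x, float *d_y, int n)
{
    const float alpha = 1.0f, beta = 0.0f;

    /* All subsequent cuBLAS calls on this handle run in `stream`. */
    cublasSetStream(handle, stream);

    /* cuBLAS's custom transfer functions (the async variants take a
     * stream); these are the calls GPU Manager would have to wrap. */
    cublasSetMatrixAsync(n, n, sizeof(float), h_A, n, d_A, n, stream);
    cublasSetVectorAsync(n, sizeof(float), h_x, 1, d_x, 1, stream);

    /* y = A * x, enqueued in the same stream after the transfers. */
    cublasSgemv(handle, CUBLAS_OP_N, n, n, &alpha, d_A, n, d_x, 1,
                &beta, d_y, 1);

    cublasGetVectorAsync(n, sizeof(float), d_y, 1, h_y, 1, stream);
}
```

Because everything is ordered within one stream, the transfers and the kernel never block the host, which is the property GPU Manager needs for overlap.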
Lukasz
On Sat, May 10, 2014 at 12:58 AM, Ronak Buch <rabuch2 AT illinois.edu> wrote:
Attendees: Ronak, Eric M., Michael, Lukasz

Overview of various accelerator tools that exist in Charm++:

Eric is using GPUs for OpenAtom, but testing was only on a very small data set; it's not clear whether we are getting good performance, since the timing was coarse and the input was small. It currently uses cuBLAS with synchronous kernel calls.
GPU Manager:
- Lukasz currently maintains it; he plans to write documentation for it and fix stability issues if they arise.
- Task-based library: instead of treating GPU operations in isolation, it groups the transfer to the device, the computation, and the transfer back as a single unit and offloads the whole thing.
- Using it keeps the CPU from sitting idle while the GPU is working.
- One key aspect is that it has its own memory pool of pinned memory for GPU transfers. Otherwise, trying to allocate pinned memory while a kernel is executing will block.
- Not sure if overlapping communication with computation has changed in more recent versions of CUDA
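To illustrate the pinned-memory-pool point above, here is a minimal sketch (all names hypothetical; the real GPU Manager implementation differs). The idea is that buffers are pinned once at startup with cudaMallocHost, so later requests never call into the CUDA allocator and therefore cannot block behind a running kernel.

```cuda
/* Hedged sketch of a pinned host-memory pool (names hypothetical). */
#include <cuda_runtime.h>
#include <stddef.h>

#define POOL_SLOTS 8

typedef struct {
    void  *buf[POOL_SLOTS];
    int    in_use[POOL_SLOTS];
    size_t slot_size;
} PinnedPool;

/* Pin all buffers up front, before any kernels are launched. */
int pool_init(PinnedPool *p, size_t slot_size) {
    p->slot_size = slot_size;
    for (int i = 0; i < POOL_SLOTS; i++) {
        p->in_use[i] = 0;
        if (cudaMallocHost(&p->buf[i], slot_size) != cudaSuccess)
            return -1;  /* out of pinnable memory */
    }
    return 0;
}

/* Hand out a free pinned buffer, or NULL if the pool is exhausted;
 * no CUDA call is made here, so this never blocks on the GPU. */
void *pool_acquire(PinnedPool *p) {
    for (int i = 0; i < POOL_SLOTS; i++)
        if (!p->in_use[i]) { p->in_use[i] = 1; return p->buf[i]; }
    return NULL;
}

void pool_release(PinnedPool *p, void *buf) {
    for (int i = 0; i < POOL_SLOTS; i++)
        if (p->buf[i] == buf) { p->in_use[i] = 0; return; }
}
```

Pinned buffers are also what cudaMemcpyAsync needs to actually overlap transfers with computation, which ties this bullet to the overlap question below.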
Lukasz thinks that the Offload API would be a better fit than GPU Manager for the Xeon Phi.

Ronak and Michael worked on heterogeneous runs with the Xeon Phi; performance is rather slow.

We should take Dave Kunzman's thesis work (AEMs) and see how useful it is for various applications. Also, take a look at G-Charm (according to Lukasz, their techniques are basically the same as Kunzman's); there seems to be no code available for G-Charm.

Sanjay's TODOs:
- Read Dave Kunzman's thesis
- Run Projections or other performance monitoring tools on Xeon Phi applications
- Add multiple-++ppn (SMP) support for the Xeon Phi.
_______________________________________________
ppl-accel mailing list
ppl-accel AT cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/ppl-accel
- [ppl-accel] 5/9 Accel Meeting Minutes, Ronak Buch, 05/09/2014
- Re: [ppl-accel] 5/9 Accel Meeting Minutes, Lukasz Wesolowski, 05/19/2014
- Re: [ppl-accel] 5/9 Accel Meeting Minutes, Ronak Buch, 05/19/2014
- Re: [ppl-accel] 5/9 Accel Meeting Minutes, Mikida, Eric P, 05/19/2014
- Message not available
- Re: [ppl-accel] 5/9 Accel Meeting Minutes, Michael Robson, 05/19/2014
- Re: [ppl-accel] 5/9 Accel Meeting Minutes, Mikida, Eric P, 05/19/2014
- Re: [ppl-accel] 5/9 Accel Meeting Minutes, Mikida, Eric P, 05/26/2014
- Message not available
- Re: [ppl-accel] 5/9 Accel Meeting Minutes, Michael Robson, 05/26/2014
- Re: [ppl-accel] 5/9 Accel Meeting Minutes, Mikida, Eric P, 05/19/2014
- Re: [ppl-accel] 5/9 Accel Meeting Minutes, Michael Robson, 05/19/2014
- Re: [ppl-accel] 5/9 Accel Meeting Minutes, Ronak Buch, 05/19/2014
- Message not available
- Re: [ppl-accel] 5/9 Accel Meeting Minutes, Michael Robson, 05/19/2014
- Re: [ppl-accel] 5/9 Accel Meeting Minutes, Lukasz Wesolowski, 05/19/2014
Archive powered by MHonArc 2.6.16.