ppl-accel AT lists.siebelschool.illinois.edu

Subject: Ppl-accel mailing list

[ppl-accel] [TMS] New log in Task Accel Minutes

From: Michael Robson <mprobson AT illinois.edu>
To: mikida2 AT illinois.edu, gplkrsh2 AT illinois.edu, mille121 AT illinois.edu, ppl-accel AT cs.illinois.edu
Subject: [ppl-accel] [TMS] New log in Task Accel Minutes
Date: Thu, 03 Sep 2015 14:13:13 -0500

A new log has been added to Task: Accel Minutes by Michael Robson
The text of the log is:
Accel-Node Summit 11 - 1:30 AM in SC 4102
In Attendance: Harshitha, Phil, Eric, Ronak, Michael, Harshit, Sanjay

Single node
- any PE can execute progress fn
- this is a concern at a large number of cores
- can check queue without lock
- can't acquire work request without it
- maybe limit to PE0
- another way of enqueueing it would be nice

NOTES: Where is gpu manager created?

Accel Framework
- document to write accel programs

Node level
- w/in node: k's of cores
- 1 PE per node
- 1 execution stream
- endpoint
- needs a better name
- OpenMP, OmpSs
- Drones & CkLoop
- 1 PE per core
- Charm++ SMP
- Charm++ SMP & CkLoop
- Multiple PEs per node
- CkLoop, OpenMP, OmpSs
- Drones & CkLoop

Load balance w/in a node more often (why don't we do this now?)

Within node LB
- chares are assigned to PEs
- could be re-assigned w/o waiting for global LB
- be able to dynamically change cores we run on
cores = execution stream
- choose to only run on 1 of 2 SMT
- How can we enhance current SMP style?
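
Sketch for the within-node LB idea above: reassign chares among the PEs of one node from measured loads, without involving a global LB step. This is only an illustration in plain Python, not Charm++ code. The function names, the chare ids, and the greedy heaviest-first policy are all assumptions; the minutes do not say which strategy would be used. Passing a smaller num_pes stands in for running on fewer cores (e.g. 1 of 2 SMT threads).

```python
import heapq


def rebalance_within_node(chare_loads, num_pes):
    """Greedy heaviest-first assignment of chares to the PEs of one node.

    chare_loads: dict mapping a (hypothetical) chare id to its measured load.
    Returns a dict mapping PE rank -> list of chare ids.
    """
    # Min-heap of (accumulated load, PE rank); the lightest PE is on top.
    heap = [(0.0, pe) for pe in range(num_pes)]
    heapq.heapify(heap)
    assignment = {pe: [] for pe in range(num_pes)}
    # Place the heaviest chares first; break load ties by chare id.
    ordered = sorted(chare_loads.items(), key=lambda kv: (-kv[1], kv[0]))
    for chare, load in ordered:
        pe_load, pe = heapq.heappop(heap)
        assignment[pe].append(chare)
        heapq.heappush(heap, (pe_load + load, pe))
    return assignment


def pe_loads(assignment, chare_loads):
    """Total load per PE for a given assignment."""
    return {pe: sum(chare_loads[c] for c in chares)
            for pe, chares in assignment.items()}
```

Usage: call rebalance_within_node(loads, n) whenever the node's PE count or the measured loads change, then migrate only the chares whose PE differs from the current one.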

Learn more synchronization primitives

Different units of scheduling and migration

Provide good infrastructure for 3 versions
- should be separated out into different files
- do not want to parse or anything
- Xeon integrated GPUs
- same memory domain
- have shared L4 cache
- ~200-900 GFlops
- OpenCL (maybe OpenMP or OpenACC)
- Ehsan's work w/ Maria

Agenda
- M/N model?
- Also give M = #cores special care
- Notion of tasks
- Final block inside a when of SDAG
- SMP model w/ transient LB
- should be adequate
- Use cases
- Jacobi
- LeanMD
- Intersection
- Keeping data on device and notifying node scheduler
- OmpSs has a directory which keeps track of what memory is where
- Do locality based scheduling
- gOMP or Clang project
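
Sketch for the "keeping data on device" and locality-based scheduling items above: a small directory records which memory space holds each data block, and the scheduler prefers the space that already holds most of a task's input bytes. Plain Python for illustration only. DataDirectory, pick_target, and the space names "host" and "gpu0" are hypothetical and are not the OmpSs or Charm++ interfaces.

```python
class DataDirectory:
    """Tracks which memory space currently holds each data block (hypothetical)."""

    def __init__(self):
        self._location = {}  # block name -> memory space, e.g. "host" or "gpu0"
        self._size = {}      # block name -> size in bytes

    def record(self, block, space, nbytes):
        """Note that `block` (nbytes long) now lives in `space`."""
        self._location[block] = space
        self._size[block] = nbytes

    def bytes_resident(self, blocks, space):
        """Bytes of the given blocks that are already in `space`."""
        return sum(self._size[b] for b in blocks
                   if self._location.get(b) == space)


def pick_target(directory, task_inputs, candidate_spaces):
    """Pick the space holding the most input bytes; ties go to the earliest candidate."""
    return max(candidate_spaces,
               key=lambda s: directory.bytes_resident(task_inputs, s))
```

Usage: after a kernel leaves its output on the device, call record(block, "gpu0", nbytes) so that later tasks reading that block are steered to the same device instead of forcing a copy back to the host.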

OmpSs
- Keeps track of non-ready tasks
- and their data dependencies
- Predictive scheduling
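
Sketch for the "keeps track of non-ready tasks and their data dependencies" point above: tasks with unfinished predecessors are held back and released to a ready queue once everything they depend on has completed. Plain Python for illustration only. TaskGraph and its methods are hypothetical names, and dependencies are given here as explicit predecessor tasks, whereas OmpSs derives them from data annotations on the tasks.

```python
from collections import defaultdict, deque


class TaskGraph:
    """Holds non-ready tasks until their predecessors complete (hypothetical)."""

    def __init__(self):
        self._pending = {}                    # task -> count of unfinished predecessors
        self._dependents = defaultdict(list)  # task -> tasks waiting on it
        self._done = set()                    # tasks that have completed
        self.ready = deque()                  # tasks that may run now

    def submit(self, task, depends_on=()):
        """Register a task; it is ready only if all predecessors are done."""
        unmet = [d for d in depends_on if d not in self._done]
        if not unmet:
            self.ready.append(task)
            return
        self._pending[task] = len(unmet)
        for d in unmet:
            self._dependents[d].append(task)

    def complete(self, task):
        """Mark a task finished and release any dependents that became ready."""
        self._done.add(task)
        for waiter in self._dependents.pop(task, []):
            self._pending[waiter] -= 1
            if self._pending[waiter] == 0:
                del self._pending[waiter]
                self.ready.append(waiter)

    def non_ready(self):
        """Sorted list of tasks still waiting on at least one predecessor."""
        return sorted(self._pending)
```

Because the non-ready tasks and what they wait on are visible ahead of time, a scheduler could in principle look at non_ready() to plan placement before the tasks become runnable, which is one reading of the predictive scheduling item.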

Tasks
- Current SMP multi-core
w/in node LB
and hierarchical LB
- Heterogeneous LB (Michael + Ronak)
- Multi-target/version scheduling (Ronak + Michael)
- Transient in-node migration (Harshitha)
- Migrating chares as needed (merge?)
- Refactoring CmiMyRank == LastThread throughout runtime (Ronak?)
- Phil, Harshitha, Ronak, and Sanjay discuss
- Fix drone thread scheduler, touches previous issue; not production
(Harshitha)
- M/N scheme
- Shrink/Expand PEs/Drones on node (Harshitha)
- Relaunch via shrink/expand
- Thread affinity (Harshitha)
- Provide threads/scheduling for on-node model RT libs (OmpSs, etc) (Prateek?)
- gOMP, Argo
- really: gOMP on Converse
- but really really: OpenMP run time on top of Converse
- Sending packing
- Drone task, CkLoop
- Worker disabling i.e. turn off looking at node queue for execution (Michael)
- test program
- Task model/API (Harshitha)
- Task queues (Harshitha)
- SDAG to create tasks
- RO enforcement (Seonmyeong)
- Make OmpSs work well aka experiment & tune w/ on-node model integration
(Harshitha + Seonmyeong)
- Charm + X (where X is OmpSs)

Goals
- Heterogeneous LB
- ID threads as: comm, worker, gpu?, io?

Large machines with some nodes w/ GPUs and some w/o
- Blue Waters
- LANL ExMatEx?
- Target LB here would be a good research exercise
- Taub and Golub
- LSU SuperMIC

Personal notes:
- Is there a separate scheduler in accel??
- or does it override the default scheduler?

Final List:
MR/RB - multi-target/version scheduling
RB/MR - hetero LB
HM - transient in-node migration
RB - thread creation/ID refactoring
HM - pe/drone shrink/expand
HM - thread affinity
HM - follow up on drone model functionality
- after ID/creation fixes
PJ - provide threads/scheduling for on-node model RT libs
MR - worker disabling node-queue execution
HM - task API
HM - task Q
S - ReadOnly enforcement
HM/S - experiment & tune w/ on-node integration
- Charm + X

To view this item, click on or cut-paste
https://charm.cs.illinois.edu/private/tms/listlog.php?param=1490#15725

--
Message generated by TMS
