charm AT lists.siebelschool.illinois.edu
Subject: Charm++ parallel programming system
- From: Jozsef Bakosi <jbakosi AT lanl.gov>
- To: charm AT lists.cs.illinois.edu
- Subject: [charm] All-to-all or redn+bcast
- Date: Fri, 14 May 2021 09:51:30 -0600
Hi folks,
I would like your expert opinion on the following.
We have an all-to-all that computes the minimum of a single real scalar
among many chares, intended to run at large scales. It is the single
synchronization point within a time step.
I wonder if replacing the single all-to-all with a reduction + broadcast
targeting each chare individually may allow for more overlap. I believe a
single all-to-all is implemented as a redn+bcast to/from a single chare,
and the complexity of what I'm suggesting is probably worse, but it is
nevertheless worth asking.
In code, with DG being a chare array, I'm suggesting replacing

  contribute( sizeof(double), &mindt, CkReduction::min_double,
              CkCallback(CkReductionTarget(DG,solve), thisProxy) );

with

  for all DG chares i
    contribute( sizeof(double), &mindt, CkReduction::min_double,
                CkCallback(CkReductionTarget(DG,solve), thisProxy[i]) );
  end

Would this allow for more overlap by removing the global sync, or would I
be throwing the baby out with the bathwater, because the for loop replaces
the log(n) algorithmic/parallel complexity with n?
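
To put rough numbers on the trade-off, here is a minimal sketch of a cost
model in plain C++. It assumes every reduction and every broadcast runs
over a binary spanning tree with one participant per chare, and that in the
second variant the reduction root coincides with the target chare. These
are simplifying assumptions for illustration, not a description of how the
Charm++ runtime actually routes messages. The names Cost, treeDepth,
oneReductionPlusBroadcast and oneReductionPerTarget are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Rough cost model only (assumption): binary spanning trees, one
// participant per chare, no contention and no runtime-level aggregation.
struct Cost {
  std::uint64_t messages;             // total point-to-point messages
  std::uint64_t depth;                // tree levels on the critical path
  std::uint64_t contributesPerChare;  // contribute() calls made by each chare
};

// Levels in a binary tree spanning n participants: ceil(log2(n)),
// computed as the bit length of n - 1.
std::uint64_t treeDepth(std::uint64_t n) {
  assert(n >= 1);
  std::uint64_t depth = 0;
  for (std::uint64_t m = n - 1; m > 0; m >>= 1) ++depth;
  return depth;
}

// Variant 1: one min-reduction whose result is broadcast to all n chares.
Cost oneReductionPlusBroadcast(std::uint64_t n) {
  const std::uint64_t d = treeDepth(n);
  // n - 1 messages up the tree, n - 1 back down; two tree traversals.
  return {2 * (n - 1), 2 * d, 1};
}

// Variant 2: n independent min-reductions, reduction i delivering to chare i.
Cost oneReductionPerTarget(std::uint64_t n) {
  const std::uint64_t d = treeDepth(n);
  // Each of the n reductions needs n - 1 messages; no broadcast phase,
  // but every chare has to issue n contribute() calls.
  return {n * (n - 1), d, n};
}
```

Under these assumptions, n = 1024 chares gives 2046 messages and one
contribute() call per chare for the first variant, versus 1,047,552
messages and 1024 contribute() calls per chare for the second. The second
variant halves the modeled tree depth (10 instead of 20 levels), but the
total message count grows as n^2 rather than n.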
Thanks,
Jozsef
--
Jozsef Bakosi, PhD, LANL CCS-2, o:505-665-0950, c:505-695-4523