One more question about hdf5 parallel data set creation:
here (http://www.hdfgroup.org/hdf5-quest.html#par-nodata) two
alternatives are presented for writing data from a subset of MPI
ranks.
I will be writing to a Lustre filesystem, and was wondering about the
performance differences between the two techniques. I have ordered data in a
3D array, and the domain decomposition is done over the 2nd and 3rd indices
(Fortran code). So each MPI rank has a 3D array. The domain is decomposed,
and the processors laid out, as a 2D grid. I want to write 2D planar slices
of data out with high frequency. If the slice is taken as j=const or k=const
(where j and k are the 2nd and 3rd indices respectively, i.e. the indices
along which the domain is decomposed into separate subdomains), then only a
relatively small number of MPI ranks (compared to the total, but still
O(10-100)) need to write data. I would say in general less than 50% but more
than 10% of the MPI ranks will be involved in this write. Should I do it as a
collective write, where I guess there is some overhead as the MPI ranks are
cycled through in a predetermined order, or should I hammer the Lustre
filesystem with uncoordinated individual writes, concurrently? Any insight
here is appreciated.
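For reference, the collective variant from the linked FAQ has every rank take part in the H5Dwrite call, with non-owning ranks selecting an empty hyperslab (H5Sselect_none). Below is a minimal sketch of the index bookkeeping that decides which ranks own a k=const plane and what their hyperslab offset/count would be. The function name, the process-grid shape, and the local tile sizes are all illustrative assumptions, not anything from the original post:

```python
# Sketch of the bookkeeping behind the "collective write with empty
# selections" option: every rank calls H5Dwrite collectively, but ranks
# that do not own the requested plane select nothing (H5Sselect_none).
# All names and grid dimensions here are illustrative assumptions.

def plane_selection(rank, pj, pk, nj_local, nk_local, k_global):
    """Return (offset, count) of this rank's hyperslab in the 2D
    (j, k-thickness-1) file dataspace for the plane k = k_global,
    or None if this rank should make an empty selection.

    A pj x pk process grid decomposes the 2nd (j) and 3rd (k) indices;
    each rank owns an nj_local x nk_local tile of the j-k plane.
    """
    rj, rk = rank % pj, rank // pj        # rank's coordinates in the grid
    k_start = rk * nk_local               # first global k index owned
    if not (k_start <= k_global < k_start + nk_local):
        return None                       # -> H5Sselect_none on this rank
    # Owning ranks each contribute an nj_local x 1 strip of the plane.
    return (rj * nj_local, k_global - k_start), (nj_local, 1)

# Example: 4x4 process grid, 8x8 local j-k tiles, plane k = 10.
owners = [r for r in range(16)
          if plane_selection(r, 4, 4, 8, 8, 10) is not None]
```

With these assumed sizes only one row of the process grid (4 of 16 ranks, i.e. 25%) owns the plane, which matches the "more than 10%, less than 50%" regime described above; the remaining ranks still enter the collective call but contribute no data.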
Many thanks,
Izaak Beekman
===================================
(301)244-9367
UMD-CP Visiting Graduate Student
Aerospace Engineering
ibeekman@umiacs.umd.edu
ibeekman@umd.edu