From: hdf-forum-bounces@hdfgroup.org [mailto:hdf-forum-bounces@hdfgroup.org] On Behalf Of Biddiscombe, John A.
Sent: Friday, February 25, 2011 5:11 AM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] multi-pass IO (was chunking)
Matthieu
> what would be the best way of writing one file from all the processes together (in terms of write latency), knowing the data layout (regular 2D arrays),
If all processes are writing one piece of a single dataset (assuming I understood your question correctly), then the usual approach applies: a collective create of the dataset, followed by a hyperslab selection on each process and a write of the individual pieces.
> write by hyperslabs/chunks/patterns
is, I think, what you want.
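As a rough illustration of the per-process hyperslab selection mentioned above, the sketch below computes which rows each process would select when a regular 2D array is split into contiguous row blocks. It is pure Python with no MPI or HDF5 calls; the function name `row_hyperslab` and the row-wise decomposition are assumptions made for illustration, not something specified in this thread. In the HDF5 C API, start and count arrays of this kind are what gets passed to H5Sselect_hyperslab.

```python
def row_hyperslab(nrows, ncols, rank, nprocs):
    """Return (start, count) for the block of rows owned by `rank`.

    Hypothetical helper: rows are divided as evenly as possible, and
    every rank takes all columns of its rows.
    """
    base, extra = divmod(nrows, nprocs)
    # The first `extra` ranks take one additional row each.
    my_rows = base + (1 if rank < extra else 0)
    first_row = rank * base + min(rank, extra)
    return (first_row, 0), (my_rows, ncols)


if __name__ == "__main__":
    # Example: a 10 x 6 array shared by 4 processes.
    for rank in range(4):
        start, count = row_hyperslab(10, 6, rank, 4)
        print(f"rank {rank}: start={start} count={count}")
```

Because the blocks do not overlap and together cover every row, each process can write its piece of the one shared dataset without coordinating the data itself with the others.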
Writing one dataset per process was something I wanted - an example of why might be most illustrative...
Suppose I'm working in ParaView and have done some work in parallel on multi-block data, where each process has a different block from the multi-block structure. The blocks might be geometrically diverse (e.g. tetrahedra on one process, prisms on another). I want to write out my current state, but don't want to do a collective write to one dataset. I really want to write each block out independently, but all to the same file.
Because each process has no idea what the others have got, I needed a way to gather the info and create the 'structure', then write.
In the general case it'll be slower (physically more writes to disk), but for the purposes of organisation, much tidier.
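The gather, create, then write pattern described above can be sketched roughly as follows. This is a pure-Python simulation and not the libh5mb API: a list indexed by rank stands in for an MPI gather, a dict stands in for the HDF5 file, and the names `plan_layout` and `simulate_write` are hypothetical. The "/procNNN/dataset" path scheme is borrowed from the naming mentioned elsewhere in this thread.

```python
def plan_layout(gathered):
    """Build the dataset plan from metadata gathered from all ranks.

    `gathered[rank]` is a list of (name, shape) pairs. Every rank that
    receives the same gathered list derives the same plan.
    """
    plan = {}
    for rank, blocks in enumerate(gathered):
        for name, shape in blocks:
            plan[f"/proc{rank:03d}/{name}"] = shape
    return plan


def simulate_write(local_blocks_per_rank):
    """Simulate the two passes; `local_blocks_per_rank[rank]` is a list of
    (name, data) pairs that only that rank knows about at the start."""
    # Pass 1: exchange metadata only (names and sizes, not the data).
    gathered = [[(name, (len(data),)) for name, data in blocks]
                for blocks in local_blocks_per_rank]
    # All ranks now agree on the structure, so it can be created up front.
    plan = plan_layout(gathered)
    file = {path: [None] * shape[0] for path, shape in plan.items()}
    # Pass 2: each rank fills in only its own datasets, independently.
    for rank, blocks in enumerate(local_blocks_per_rank):
        for name, data in blocks:
            file[f"/proc{rank:03d}/{name}"][:] = data
    return file


if __name__ == "__main__":
    result = simulate_write([
        [("tets", [1, 2, 3])],                  # rank 0 holds tetrahedra
        [("prisms", [4, 5]), ("points", [6])],  # rank 1 holds other blocks
    ])
    for path in sorted(result):
        print(path, result[path])
```

The point of the first pass is that only small metadata crosses process boundaries; the bulk data is written by its owner.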
JB
From: hdf-forum-bounces@hdfgroup.org [mailto:hdf-forum-bounces@hdfgroup.org] On Behalf Of Matthieu Dorier
Sent: 25 February 2011 10:10
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] multi-pass IO (was chunking)
Hello John (and others, since maybe other people can answer the following questions)
Your library seems very interesting and I will probably use it in my project. Yet I have a question: what would be the best way of writing one file from all the processes together (in terms of write latency), knowing the data layout (regular 2D arrays),
- using the classic PHDF5 library and write by hyperslabs/chunks/patterns
- or using your library to split a dataset into "/procNNN/dataset"? It seems to me that writing regular patterns can benefit from MPI-IO's particular optimizations, but maybe I misunderstood the goal of your library?
Thank you,
Matthieu
2011/2/25 Mark Miller <miller86@llnl.gov>
John,
This is awesome! Thanks so much for putting it up.
I really wish the HDF5 Group had decided a long while ago to make this
kind of thing available UNDER the HDF5 API via...
a) adding either an H5Xcreate_deferred for any part, X, of the API or
adding a property to X's create property list to indicate a
desire for deferred creation
Any object so created cannot be acted upon until subsequent
H5Xsync_deferred()...
b) H5Xsync_deferred() function to synchronize all deferred created
objects.
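To make the proposed semantics concrete, here is a toy Python model of deferred creation. It is purely illustrative: H5Xcreate_deferred and H5Xsync_deferred are suggestions from this message and do not exist in HDF5, and the class `DeferredFile` and its methods are hypothetical names. The one rule modelled is that an object requested with a deferred create cannot be acted upon until the sync has run.

```python
class DeferredFile:
    """Toy in-memory stand-in for a file with deferred object creation."""

    def __init__(self):
        self._pending = {}   # requested, but not yet created
        self._objects = {}   # created and usable

    def create_deferred(self, path, length):
        # Only record the request; nothing is created yet.
        self._pending[path] = length

    def sync_deferred(self):
        # Create every pending object in a deterministic (sorted) order.
        for path, length in sorted(self._pending.items()):
            self._objects[path] = [0] * length
        self._pending.clear()

    def write(self, path, values):
        if path not in self._objects:
            raise RuntimeError(f"{path} is not usable before sync_deferred()")
        target = self._objects[path]
        if len(values) != len(target):
            raise ValueError("size mismatch")
        target[:] = values

    def read(self, path):
        return list(self._objects[path])


if __name__ == "__main__":
    f = DeferredFile()
    f.create_deferred("/a", 3)
    try:
        f.write("/a", [1, 2, 3])
    except RuntimeError as err:
        print("rejected:", err)
    f.sync_deferred()
    f.write("/a", [1, 2, 3])
    print(f.read("/a"))
```
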
But, in spite of numerous suggestions over many years that it'd be good
for parallel applications to be able to do this, it still hasn't found
its way into the HDF5 library proper 
It's so nice to see someone offer a suitable alternative
Mark
On Thu, 2011-02-24 at 14:39, Biddiscombe, John A. wrote:
The discussion about chunking and two-pass VFDs reminded me that I intended to make available a small library for doing independent dataset creates on a per-process basis. It was created some time ago and used extensively on one project, but is not currently in use.
I've tidied the code up a bit and uploaded it to the following page
https://hpcforge.org/plugins/mediawiki/wiki/libh5mb/index.php/Main_Page
the source code is available via the SCM link.
Some brief notes on the library are shown on the wiki page, but the actual API is probably best described in the H5MButil.h file. I created the wiki page very quickly, so apologies if the content is unclear; please let me know if it needs improvement.
Hopefully someone will find the code useful.
JB
--
Mark C. Miller, Lawrence Livermore National Laboratory
================!!LLNL BUSINESS ONLY!!================
miller86@llnl.gov urgent: miller86@pager.llnl.gov
T:8-6 (925)-423-5901 M/W/Th:7-12,2-7 (530)-753-8511
_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
--
Matthieu Dorier
ENS Cachan, antenne de Bretagne
Département informatique et télécommunication
http://perso.eleves.bretagne.ens-cachan.fr/~mdori307/wiki/