Hi All,
I've been adding support for HDF5's split vfd to Silo, a library that
runs on top of HDF5 to read and write scientific mesh data.
First, the split vfd is great! It works like a charm. It reduced I/O
requests by 3 orders of magnitude by using core for meta and sec2 for
raw. That's awesome!
Nonetheless, I've run into a number of peculiar issues with the 'file
splitting' aspect of this vfd and wanted to mention them and get feedback.
The overriding issue is that the HDF5 library itself presently does not
know of a file's 'splitness'. Prior to opening the file, an application
using HDF5 has to tell it that the file is split, as well as the
extensions used for the meta and raw parts. I think almost all of my
problems would disappear if HDF5 'knew' a file was split via some kind of
magic information contained in either file (or in any one of the files if
you are using the multi vfd) or, at the very least, in the meta file.
That way, it would be possible to pass to H5Fopen a string that is the
actual name of a file on disk and HDF5 could just 'figure it out'.
A consequence of this is that few if any of the HDF5 tools will be able
to operate on files that have been generated via the split vfd. I am told
h5dump might work but I haven't yet tried what's involved. I have a
solution for cases where only one file is opened at a time.
My Silo library sitting on top of HDF5 interacts with files NOT ONLY
through the HDF5 library but also through system calls (stat, access, ...).
But, the string one must pass to HDF5 to correctly open a split file is
NOT NECESSARILY the name of any actual file. So, all of Silo's system
calls can fail even though there is really a split file there for HDF5
to open. Worse, the string could be the name of an actual file, just not
the ACTUAL file the split vfd will open. Its properties may be entirely
different from those of the file(s) you really want to open.
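
To make the mismatch concrete, here is a minimal sketch in C of how a
layer like Silo might check for a split file with access() on the member
names instead of on the string handed to HDF5. It assumes each on-disk
member name is simply the base name with the extension appended, which is
my reading of the split vfd and not something I have confirmed for every
HDF5 version. The helper names split_member_name and split_file_exists
are hypothetical, not part of Silo or HDF5.

```c
#include <stdio.h>
#include <unistd.h>   /* access(), R_OK */

/* Hypothetical helper: build "<base><ext>" into out.
 * Returns 0 on success, -1 if the result would not fit. */
static int split_member_name(const char *base, const char *ext,
                             char *out, size_t outlen)
{
    int n = snprintf(out, outlen, "%s%s", base, ext);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Hypothetical helper: returns 1 only if BOTH the meta and raw members
 * exist and are readable, 0 otherwise. Note that 'base' itself is never
 * tested; it need not name any file on disk. */
static int split_file_exists(const char *base,
                             const char *meta_ext, const char *raw_ext)
{
    char meta[4096], raw[4096];

    if (split_member_name(base, meta_ext, meta, sizeof meta) != 0 ||
        split_member_name(base, raw_ext, raw, sizeof raw) != 0)
        return 0;

    return access(meta, R_OK) == 0 && access(raw, R_OK) == 0;
}
```

With "foobar.meta" and "foobar.raw" on disk, split_file_exists("foobar",
".meta", ".raw") succeeds even though access("foobar", R_OK) fails, which
is exactly the trap described above.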
Finally, I have software on top of Silo that may open multiple files
generated by different user communities. And, each may use a different
convention for extension names for the meta and raw files of the split
vfd. That means the Silo library has to manage multiple extension
conventions. That isn't too bad by itself. However, if you have
"foobar.meta" and "foobar.raw" generated by application A using one set
of split vfd extensions and then another "foobar.aaa" and "foobar.bbb"
generated by application B using a different set of split vfd extension
conventions, you can no longer ask Silo to open "foobar". But, you also
can't open any of the individual foobar files, as each is only one piece
of a split pair of files.
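
The ambiguity can be sketched in C as a probe over a table of known
extension conventions. This is only an illustration under the same
assumption as before, that a member name is the base name with the
extension appended. The struct split_convention and the function
find_split_convention are hypothetical names, not anything Silo or HDF5
provides.

```c
#include <stdio.h>
#include <unistd.h>   /* access(), R_OK */

/* Hypothetical table entry: one community's extension convention. */
struct split_convention {
    const char *meta_ext;
    const char *raw_ext;
};

/* Returns the index of the single convention for which both members of
 * 'base' exist, -1 if no convention matches, or -2 if more than one
 * matches (the ambiguous "two foobars" case). */
static int find_split_convention(const char *base,
                                 const struct split_convention *conv,
                                 int nconv)
{
    int found = -1;

    for (int i = 0; i < nconv; i++) {
        char meta[4096], raw[4096];
        int nm = snprintf(meta, sizeof meta, "%s%s", base, conv[i].meta_ext);
        int nr = snprintf(raw, sizeof raw, "%s%s", base, conv[i].raw_ext);

        /* Skip names that failed to format or were truncated. */
        if (nm < 0 || nr < 0 ||
            (size_t)nm >= sizeof meta || (size_t)nr >= sizeof raw)
            continue;

        /* Both halves of the pair must be present. */
        if (access(meta, R_OK) != 0 || access(raw, R_OK) != 0)
            continue;

        if (found >= 0)
            return -2;   /* a second convention also matches */
        found = i;
    }
    return found;
}
```

A probe like this can tell the caller that "foobar" is ambiguous, but it
cannot say which pair was meant. That still needs either a hint from the
user or the kind of in-file magic information suggested earlier.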
Puzzling, puzzling...
Mark
--
Mark C. Miller, Lawrence Livermore National Laboratory
================!!LLNL BUSINESS ONLY!!================
miller86@llnl.gov urgent: miller86@pager.llnl.gov
T:8-6 (925)-423-5901 M/W/Th:7-12,2-7 (530)-753-851