A previous post mentioned that it may become possible to have independent datasets for independent processes, reducing the need for collective calls such as when extending a dataset.
That post is now 10 years old. Has anything changed on this front? The link in the previous post is no longer working.
Full story
I have two separate processes generating data: one generates values rapidly from an oscilloscope, the other generates larger data more slowly from a camera. Therefore they are looping at different speeds.
I would like to write the data from each process into its own dataset, which includes extending the dataset, a collective call. Since the two processes do not interact with each other's dataset, can this be made independent?
Mainly this is desired to save time by keeping the processes from waiting on each other.
Did you consider using ZeroMQ to collect your events and record them? If you have interest, let me know…
best: steve
MWE: mpi-extend
creates a container and rank-many datasets with random sizes using collective calls. Once the rig is set up, by default it extends a single dataset with a collective call to demonstrate fitness of the rig, but not the actual problem. By adjusting/uncommenting lines marked with NOTE: one can trigger the code relevant to the OP's question.
identification:
The MWE indeed shows that, for parallel HDF5 1.10.6, all processes must participate in the H5Dset_extent call; otherwise the program hangs indefinitely.
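For reference, a stripped-down sketch of the collective pattern (not the full MWE; file and dataset names are made up, error checking omitted):

```c
/* Minimal sketch of the collective-extend requirement (hypothetical names).
 * Compile against a parallel HDF5 build. */
#include <hdf5.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Open the shared file with the MPI-IO VFD. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("mpi-extend.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* One chunked, unlimited 1-D dataset. */
    hsize_t dims[1] = {0}, maxdims[1] = {H5S_UNLIMITED}, chunk[1] = {1024};
    hid_t space = H5Screate_simple(1, dims, maxdims);
    hid_t dcpl  = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 1, chunk);
    hid_t dset = H5Dcreate2(file, "ScopeData", H5T_NATIVE_DOUBLE, space,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    /* H5Dset_extent is collective: EVERY rank must make this call with the
     * same new size, even ranks that never write to this dataset.
     * If only rank 0 called it, the program would hang. */
    hsize_t newdims[1] = {4096};
    H5Dset_extent(dset, newdims);

    H5Dclose(dset); H5Pclose(dcpl); H5Sclose(space);
    H5Fclose(file); H5Pclose(fapl);
    MPI_Finalize();
    return 0;
}
```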
I'm not sure I understand the Full story. When you say
do you mean processes in general, or MPI processes, separate MPI applications, or something else?
Are the processes independent in the sense that no synchronization is needed?
OK, they acquire data at different rates, and you would like to put the data into different data sets.
What kind of data rates are you looking at, and what's the connectivity? Are the devices hooked up to the same host? Local storage or NAS?
Dataset extension a la H5Dset_extent is collective, provided we are using MPI. But then we are jumping to threads:
Threads, processes, …, which one?
If they are independent, why not write the datasets into different HDF5 files and then stitch them together with a stub that contains just two external links? Or you can just merge/repack, if you need a single physical file (and compress or make them contiguous, depending on what you are looking for). If you don't like the (temporary) duplication, then Steven's ZMQ is a fine option.
If you have time, this sounds like another good topic for our HDF Clinic (next Tuesday).
Nice figure! (yEd?) Assuming I understand it correctly, my main concern would be the tight coupling between the two processes. (You were planning to use the MPI-IO VFD, right?) For example, at the moment, the dataset extension must be a collective operation. In other words, the "Oscilloscope" rank(s) must participate in the extension of the "Cam Data" dataset and the "Camera 1" rank(s) must participate in the extension of the "Scope Data" dataset, at least as long as those datasets reside in the same file. You could get around this by placing those datasets into separate HDF5 files and then just have two external links in "Rec1.h5". In that case, you wouldn't need the MPI-IO VFD, unless there were multiple "oscilloscope" or "camera" ranks. In that case, you should consider opening those files on separate MPI communicators. Does that make sense? G.
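Here is a rough sketch of what that stub could look like; the per-recording file and dataset names are just placeholders:

```c
/* Sketch: "Rec1.h5" contains nothing but two external links pointing at the
 * datasets written independently by the two processes (file/dataset names
 * are hypothetical). */
#include <hdf5.h>

int main(void)
{
    hid_t stub = H5Fcreate("Rec1.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    /* (external file, object path in that file) -> link name in the stub */
    H5Lcreate_external("rec1_scope.h5", "/ScopeData",
                       stub, "ScopeData", H5P_DEFAULT, H5P_DEFAULT);
    H5Lcreate_external("rec1_cam.h5", "/CamData",
                       stub, "CamData", H5P_DEFAULT, H5P_DEFAULT);

    H5Fclose(stub);
    return 0;
}
```

Opening "/ScopeData" in Rec1.h5 then resolves transparently to the dataset stored in the scope process's own file.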
MPI-IO VFD: Yes (I think…). I have compiled the parallel version of HDF5 using MS MPI.
Using a file per process
If we were to use a separate file per process, would we still need a PHDF5 build, or would we want a standard HDF5 build? Assuming we want a driver per process, how do we make sure there is no interference?
File structure
We are planning to do a lot of sequential recordings and then access them all through virtual datasets (VDS).
My main concern was keeping the directory as clean as possible and making it easy to access recordings individually, hence my desire to use a single file.
External Links
I think external links might be a solution, or part of one. We could still access each recording layer via a single file.
Would we still have two separate files? I take it we would end up with a structure like below:
and so a single entry point using Virtual files and External Paths would look something like:
Virtual datasets
However, could we just use a virtual dataset generated after the recording sequence? Doubling the number of files in the dir isn't so bad… and it looks like it's happening either way.
Apart from the obvious lack of unified datasets… are there any inherent advantages/disadvantages to using external links rather than using virtual datasets directly?
a. Do either of these options affect IO speed for chunked datasets?
b. How will either of these fail?
When the file is missing.
When a file exists but is corrupted.
Specifically, will this completely break (and require restructuring to be used without the missing/corrupted file)?
Yes, no problem. You can use MPI and even MPI-IO (without HDF5) and use the sequential HDF5 VFDs (POSIX, core, etc.). File per process is fine. Unless you are using the MPI-IO VFD, you can just go with the non-parallel build.
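A minimal sketch of the file-per-process pattern with the default sequential driver (names are made up; MPI is only used here to get a unique rank for the file name):

```c
/* Sketch: file per process with the sequential (default, POSIX "sec2") VFD. */
#include <hdf5.h>
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char name[64];
    snprintf(name, sizeof(name), "rec1_rank%d.h5", rank);

    /* Default FAPL => sequential driver; each rank touches only its own file,
     * so there are no collective calls and no waiting on other ranks. */
    hid_t file = H5Fcreate(name, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    /* ... create/extend/write this rank's datasets at its own pace ... */

    H5Fclose(file);
    MPI_Finalize();
    return 0;
}
```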
I'm glad to see you are taking the time to plan and think these things through. You clearly have a data life cycle in mind (planning, acquisition, pre-processing, analysis, …) and, as you've seen already, different stages often have competing requirements. Keep in mind that HDF5 alone can't solve all problems for you.
To keep the directory as clean as possible is a good goal, but does that mean "at all times" and "at all costs," or "eventually"? Planning for how you will manage and evolve your assets is part of the life cycle. Sometimes a simple tool such as h5repack does the trick; sometimes you'll need more than that.
I'm not sure I understand your concept of "virtual files." Do you mean virtual datasets (VDS)? (VDS is the ability to have datasets that are backed by dataset elements of a compatible type stored in other datasets, including datasets in other files.) Assuming that's the case, the differences would be mostly in how you locate and access the data. With external links you can quickly locate individual recordings by following the link structure. With VDS, since you are dealing with combined datasets, you'd need a separate look-up structure to tell you which hyperslab or selection represents a particular recording. (You can query the VDS, but a separate lookup might be more convenient.) On the upside, reading across recordings, if that were a use case for you, would be very straightforward with VDS (single read), but a little more cumbersome with external links (iterations + read ops).
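If it helps, here is a rough VDS sketch that stitches two (hypothetical) per-recording source files into one combined dataset; real code would query the actual source extents instead of hard-coding them:

```c
/* Sketch: one virtual dataset mapped onto two recordings (hypothetical names). */
#include <hdf5.h>

int main(void)
{
    const hsize_t n_rec = 1000;           /* elements per recording (assumed) */
    hsize_t vdims[1] = {2 * n_rec};       /* combined virtual extent          */
    hsize_t sdims[1] = {n_rec};

    hid_t vspace = H5Screate_simple(1, vdims, NULL);
    hid_t sspace = H5Screate_simple(1, sdims, NULL);
    hid_t dcpl   = H5Pcreate(H5P_DATASET_CREATE);

    /* Map recording 1 to the first half, recording 2 to the second half. */
    hsize_t start[1] = {0}, count[1] = {n_rec};
    H5Sselect_hyperslab(vspace, H5S_SELECT_SET, start, NULL, count, NULL);
    H5Pset_virtual(dcpl, vspace, "rec1_scope.h5", "/ScopeData", sspace);

    start[0] = n_rec;
    H5Sselect_hyperslab(vspace, H5S_SELECT_SET, start, NULL, count, NULL);
    H5Pset_virtual(dcpl, vspace, "rec2_scope.h5", "/ScopeData", sspace);

    hid_t file = H5Fcreate("all_recordings.h5", H5F_ACC_TRUNC,
                           H5P_DEFAULT, H5P_DEFAULT);
    hid_t vds  = H5Dcreate2(file, "ScopeData", H5T_NATIVE_DOUBLE, vspace,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    H5Dclose(vds); H5Fclose(file);
    H5Pclose(dcpl); H5Sclose(sspace); H5Sclose(vspace);
    return 0;
}
```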
I wouldn't expect a huge performance difference. Access to the dataset elements (in chunks) is the same in both cases; any difference would have to come from the underlying metadata operations. Maybe link traversals and opening external files might be a tad slower than accessing VDS metadata, but that metadata will be in cache (eventually). You can whip up a quick benchmark if you really wanna know. I think convenience in supporting your use case(s) is more important for now.
VDS might be a little more forgiving with missing data. Corrupted files will be an issue either way. I think defensive programming and proper error handling will go a long way toward creating a robust solution, i.e., one that doesnāt fall apart when it encounters a missing or corrupt file.
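As a sketch of that defensive style (hypothetical names; only standard H5F calls), you could vet each source file before relying on it:

```c
/* Sketch: check a source/linked file before using it, so a missing or corrupt
 * recording degrades gracefully instead of aborting the whole analysis. */
#include <hdf5.h>
#include <stdio.h>

static int recording_usable(const char *fname)
{
    /* H5Fis_hdf5 returns >0 for a readable HDF5 file, 0 if not HDF5,
     * and <0 on error (e.g. the file is missing). */
    htri_t is_h5 = H5Fis_hdf5(fname);
    if (is_h5 <= 0) {
        fprintf(stderr, "skipping %s (missing or not a valid HDF5 file)\n", fname);
        return 0;
    }
    hid_t f = H5Fopen(fname, H5F_ACC_RDONLY, H5P_DEFAULT);
    if (f < 0) {
        fprintf(stderr, "skipping %s (open failed, possibly corrupted)\n", fname);
        return 0;
    }
    H5Fclose(f);
    return 1;
}

int main(void)
{
    const char *recs[] = { "rec1_scope.h5", "rec1_cam.h5" };  /* hypothetical */
    for (int i = 0; i < 2; ++i)
        if (recording_usable(recs[i])) {
            /* ... open and process the recording ... */
        }
    return 0;
}
```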
shared file - a file opened across multiple processes using PHDF5.
(Not sure this is the correct terminology, but this is what I mean.)
Answer
Any modifications to the structure of a shared file must be collective calls made by all processes.
To have several processes modify datasets independently within a shared file is not possible with PHDF5.
There are no plans to implement this functionality.
The real question appears to be: why do you need to do this?
For me the answer was tidiness and ease of access, for post-processing/data analytics.
In which case: do we need it to be tidy all the time, or can we tidy up later?
Potential solutions/workarounds
Clean up after
Create a file per process and unify data afterwards, either via:
VDS: virtual datasets (access all datasets as one continuous dataset)
External links (access a single dataset in one file from another)
h5repack command-line tool, to copy HDF5 files and change chunking/compression
repack yourself, for more complex operations (merging datasets/files, post-processing…; see the sketch below)
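A rough sketch of the "repack yourself" route, copying each per-process dataset into one merged file with H5Ocopy (all names hypothetical):

```c
/* Sketch: merge per-process files into one file after acquisition
 * (hypothetical names; error checking omitted). */
#include <hdf5.h>

int main(void)
{
    hid_t dst = H5Fcreate("Rec1_merged.h5", H5F_ACC_TRUNC,
                          H5P_DEFAULT, H5P_DEFAULT);

    hid_t scope = H5Fopen("rec1_scope.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
    hid_t cam   = H5Fopen("rec1_cam.h5",   H5F_ACC_RDONLY, H5P_DEFAULT);

    /* H5Ocopy copies a whole object (dataset, group, ...) including its data. */
    H5Ocopy(scope, "ScopeData", dst, "ScopeData", H5P_DEFAULT, H5P_DEFAULT);
    H5Ocopy(cam,   "CamData",   dst, "CamData",   H5P_DEFAULT, H5P_DEFAULT);

    H5Fclose(cam); H5Fclose(scope); H5Fclose(dst);
    return 0;
}
```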
Independent processes, single file
Collect data and have a single writer thread/process
MPI - have each process send data to the writer process using MPI_Send
for large data this is slow
Communication with each extra process is done individually: more operations, less speed… (unless some tree-structured collection is used)
MPI w/ windowed memory - share memory between processes, eliminating the redundant copying/sending of MPI_Send (only for local memory on individual nodes?)
MPI is designed for computer clusters: good comms between nodes, etc.
ZeroMQ - asynchronous messaging library, essentially an alternative to MPI (see the sketch after this list)
collect your events and record them
Better for this than MPI? Asynchronous.
Designed for distributed computing:
also has much better fault tolerance (errors do not generate system-wide failure by default…)
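A minimal sketch of the single-writer idea with ZeroMQ PUSH/PULL sockets (endpoint, buffer size, and message framing are made up; the writer would be the only process touching the HDF5 file):

```c
/* Sketch: acquisition processes PUSH raw buffers to one writer process,
 * which PULLs them and alone writes to the HDF5 file. */
#include <zmq.h>

/* --- acquisition side (oscilloscope or camera process) --- */
void send_block(const void *buf, size_t nbytes)
{
    /* In real code, create the context/socket once and reuse them. */
    void *ctx  = zmq_ctx_new();
    void *push = zmq_socket(ctx, ZMQ_PUSH);
    zmq_connect(push, "tcp://127.0.0.1:5555");   /* hypothetical endpoint */
    zmq_send(push, buf, nbytes, 0);
    zmq_close(push);
    zmq_ctx_destroy(ctx);
}

/* --- writer side: the only process that owns the HDF5 file --- */
void writer_loop(void)
{
    void *ctx  = zmq_ctx_new();
    void *pull = zmq_socket(ctx, ZMQ_PULL);
    zmq_bind(pull, "tcp://*:5555");

    static char buf[1 << 20];
    for (;;) {
        int n = zmq_recv(pull, buf, sizeof(buf), 0);
        if (n < 0)
            break;
        /* ... H5Dset_extent + H5Dwrite on the appropriate dataset here ... */
    }
    zmq_close(pull);
    zmq_ctx_destroy(ctx);
}
```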
Outcome
In the name of KISS I have decided to go with the clean-up option, particularly since some post-processing is going to be required anyway…