Hi!
First, thanks for creating HDF5, which is incredibly helpful for so many people!
I'm currently working with hyperspectral images. We have a camera that writes one frame at a time into a rank-3 HDF5 dataset; the slowest-varying index of the dataset is the frame number. To avoid corrupt files, we currently split each recording (think of a video recording) into separate HDF5 files of approx. 1 GB each (configurable). Working with the split files/datasets is doable, but obviously less elegant than putting one big dataset into one big file.
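For reference, our writer loop is essentially the following (a simplified h5py sketch; the dataset name, dtype, and frame dimensions are illustrative). Flushing after every frame narrows the window for data loss, but as far as we can tell it does not make the file crash-proof, hence the splitting:

```python
import h5py
import numpy as np

H, W = 256, 320  # illustrative frame dimensions

f = h5py.File('rec_000.h5', 'w', libver='latest')
# Resizable, chunked dataset; the frame number is the slowest-varying index.
frames = f.create_dataset('frames', shape=(0, H, W), maxshape=(None, H, W),
                          chunks=(1, H, W), dtype='u2')

def write_frame(frame):
    n = frames.shape[0]
    frames.resize(n + 1, axis=0)  # grow along the frame axis
    frames[n] = frame
    f.flush()                     # flush to disk after every frame

write_frame(np.zeros((H, W), dtype='u2'))
```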
- Is there a way to ensure the integrity of partial recordings (against power loss, software crashes, you name it) without splitting them into all these small files?
- Is there a way to create a "master file" that uses symbolic/external links to "link together" all the datasets (one per file) into something that looks like a single dataset to an HDF5 user (h5py, MATLAB, ...)? I've noticed the file "drivers" [1] that talk about split files, but I'm uncertain whether each sub-file would be a valid HDF5 file on its own. H5FD_MULTI superficially looks like what we need. (See the sketch after this list.)
- Can virtual datasets [2] be used from older (1.8.x) clients? Would they work for this purpose?
- Or are we missing some great idea or feature in HDF5?
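To make the linking questions concrete: we know we can expose each sub-file's dataset in a master file via external links, e.g. `master['part_000'] = h5py.ExternalLink('rec_000.h5', '/frames')`, but that still leaves N separate datasets rather than one. What we would like is something along the lines of this virtual-dataset sketch (file names and frame counts are illustrative; we assume a fixed number of frames per sub-file for simplicity):

```python
import h5py

H, W = 256, 320         # illustrative frame dimensions
frames_per_file = 1000  # assumed fixed per sub-file for simplicity
files = ['rec_%03d.h5' % i for i in range(10)]

# Map each sub-file's 'frames' dataset into one big virtual dataset.
layout = h5py.VirtualLayout(shape=(len(files) * frames_per_file, H, W),
                            dtype='u2')
for i, fname in enumerate(files):
    vsource = h5py.VirtualSource(fname, 'frames',
                                 shape=(frames_per_file, H, W))
    layout[i * frames_per_file:(i + 1) * frames_per_file] = vsource

with h5py.File('master.h5', 'w', libver='latest') as m:
    m.create_virtual_dataset('frames', layout, fillvalue=0)
```

Reading master.h5 would then look like reading a single rank-3 dataset, which is exactly what we want; the open question is whether 1.8.x clients could consume it.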
Cheers,
Paul
[1] <https://support.hdfgroup.org/HDF5/Tutor/filedrvr.html#predef>
[2] <https://support.hdfgroup.org/HDF5/docNewFeatures/NewFeaturesVirtualDatasetDocs.html>