The simplest case when the data is a few hundred megabyte – slurping it up in a single IO operation will not have impact on the application performance; and you have no intention to extend the data set. Adding complexity will slow you down implementing it (perusing docs, etc…) with little to no benefit:
100x100x500 @double
is 5x10^6 x sizeof(double)
which is 40MB
on a 5 yo laptop this should take about 1ms
on a newer one with nvme I ballpark it to 0.2ms
.
Alternatively you are dealing with large datasets. There are two sub-cases here:
- there is a deadline, soft or hard
- total utilisation of IO bandwidth
Either way you need to use chunks, and for optimal performance have to factor in where the data is located. For Ethernet or local hardrive 1MB
chunk size seems to be good option; whereas infiniband like fabrics may prefer smaller chunks.
When using chunks you are manipulating the data layout, the elements within a chunk are adjacent/contiguous which makes it cheap to access as a unit. The typical example is a massive in-memory matrix scanned by rows, then by columns – it turns out this makes a difference. (actually on CUDA and similar architecture this makes a big difference and is called coalescing
In your case you have to be sure you pack as much related data in a single chunk as possible – and the chunk size is roughly 512KB - 1MB
. – To lower IO latency you can go down to the kernel page size, which may be architecture dependent: 4kB - 64kB
, on my linux it is 4kB
:
cd /proc/1
sudo grep -i pagesize smaps
So for I opened up a few considerations, and didn’t giv emuch thought to syntax, which should be similar to python h5py. Luckily few years ago @gheber convinced me of the advantage of pythonic
syntax, so the following C++ example gets as close to it as possible:
#include <armadillo> // I am using linear algabra library here, as it has matrix
#include <h5cpp/all>
int main(){
arma::mat M(100,100); // http://arma.sourceforge.net/docs.html
h5::fd_t fd = h5::create("some_file_name.h5",H5F_ACC_TRUNC);
// this is an extendable data set, with initial size, compressed with gzip,
// and set to fill value `-1.0`
h5::ds_t ds = h5::create<double>(fd,"/path/to/dataset"
,h5::current_dims{1,100,100}
,h5::max_dims{H5S_UNLIMITED,100,100}
,h5::chunk{1,100,100} | h5::fill_value<double>{-1.0} | h5::gzip{9});
h5::write( ds, M, h5::offset{0,0,0});
} // RAII will close all resources
The above example is from H5CPP examples: linear algebra, and in real applications the fill value is rarely set. Also for single shot write there is no reason to create a data set upfront, much cleaner this way:
#include <armadillo> // I am using linear algabra library here, as it has matrix
#include <h5cpp/all>
int main(){
arma::mat M(100,100); // http://arma.sourceforge.net/docs.html
h5::fd_t fd = h5::create("some_file_name.h5",H5F_ACC_TRUNC);
// this is identical with the previous example
h5::ds_t ds = h5::write(fd,"/path/to/dataset", M,
h5::current_dims{1,100,100}, h5::offset{0,0,0},
h5::chunk{1,100,100} ,h5::max_dims{H5S_UNLIMITED,100,100});
// NOT identical, but practical for single shot, non-extendable datasets
h5::write(fd,"/some/other dataset", M);
} // RAII will close all resources
Idiomatically shows single shot create + write
operation for two distinct cases, but in some cases you may need to record data from a stream: sensor networks, stock market, camera, set of microphones, …
#include <armadillo> // I am using linear algabra library here, as it has matrix
#include <h5cpp/all>
int main(){
size_t nframes=1'000'000;
arma::mat M(100,100); // http://arma.sourceforge.net/docs.html
h5::fd_t fd = h5::create("some_file_name.h5",H5F_ACC_TRUNC);
h5::pt_t pt = h5::create<double>(fd, "stream of matrices",
h5::max_dims{H5S_UNLIMITED,nrows,ncols}, h5::chunk{1,100,100} );
// actual code, you may insert arbitrary number of frames: nrows x ncols
for( int i = 0; i < nframes; i++)
h5::append(pt, M);
} // RAII will close all resources
With the exception of packet table h5:pt_t
all resources/descriptors are binary compatible with the HDF5 CAPI, in other words you can pass appropriate hid_t
to H5CPP IO operators, or versa, H5CPP descriptors to CAPI function calls.
H5CPP is an MIT licensed project with LLVM based optional compiler assisted introspection, . You find the documentation here and download it from my github page.
best wishes: steven