I work with an application that logs data over time.
I don't know in advance how much data will be logged, and I would prefer to write directly to the HDF5 file.
So I am looking for a streaming solution.
What options does HDF5 offer for this?
I know that one option is to use a packet table dataset (in C++ you can call H5PTcreate).
But this call isn't available in the latest HDF.PInvoke package.
Have a look at this pull request (and the discussion): https://github.com/HDFGroup/HDF.PInvoke/pull/128
The problem is that the APIs in hdf5_hl.dll aren't thread-safe, and we didn't want to create more confusion. If someone wants to do that work and write a few tests, we'd be happy to accept a pull request.
To continuously add (i.e. stream) data into an HDF5 file without knowing how many rows of data exist a priori, you need to use an extendible dataset.
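To make the idea concrete without tying it to any particular binding, here is a toy in-memory model in C++ of what "extendible" means in practice. It is only a sketch: `ExtendibleDataset`, `Record`, `extend_by`, and `append` are hypothetical names invented for illustration, not HDF5 calls. It assumes a 1-D dataset of fixed-size records and mirrors the fact that HDF5 requires chunked storage for a dataset with an unlimited dimension, so growing the dataset adds chunks rather than moving rows that were already written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <vector>

// One logged sample: UNIX epoch seconds plus a measured value.
struct Record {
    std::int32_t timestamp;
    float reading;
};

// Toy stand-in for a chunked, extendible 1-D dataset (NOT the HDF5 API).
class ExtendibleDataset {
public:
    explicit ExtendibleDataset(std::size_t chunk_rows) : chunk_rows_(chunk_rows) {
        assert(chunk_rows_ > 0 && "chunk size must be positive");
    }

    // Grow the logical row count; allocate new chunks only when needed.
    void extend_by(std::size_t extra_rows) {
        rows_ += extra_rows;
        const std::size_t needed = (rows_ + chunk_rows_ - 1) / chunk_rows_;
        while (chunks_.size() < needed) {
            chunks_.push_back(std::make_unique<Record[]>(chunk_rows_));
        }
    }

    // Write one record at a row that lies inside the current extent.
    void write(std::size_t row, const Record& r) {
        if (row >= rows_) throw std::out_of_range("row beyond current extent");
        chunks_[row / chunk_rows_][row % chunk_rows_] = r;
    }

    // Read one record back from a row inside the current extent.
    const Record& read(std::size_t row) const {
        if (row >= rows_) throw std::out_of_range("row beyond current extent");
        return chunks_[row / chunk_rows_][row % chunk_rows_];
    }

    std::size_t rows() const { return rows_; }
    std::size_t chunk_count() const { return chunks_.size(); }

private:
    std::size_t chunk_rows_;
    std::size_t rows_ = 0;
    std::vector<std::unique_ptr<Record[]>> chunks_;
};

// Append = extend by one row, then write into the new last row.
inline void append(ExtendibleDataset& d, const Record& r) {
    d.extend_by(1);
    d.write(d.rows() - 1, r);
}
```

In a real file the same two steps (extend the dataset, then write into the new last row) would be carried out by whatever extend and selection-write calls your HDF5 binding provides; the chunk size here plays the role of the dataset's chunk dimensions.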
If you are not bound to a particular library, take a look at HDFql. As an example, your use-case could be solved as follows using this library in C# (we assume that a function named acquire exists which populates variable values with a timestamp (UNIX Epoch Time) and an acquired value):
using System.Runtime.InteropServices;
using AS.HDFql;

[StructLayout(LayoutKind.Sequential, Pack = 0)]
struct Data
{
    public int timestamp;
    public float reading;
}

public class Example
{
    public static void Main(string[] args)
    {
        Data[] values = new Data[1];
        int number;

        // create an HDF5 file named 'log.h5'
        HDFql.Execute("CREATE FILE log.h5");

        // use (i.e. open) HDF5 file 'log.h5'
        HDFql.Execute("USE FILE log.h5");

        // create a dataset named 'dset' of data type compound composed of two members named 'timestamp' (of data type int containing a UNIX Epoch Time)
        // and 'reading' (of data type float containing an acquired value). The dataset starts with 0 rows and can grow (i.e. be extended) in an unlimited fashion
        HDFql.Execute("CREATE DATASET dset AS COMPOUND(timestamp AS INT, reading AS FLOAT)(0 TO UNLIMITED)");

        // register variable 'values' for subsequent use (by HDFql)
        number = HDFql.VariableRegister(values);

        // call hypothetical function 'acquire' that populates variable 'values' with a timestamp and an acquired value
        while (acquire(values))
        {
            // alter (i.e. change) dimension of dataset 'dset' to +1 (i.e. add a new row at the end of 'dset')
            HDFql.Execute("ALTER DIMENSION dset TO +1");

            // insert (i.e. write) data from variable 'values' into the last row of dataset 'dset' (thanks to a point selection)
            HDFql.Execute("INSERT INTO dset(-1) VALUES FROM MEMORY " + number);
        }
    }
}