Extracting Data from Compressed Datasets using HDFql?


#1

Hello all,

I have an HDF5 file made up of a couple of datasets in the form of:

GROUP: metadata
    DATASET: metadata
GROUP: traces
    DATASET: DItraces
    DATASET: traces

I’ve been able to access datasets like this in my current code, but the datasets in these new files are lzf compressed (written with h5py). Does HDFql support reading compressed datasets? The manual states that it can compress data, but it does not show how to extract data that is already compressed.

I can extract the data with Python’s h5py library, but I’m working on a C/C++ program to access and manipulate it. I could write a Python program to convert the data into a format that works with my program, but I’d rather not add an intermediate processing step and would prefer to do the extraction in the C/C++ program itself.

TL;DR: The datasets in my HDF5 files were written with lzf compression through Python’s h5py library. How can I extract them using HDFql?

Let me know what you all think, thanks in advance!


#2

UPDATE: It seems the lzf filter is written in C, so I will be looking into whether I can use it in tandem with HDFql.

Will post findings.


#3

If you are using C++, you may want to consider H5CPP; presentation slides are here.
Currently only the basic filters are implemented; however, adding filters is trivial: make the necessary C API calls on the property list. The property list descriptors of C++ and the C API (h5::dcpl_t vs. the dcpl hid_t) are automatically converted to one another at compile time.

best: steve