File offset location of

I’m very new to the HDF5 format and was wondering whether the following is possible. After reviewing the documentation, I think the answer is “No”, but I wanted someone with more knowledge to give a definitive answer.
We have a hierarchical structure defined for a product we’re ingesting, and it contains a compound datatype made up only of name/value pairs. The byte location of these fields will differ from file to file even though the files share the same metadata hierarchy, right? We need to ingest this data quickly. The thought was that if the byte location were 1) near the beginning of the file and 2) at a constant offset, we could transfer just a few KB or MB to the function reading it and manually unpack the bytes. These files can be several GB in size, and transferring an entire file takes a significant amount of time.

Please check out H5Dget_offset(hid) and let me know whether there is a significant difference between using this call and a standard H5Dread(), and if so, under what conditions. (I have not used it myself so far.)
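As a rough illustration of that suggestion, here is a minimal Python sketch using h5py, whose low-level `DatasetID.get_offset()` wraps `H5Dget_offset`. The dataset name `meta`, the record layout in `RECORD_DTYPE`, and the helper functions are all hypothetical. It assumes a contiguous, unfiltered dataset; for chunked or compact storage HDF5 reports no single offset and h5py returns `None`.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical compound layout: a fixed-width name and a float value.
RECORD_DTYPE = np.dtype([("name", "S16"), ("value", "<f8")])


def write_example_file(path, records):
    """Create a small file with one contiguous compound dataset named 'meta'."""
    with h5py.File(path, "w") as f:
        f.create_dataset("meta", data=records)


def dataset_offset(path, dataset_name):
    """Return the dataset's byte address in the file, or None if there is none."""
    with h5py.File(path, "r") as f:
        # DatasetID.get_offset() is h5py's wrapper around H5Dget_offset.
        return f[dataset_name].id.get_offset()


def read_raw_records(path, offset, count, dtype=RECORD_DTYPE):
    """Bypass the HDF5 library: seek to the offset and unpack the bytes."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        buf = fh.read(count * dtype.itemsize)
    return np.frombuffer(buf, dtype=dtype, count=count)


if __name__ == "__main__":
    records = np.array([(b"alpha", 1.5), (b"beta", -2.0)], dtype=RECORD_DTYPE)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.h5")
        write_example_file(path, records)
        offset = dataset_offset(path, "meta")
        print("offset:", offset)
        print(read_raw_records(path, offset, len(records)))
```

Note that the offset still has to be looked up per file by opening it with the HDF5 library, so this does not by itself give a constant location that can be hard-coded.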


I don’t think this is going to work for my use case. I’m trying to process some parts of the metadata without having to download the entire file from the Amazon S3 bucket where it resides.

You can use the ros3 VFD to read HDF5 files on S3. See: https://www.hdfgroup.org/solutions/cloud-amazon-s3-storage-hdf5-connector/.
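For the "transfer only a few KB" idea on its own, S3 also honors the standard HTTP `Range` header, so a known byte range can be fetched without downloading the whole object. The sketch below uses only the Python standard library; the function names and the URL are hypothetical, and it assumes the object is publicly readable or reachable through a presigned URL (no request signing is done here).

```python
import urllib.request


def build_range_request(url, start, length):
    """Build a GET request for `length` bytes beginning at byte `start`."""
    end = start + length - 1  # the end position in a Range header is inclusive
    return urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})


def fetch_range(url, start, length):
    """Fetch only the requested byte range (performs a network call)."""
    with urllib.request.urlopen(build_range_request(url, start, length)) as resp:
        return resp.read()


if __name__ == "__main__":
    # Hypothetical object URL; only the request is built, nothing is fetched.
    req = build_range_request("https://example-bucket.s3.amazonaws.com/data.h5", 0, 1024)
    print(req.get_header("Range"))
```

This only helps once the byte range is known, which is the part that varies between HDF5 files.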