HDF5 for scientists: h5py, HDF5/JSON, and storing linked data in HDF5 files - Call the Doctor September 20, 2022

In our next Call the Doctor session on Tuesday, September 20th, 1:00 p.m. Central Time (US/Canada), senior informatics architect Aleksandar Jelenak (@ajelenak) will take over the hosting duties. Aleksandar has a background as a scientist working with geospatial and geoscience data, and now works at The HDF Group to make HDF5 and related tools more effective for other scientists. On Tuesday, he will talk about h5py, HDF5/JSON, and storing linked data in HDF5 files. Bring your questions (drop them in the Google Doc ahead of time if you can) or just tune in to learn more about these tools from Aleksandar’s unique perspective.

Register to join us!

Hello, at the end of this session @ajelenak mentioned an example HDF5 file containing linked data and said it would be uploaded to the forum. However, I can’t seem to find this file. Can you help me? Thanks

Here’s the recording from the September 20th session of Call the Doctor. (Our YouTube stats show this has been a very popular session!)

Hi @mathias.vandenauweel,

The owner of that file still has not responded to my request to allow sharing it. I will ask again.

-Aleksandar

Thanks! I’m mainly interested in how to use linked data, or URIs, inside the HDF5 format. What are the best practices? I can’t seem to find much information on that topic on the web.

Hello @mathias.vandenauweel!

Hope it’s not too late… here’s the high-level documentation describing how linked data (RDF triples) are stored: https://docs.allotrope.org/ADF%20Quad%20Store%20API.html#architecture.

I am interested in a best practice for storing linked data in HDF5 rather than each user community developing its own. If you have some ideas and are interested, let’s see how we can kick-start a discussion.

-Aleksandar

Hello @ajelenak, the link you provide should be sufficient for our use case, thank you! I see no reason to deviate from those rules.
I think my primary use case will not need an HDF5 triple store approach. I can keep the data in the normal table structure. But I was mainly concerned about the metadata. If we use the linked data URI approach in the metadata, then that will be optimal.
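
For anyone reading along, here is a minimal sketch of what the "URIs in the metadata" idea could look like with h5py. It is only an illustration, not an established convention and not the ADF layout: the dataset name, the attribute names (`units_uri`, `quantity_uri`), and the `example.org` URIs are all made up for this example. The data stays in an ordinary dataset, and the URIs are stored as plain string attributes next to it.

```python
import h5py
import numpy as np

# Hypothetical vocabulary URIs, used only for illustration.
UNIT_URI = "https://example.org/vocab/unit/Kelvin"
QUANTITY_URI = "https://example.org/vocab/quantity/Temperature"


def annotate(dset, unit_label, unit_uri, quantity_uri):
    """Attach a human-readable unit plus linked-data URIs as attributes.

    The attribute names are an assumption of this sketch, not a standard.
    """
    dset.attrs["units"] = unit_label
    dset.attrs["units_uri"] = unit_uri
    dset.attrs["quantity_uri"] = quantity_uri


# driver="core" with backing_store=False keeps the file in memory,
# so nothing is written to disk when running this sketch.
with h5py.File("linked_metadata_sketch.h5", "w",
               driver="core", backing_store=False) as f:
    # The data itself stays in a normal dataset.
    dset = f.create_dataset("temperature",
                            data=np.array([271.3, 272.1, 270.8]))
    annotate(dset, "K", UNIT_URI, QUANTITY_URI)

    # Read the metadata back as a plain dictionary.
    stored = {name: value for name, value in dset.attrs.items()}
    values = dset[...]
```

Keeping a short human-readable attribute (`units`) alongside the URI means tools that know nothing about linked data can still display something useful, while tools that do can resolve the URI.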