Ways to document how we save data to an HDF5 file

Has anyone come up with a method for documenting how their project stores data in an HDF5 file?

  • What format (Markdown, reStructuredText, HTML, other)?
  • Do you use tables, hierarchical lists, outlines?

Essentially I am trying to document things like:

  • This group will be named “foo” and have 3 attributes of types {String, Int, Float} with values of {“Hello”, 10, 3.14159}. The group can have 2 datasets, named “Bar” and “Baz”, which have …

Things like that, with the ultimate goal that the final rendered documentation is actually easy to read and understand.
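
To make the idea concrete, here is a minimal sketch of one possible approach: keep the layout description as plain data and render it into a Markdown outline. Everything in it is an assumption for illustration only: the dict keys ("attributes", "datasets", "groups"), the `render_group` helper, and the attribute names `attr_1` to `attr_3` (the description above gives only types and values). “Bar” and “Baz” carry no details because none are given.

```python
# Hypothetical layout spec as plain dicts; this is not an official HDF5 schema.
# Attribute names are placeholders: the thread only gives types and values.
SPEC = {
    "name": "foo",
    "attributes": [
        {"name": "attr_1", "type": "String", "value": "Hello"},
        {"name": "attr_2", "type": "Int", "value": 10},
        {"name": "attr_3", "type": "Float", "value": 3.14159},
    ],
    "datasets": [{"name": "Bar"}, {"name": "Baz"}],
    "groups": [],
}


def render_group(group, depth=0):
    """Return Markdown outline lines for one group and its children."""
    pad = "  " * depth
    lines = [f"{pad}- Group `{group['name']}`"]
    for attr in group.get("attributes", []):
        lines.append(
            f"{pad}  - Attribute `{attr['name']}` ({attr['type']}) = {attr['value']!r}"
        )
    for dataset in group.get("datasets", []):
        lines.append(f"{pad}  - Dataset `{dataset['name']}`")
    for child in group.get("groups", []):
        # Nested groups are indented one level deeper.
        lines.extend(render_group(child, depth + 1))
    return lines


markdown = "\n".join(render_group(SPEC))
print(markdown)
```

The same spec could just as easily be rendered as a table or as HTML; keeping the description as data leaves that choice open.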

If anyone has any examples that they have seen for the layout of this kind of documentation, I sure would appreciate it.

Thanks
Mike Jackson

XDMF, https://www.xdmf.org, is an example that uses text files (XML) to describe the data. XML might not be as easy to include in documentation as the formats you mentioned, but some projects convert XML to Markdown, for example.

Maybe HDF5/JSON. See also the section on JSON schema.

G.

One issue with HDF5/JSON for this task is that all HDF5 groups and datasets are represented in a flat structure, which is not what users expect from a hierarchical data format. I tried to address this by proposing a YAML-based approach, described here. This was my little pet project, nothing official, and I did not do anything beyond a draft spec and several examples.
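
A toy illustration of the flat-versus-nested point, assuming nothing about the actual HDF5/JSON or YAML formats: given a flat list of object paths, a few lines of code can rebuild the tree that readers expect to see. The paths (which reuse the names from the question) and the `nest_paths` and `outline` helpers are invented for this sketch.

```python
def nest_paths(paths):
    """Turn flat '/a/b/c' style paths into a nested dict of dicts."""
    tree = {}
    for path in paths:
        node = tree
        for part in path.strip("/").split("/"):
            # Create the child on first sight, then descend into it.
            node = node.setdefault(part, {})
    return tree


def outline(tree, depth=0):
    """Return indented outline lines, children sorted by name."""
    lines = []
    for name in sorted(tree):
        lines.append("  " * depth + name)
        lines.extend(outline(tree[name], depth + 1))
    return lines


# Hypothetical flat listing of objects in a file.
FLAT = ["/foo", "/foo/Bar", "/foo/Baz"]

tree = nest_paths(FLAT)
print("\n".join(outline(tree)))
```

The flat list and the outline carry the same information; the outline is simply closer to how people picture an HDF5 file.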

A related question is whether it is better for users to describe content in terms of domain-specific objects that happen to be encoded in HDF5, rather than having to do this encoding themselves and write it out at the HDF5 level of detail.

-Aleksandar

Yup, it’s all about the audience.

I like the approach taken by h5RDMtoolbox.

G.

I’ve personally heard the same question asked many times, so some kind of solution seems long overdue. Is there (finally) enough interest in the wider HDF community to tackle it and develop something? This is going to be the topic of my Call the Doctor session next Tuesday.

-Aleksandar

I took a very quick look at that project and they do seem to present the information in a good way. I am also looking at https://github.com/oinanoanalysis/h5oina/blob/master/H5OINAFile.md as an example.

Essentially, we would like someone who does not run our software to be able to read the doc and understand where to find their data in the HDF5 file. If someone really wants to write an HDF5 file that conforms to our spec, the doc could be used for that as well.