Using HDFView to build a new sample HDF5 file

Hi! I’m new to HDF5 and HDFView. I’m building a sample HDF5 file in HDFView to hand off to my development team, and I need to embed some existing files as datasets within it. Is there a way to import an existing JSON, XML, etc. file as a dataset, or is a link the only way to attach existing files?

Melissa, welcome, you have at least two options. The first is straightforward, the second is a little more advanced (i.e., not suitable for beginners):

  1. Assuming you have your JSON/XML/… document loaded into a string variable or a byte array, you can save it as the value of a (scalar) HDF5 dataset or attribute. The easiest way to do this is with h5py. Have a look at the many examples in the book “Python and HDF5.” I’ve seen folks store pickled Python objects, certificates, or even executables that way.

  2. An HDF5 file can have a so-called user block: space at the beginning of the file that you can set aside for your own purposes and that the HDF5 library ignores. You can use the HDF5 command line utilities h5jam and h5unjam (see https://portal.hdfgroup.org/display/HDF5/h5jam+-+h5unjam) for that. If you have multiple files, you could, for example, zip or tar them, and h5jam that archive into the HDF5 file.
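
Here is a minimal sketch of both options in Python, assuming h5py and NumPy are installed. The function names, dataset names, and file paths are made up for illustration. Note that for option 2 the sketch does not call h5jam; it uses h5py's userblock_size argument to reserve the user block and then writes the bytes with ordinary file I/O, which is a stand-in for what h5jam does from the command line.

```python
import h5py
import numpy as np


def embed_text(h5_path, dataset_name, text):
    """Option 1a: store a text document (JSON, XML, ...) as a scalar string dataset."""
    with h5py.File(h5_path, "a") as f:
        # h5py stores a Python str as a variable-length UTF-8 string.
        f.create_dataset(dataset_name, data=text)


def embed_bytes(h5_path, dataset_name, payload):
    """Option 1b: store arbitrary bytes as a scalar opaque dataset."""
    with h5py.File(h5_path, "a") as f:
        # np.void keeps the bytes as-is (no string conversion, no NUL trimming).
        f.create_dataset(dataset_name, data=np.void(payload))


def read_text(h5_path, dataset_name):
    with h5py.File(h5_path, "r") as f:
        value = f[dataset_name][()]
    # Depending on the h5py version, strings come back as bytes or str.
    return value.decode("utf-8") if isinstance(value, bytes) else value


def read_bytes(h5_path, dataset_name):
    with h5py.File(h5_path, "r") as f:
        return f[dataset_name][()].tobytes()


def create_with_user_block(h5_path, payload, block_size=512):
    """Option 2: create a new HDF5 file with a user block and write payload into it.

    block_size must be a power of two and at least 512.
    """
    if len(payload) > block_size:
        raise ValueError("payload does not fit in the user block")
    # Creating the file reserves block_size bytes before the HDF5 data.
    with h5py.File(h5_path, "w", userblock_size=block_size):
        pass
    # The HDF5 file must be closed before the user block is written.
    with open(h5_path, "r+b") as raw:
        raw.write(payload)
```

For example, create_with_user_block("sample.h5", b"my header") followed by embed_text("sample.h5", "config_json", open("config.json").read()) would give you one file with both the user block and the embedded document. The datasets created this way appear as ordinary objects when the file is opened in HDFView.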

Best, G.

G, thank you for giving me some options. The path I was going down was to find the I/O modules for XML and JSON and add them to HDFView. Getting those jar files and adding them to the HDFView library would allow me to register those file formats in HDFView. Then I could create objects with those file formats and import the string or text of the existing files into the created objects.
Thanks again for your help!

R, Melissa