Defining and retrieving structured data?


I have been using HDF5 on and off, and am currently returning to it for a computational geometry project.

I am looking for the correct terminology for a feature that I believe exists, but my searches have come up empty.

I’d like to store a structure of information like this:

  face_index (int32)
  centroid (float32[2])

I have no problem storing an array of indices and a separate array of positions.

But I recall there was some structure mechanism that allows us to retrieve a whole element of a structure.

Or maybe my memory is failing me.

Thank you in advance.


Hi @yue.nicholas,

What you are looking for is called a compound data type in HDF5. It is meant to store structured data just like the one you have posted.

As an example, storing structured data in C using HDFql (a high-level declarative language that simplifies handling HDF5) could be done as follows:

#include <stddef.h>   // offsetof
#include <stdio.h>    // sprintf
#include "HDFql.h"

// declare structure
struct my_data
{
   int face_index;
   float centroid[2];
};

// declare variables
struct my_data data;
char script[1024];

// populate variable 'data' with some dummy values
data.face_index = 10;
data.centroid[0] = 15.2;
data.centroid[1] = 17.4;

// create an HDF5 file named 'test.h5' and use (i.e. open) it
hdfql_execute("CREATE AND USE FILE test.h5");

// register variable 'data' for subsequent use by HDFql
hdfql_variable_register(&data);

// prepare script to create a compound dataset named 'dset' which stores the values of variable 'data'
snprintf(script, sizeof(script), "CREATE DATASET dset AS COMPOUND(face_index AS INT OFFSET %zu, centroid AS FLOAT(2) OFFSET %zu) VALUES FROM MEMORY 0", offsetof(struct my_data, face_index), offsetof(struct my_data, centroid));

// execute script
hdfql_execute(script);

Besides C, HDFql supports C++, C#, Python, Java, Fortran and R.
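
To make the OFFSET clauses in the script above more concrete, here is a minimal sketch in plain C (no HDFql calls) that only formats the statement, so the resulting text can be inspected. The helper name build_script is hypothetical and not part of HDFql; the statement text and the hard-coded memory number 0 are taken from the snippet above.

```c
#include <stddef.h>
#include <stdio.h>

/* Same record layout as in the snippet above. */
struct my_data
{
   int face_index;
   float centroid[2];
};

/* Hypothetical helper (not part of HDFql): writes the CREATE DATASET
   statement into 'script' and returns what snprintf returned, i.e. the
   number of characters the full statement needs. */
int build_script(char *script, size_t size)
{
   return snprintf(script, size,
      "CREATE DATASET dset AS COMPOUND(face_index AS INT OFFSET %zu, "
      "centroid AS FLOAT(2) OFFSET %zu) VALUES FROM MEMORY 0",
      offsetof(struct my_data, face_index),
      offsetof(struct my_data, centroid));
}
```

The first member of a C struct always sits at offset 0. The offset of centroid depends on the platform's size and alignment of int and float, which is why the snippet asks the compiler via offsetof instead of hard-coding a number.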


Great example, Mr. HDFql! Can you attach the h5dump output for everyone’s benefit?

The elements of the dataset created are of an HDF5 compound datatype. This is just another way of saying that you are dealing with records. All records of a given type have a set of named user-visible fields. With HDF5, you can read or write subsets of fields (partial records) over the entire dataset or just parts of it.
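
As a rough in-memory illustration of what reading a partial record means, the sketch below copies a single field out of an array of records into a packed buffer. It is plain C and does not call the HDF5 library; gather_field is a hypothetical helper written for this post. In HDF5 itself, the equivalent request is made by reading with a memory compound datatype that names only the fields you want.

```c
#include <stddef.h>
#include <string.h>

/* Same record layout as in the example earlier in the thread. */
struct my_data
{
   int face_index;
   float centroid[2];
};

/* Hypothetical helper: for each of 'count' records of 'record_size' bytes,
   copy the 'field_size' bytes found at 'field_offset' into 'out', so that
   'out' ends up holding the selected field of every record back to back. */
void gather_field(const void *records, size_t record_size, size_t count,
                  size_t field_offset, size_t field_size, void *out)
{
   const unsigned char *src = (const unsigned char *)records + field_offset;
   unsigned char *dst = out;

   for (size_t i = 0; i < count; i++)
   {
      memcpy(dst, src, field_size);
      src += record_size;   /* advance to the same field in the next record */
      dst += field_size;    /* output is tightly packed */
   }
}
```

For example, given struct my_data recs[3] and float centroids[3][2], calling gather_field(recs, sizeof recs[0], 3, offsetof(struct my_data, centroid), sizeof recs[0].centroid, centroids) would fill centroids with just the centroid field of each record and leave face_index behind.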



Sure! Here is the output of running h5dump against file test.h5 (which was generated by the C code snippet above):

HDF5 "test.h5" {
GROUP "/" {
   DATASET "dset" {
         H5T_STD_I32LE "face_index";
         H5T_ARRAY { [2] H5T_IEEE_F32LE } "centroid";
      DATA {
      (0): {
            [ 15.2, 17.4 ]