Adding Multiple Column headers to H5TB

How do I add multiple column headers to an HDF5 Table so that they are rendered by HDFView?

For example, I want to have field names as well as field units.
I can add attributes using H5LTset_attribute_string. But is there a way to add them such that they are reflected in the column header in HDFView?

#include "hdf5.h"
#include "hdf5_hl.h"
#include "mpi.h"

#include <cassert>
#include <string>

int main()
{
  herr_t err;
  hid_t file_id = H5Fcreate("foo.out",H5F_ACC_TRUNC,H5P_DEFAULT,H5P_DEFAULT);
  constexpr int COLUMN_NAME_WIDTH = 10; 
  typedef struct{
    double time;
    char   foo1[COLUMN_NAME_WIDTH];
    int    foo2;
  }Data;
  constexpr int NROWS = 5;
  constexpr int NCOLS = 3;

  size_t dataset_size = sizeof(Data);
  size_t dataset_offset[NCOLS] = { // one offset per field (column), not per row
    HOFFSET(Data,time),
    HOFFSET(Data,foo1),
    HOFFSET(Data,foo2)
  };
  const char *field_names[NCOLS] = {"Time","foo1","foo2"};
  const char *field_units[NCOLS] = {"(s)","(none)","(K)"};
  hid_t H5T_CUSTOM_STRING = H5Tcopy(H5T_C_S1);
  err = H5Tset_size(H5T_CUSTOM_STRING,COLUMN_NAME_WIDTH);
  hid_t field_types[NCOLS] = {H5T_NATIVE_DOUBLE,H5T_CUSTOM_STRING,H5T_NATIVE_INT};
  Data data[NROWS] = {
    {0.0,"Haha0",0},
    {0.1,"Haha1",1},
    {0.2,"Haha2",2},
    {0.3,"Haha3",3},
    {0.4,"Haha4",4}
  };
  size_t chunk_size = NROWS; // Chunk size, in number of table records (rows)
  Data fill_data[1] = {{1.,"HahaN",__INT_MAX__}};//optional input
  int compress = 0;//Enable/Disable compression
  //Create the table
  err = H5TBmake_table(
      "Table Title",//Optional attribute
      file_id,
      "Table Name",
      NCOLS,
      NROWS,
      dataset_size,
      field_names,
      dataset_offset,
      field_types,
      chunk_size,
      fill_data,//Optional attribute
      compress,
      data
   );
  //Note : "Table_Title"(optional),field_names and fill_data(optional) are attributes
  assert(err==0 && "Table creation failed");
    
  for(size_t i = 0; i < NCOLS; ++i) {
    std::string attr_name = std::string("UNIT_") + field_names[i]; 
    err = H5LTset_attribute_string(file_id, "Table Name", attr_name.c_str(), field_units[i]);
    assert(err == 0 && "Unable to set units attribute");
  }

  //Close File
  err = H5Fclose(file_id);
  assert(err==0 && "Unable to close file");
  return 0;
}

Hi! I have no idea but I’m going to try with your code to see. BTW, does it need to be H5TB, instead of the C library?

I don’t think that attributes of the form UNIT_<field name> are part of the HDF5 Table specification (link).

Thanks…
I could do it as a normal dataset in the C library. I was more interested in whether it could be rendered as a table in HDFView, and whether I could use the Python H5TB interface to parse it.

Yes… I was hoping for a hack to have multiple headers using the attributes available in the specification

It is reasonable to want to extend HDFView to display multiple column header lines. Your use case for units is very appropriate. However, I agree with @ajelenak: this is not in the current HDF5 Table specs.

I took a quick look at the HDFView code supporting table view. Support for even the single header line is extensive. I see roughly how new code for multiple headers could be added, but it would be a pile of work.

The specs are the easy part. I can imagine simple schemes to generically support one or more extra header lines, patterned somewhat after what is done now for the single header line.

The specs may be an easy part but they take time to adopt. So perhaps we can build on the momentum here and make some progress…

As for recording physical units information, I think the H5TB attributes of this form are acceptable: FIELD_<n>_UNITS. For example: FIELD_0_UNITS = "s", or FIELD_2_UNITS = "K".
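To make the proposed naming concrete, here is a minimal sketch that builds the attribute name/value pairs for the FIELD_<n>_UNITS convention. The helper name make_units_attributes is hypothetical, and the convention itself is only the proposal from this thread, not part of the current HDF5 Table specification. A null entry is assumed to mean "no units recorded for this column".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of H5TB): builds (attribute name, value)
// pairs following the FIELD_<n>_UNITS convention proposed in this thread.
std::vector<std::pair<std::string, std::string>>
make_units_attributes(const char* const* field_units, std::size_t ncols)
{
  assert(field_units != nullptr && "field_units must point to ncols entries");
  std::vector<std::pair<std::string, std::string>> attrs;
  for (std::size_t i = 0; i < ncols; ++i) {
    // Assumption: a null entry means the column has no units; skip it.
    if (field_units[i] == nullptr) continue;
    attrs.emplace_back("FIELD_" + std::to_string(i) + "_UNITS",
                       field_units[i]);
  }
  return attrs;
}
```

Each resulting pair could then be written to the table dataset with H5LTset_attribute_string(file_id, "Table Name", name.c_str(), value.c_str()), in the same way as the loop in the original post. HDFView would still not display these attributes as header lines unless it were extended to do so.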

What other new information would be desirable for each column?

I suggest that for added header lines, make them fully generic from the HDFView point of view. Do not try to foresee or dedicate special purposes such as “units”. Provide a numbering scheme such that users can arbitrarily add e.g. FIELD_1_HEADER_2, FIELD_4_HEADER_7, etc. Also provide an associating convention such as HEADER_2_LABEL = "units", HEADER_3_LABEL = "limit", etc.

The top header line is already special, the field name. So just leave it that way, for backward compatibility.

The developer said this will be added to the enhancement list.

Here is an alternative generic scheme to my earlier suggestion, FIELD_<n>_HEADER_<n>. That earlier scheme is not very human readable or expressive.

How about FIELD_<n>_<label>, as suggested by @ajelenak, but the labels are user specified except for the original FIELD_<n>_NAME. Then the display order of header lines is user specified by HEADER_<n> = "label".

For example, borrowing from above:

FIELD_0_UNITS = "s"
FIELD_2_UNITS = "K"
HEADER_1 = "NAME"
HEADER_2 = "UNITS"
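As a rough illustration of how a viewer might interpret this scheme, the sketch below resolves header rows from a table's string attributes. Everything here is an assumption for illustration: the function name resolve_header_rows, the use of an in-memory map in place of real HDF5 attribute reads, the choice to stop at the first missing HEADER_<n>, and the choice to leave a cell empty when a field has no attribute for a given label. It is not how HDFView works today.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the string attributes attached to a table dataset.
using Attributes = std::map<std::string, std::string>;

// Hypothetical reader-side helper: returns one row of header cells per
// HEADER_<n> attribute, in the order HEADER_1, HEADER_2, ...
std::vector<std::vector<std::string>>
resolve_header_rows(const Attributes& attrs, std::size_t ncols)
{
  assert(ncols > 0 && "a table needs at least one column");
  std::vector<std::vector<std::string>> rows;
  for (std::size_t h = 1;; ++h) {
    // Assumption: header numbering is contiguous; stop at the first gap.
    auto label = attrs.find("HEADER_" + std::to_string(h));
    if (label == attrs.end()) break;
    std::vector<std::string> row(ncols); // cells default to empty strings
    for (std::size_t f = 0; f < ncols; ++f) {
      auto cell = attrs.find("FIELD_" + std::to_string(f) + "_" +
                             label->second);
      if (cell != attrs.end()) row[f] = cell->second;
    }
    rows.push_back(std::move(row));
  }
  return rows;
}
```

With the attributes from the example above plus the existing FIELD_<n>_NAME attributes, this would yield a NAME row followed by a UNITS row, with an empty units cell for foo1. Keeping FIELD_<n>_NAME as just another label means the existing top header line needs no special handling in this sketch.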