I use a commercial tool, written in Fortran, that creates an HDF5 results file.
This results file can then be visualized in a related tool written in C++ and Java,
possibly linked to the same Fortran library.
I have reverse-engineered most of the HDF5 format so that I can visualize my own data.
Unfortunately the visualization tool doesn’t display my own data, but gives no error.
According to h5dump, the only systematic difference between the two files is the
STRPAD option of strings used for both attribute and dataset definitions.
The original file format, produced by someone else’s Fortran, has entries like:
DATATYPE H5T_STRING {
   STRSIZE 100;
   STRPAD H5T_STR_SPACEPAD;
   CSET H5T_CSET_ASCII;
   CTYPE H5T_C_S1;
}
and my own file, produced using Python-3.6.8 and h5py-2.9.0, has:
DATATYPE H5T_STRING {
   STRSIZE 100;
   STRPAD H5T_STR_NULLPAD;
   CSET H5T_CSET_ASCII;
   CTYPE H5T_C_S1;
}
The Low-Level API section of the h5py 3.12.1 documentation appears to say that
I could use the set_strpad() method of class h5py.h5t.TypeStringID, but I have
so far failed to work out how.
I experimented with variations of things in the “Strings in HDF5” section of
the online docs above, and also in the “Python and HDF5” book by A. Collette,
but without success, so I temporarily went back to the incorrect but simpler:
grp.attrs['Description'] = numpy.bytes_("%-100s" % "Description")
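To check why this is not enough without going through h5dump, the attribute's datatype can be read back through the low-level API. The following is only a sketch: it assumes the in-memory "core" driver is available, the file name "check.h5" is a placeholder, and the string is encoded explicitly before being passed to numpy.bytes_. It illustrates that padding the Python bytes with spaces changes the data, not the STRPAD recorded for the datatype.

```python
import h5py
import numpy

# In-memory file (nothing is written to disk); "check.h5" is a placeholder name.
with h5py.File("check.h5", "w", driver="core", backing_store=False) as hdf:
    # High-level assignment: the value itself is space padded to 100 bytes...
    padded = ("%-100s" % "Description").encode("ascii")
    hdf.attrs["Description"] = numpy.bytes_(padded)

    # ...but the stored datatype is derived from the NumPy dtype (S100),
    # so read it back with the low-level API to see what was recorded.
    attr_type = h5py.h5a.open(hdf.id, b"Description").get_type()
    attr_size = attr_type.get_size()
    attr_pad = attr_type.get_strpad()

print("size:", attr_size)
print("is NULLPAD:", attr_pad == h5py.h5t.STR_NULLPAD)
print("is SPACEPAD:", attr_pad == h5py.h5t.STR_SPACEPAD)
```

The same get_type() / get_strpad() calls can be pointed at the vendor's file to compare the two formats programmatically.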
Can someone provide an example of the correct way to set the Description
attribute so that it uses H5T_STR_SPACEPAD either via ‘standard’ h5py or
using the low level API?
UPDATE:
I’ve solved most of it, with one remaining problem…
The question above has been “on hold” in Draft mode after the Forum system
suggested that the topic is similar to the following article which I had to check:
As a result, I can replace my old High-Level API code such as the following:
hdf = h5py.File("test.h5", "w")
hdf.attrs['Title'] = numpy.array(numpy.bytes_("%-24s" % "Introduction"), ndmin=1)
with the Low-Level API code at the start of the following and get the STR_SPACEPAD
for the “Title” attribute, but I still get STR_NULLPAD for the fields of the Compound Type:
import h5py
import numpy

hdf = h5py.File("testing.h5", "w")
ascii24 = h5py.h5t.TypeID.copy(h5py.h5t.C_S1)
ascii24.set_size(24)
ascii24.set_strpad(h5py.h5t.STR_SPACEPAD)
dataspace = h5py.h5s.create_simple((1, ), (1, ))
attribute = h5py.h5a.create(hdf.id, "Title".encode("ascii"), ascii24, dataspace)
attribute.write(numpy.array(numpy.bytes_("%-24s" % "Introduction")))
person_type = h5py.h5t.create(h5py.h5t.COMPOUND, 48)
person_type.insert("firstName".encode("ascii"), 0, ascii24)
person_type.insert("lastName".encode("ascii"), 24, ascii24)
people = hdf.create_dataset("people", shape=(1,), maxshape=(1,), dtype=person_type)
people[0, "firstName"] = numpy.array(numpy.bytes_("%-24s" % "Abraham"))
people[0, "lastName"] = numpy.array(numpy.bytes_("%-24s" % "Lincoln"))
which results in the following h5dump testing.h5 listing:
HDF5 "testing.h5" {
GROUP "/" {
   ATTRIBUTE "Title" {
      DATATYPE H5T_STRING {
         STRSIZE 24;
         STRPAD H5T_STR_SPACEPAD; // This is what I want
         CSET H5T_CSET_ASCII;
         CTYPE H5T_C_S1;
      }
      DATASPACE SIMPLE { ( 1 ) / ( 1 ) }
      DATA {
         (0): "Introduction "
      }
   }
   DATASET "people" {
      DATATYPE H5T_COMPOUND {
         H5T_STRING {
            STRSIZE 24;
            STRPAD H5T_STR_NULLPAD; // This should be H5T_STR_SPACEPAD
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         } "firstName";
         H5T_STRING {
            STRSIZE 24;
            STRPAD H5T_STR_NULLPAD; // This should be H5T_STR_SPACEPAD
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         } "lastName";
      }
      DATASPACE SIMPLE { ( 1 ) / ( 1 ) }
      DATA {
         (0): {
            "Abraham ",
            "Lincoln "
         }
      }
   }
}
}
Where am I still going wrong when creating or assigning the Compound Type fields?
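One possible explanation, offered as an assumption and not a confirmed diagnosis: the high-level create_dataset() appears to convert its dtype argument to a NumPy dtype, and NumPy has no notion of STRPAD, so the compound type would be rebuilt with h5py's default H5T_STR_NULLPAD strings. A minimal sketch that avoids this round trip creates the dataset with the low-level h5py.h5d.create(), passing the TypeID directly. The file name "sketch.h5" and the in-memory driver are placeholders for illustration only.

```python
import h5py
import numpy

# Fixed-length, space-padded ASCII string type, as used for the "Title" attribute.
ascii24 = h5py.h5t.C_S1.copy()
ascii24.set_size(24)
ascii24.set_strpad(h5py.h5t.STR_SPACEPAD)

# Compound type with two 24-byte string members.
person_type = h5py.h5t.create(h5py.h5t.COMPOUND, 48)
person_type.insert(b"firstName", 0, ascii24)
person_type.insert(b"lastName", 24, ascii24)

# In-memory file for illustration; "sketch.h5" is a placeholder name.
with h5py.File("sketch.h5", "w", driver="core", backing_store=False) as hdf:
    space = h5py.h5s.create_simple((1,), (1,))
    # Low-level creation passes person_type straight through as the file datatype.
    dset_id = h5py.h5d.create(hdf.id, b"people", person_type, space)

    # The in-memory data is an ordinary NumPy structured array; HDF5 converts
    # it to the file datatype when writing.
    record = numpy.array(
        [(b"Abraham", b"Lincoln")],
        dtype=[("firstName", "S24"), ("lastName", "S24")],
    )
    dset_id.write(h5py.h5s.ALL, h5py.h5s.ALL, record)

    # Read the member datatypes back to see which padding was stored.
    file_type = dset_id.get_type()
    member_pads = [
        file_type.get_member_type(i).get_strpad()
        for i in range(file_type.get_nmembers())
    ]

print(all(pad == h5py.h5t.STR_SPACEPAD for pad in member_pads))
```

If the high-level interface is still wanted afterwards, wrapping the identifier with h5py.Dataset(dset_id) while the file is open should give a regular Dataset object; running h5dump on a file written to disk this way would be the way to confirm the result against the visualization tool.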