Hi,
I’m having trouble reading my data in pandas, so I’m trying to replicate an H5 structure that pandas can read.
At the moment pandas reads my strings like this: b"test\x00\x00\x00\x00'\x0bv\xdbR\xa5… I don’t want the padding; I just want to see 'test'.
My current setup with the C lib is this:
hid_t atype = H5Tcopy(H5T_C_S1);
H5Tset_size(atype, 36);
H5Tset_strpad(atype, H5T_STR_NULLTERM);
H5Tset_cset(atype, H5T_CSET_ASCII);
with the H5 structure looking like this:
DATASET "testdata" {
   DATATYPE H5T_COMPOUND {
      H5T_STRING {
         STRSIZE 36;
         STRPAD H5T_STR_NULLTERM;
         CSET H5T_CSET_ASCII;
         CTYPE H5T_C_S1;
      } "name";
   }
   DATASPACE SIMPLE { ( 1 ) / ( 1 ) }
   DATA {
   (0): "test", "one", "two", "three"
   }
}
However, I want the H5 structure to look like this, which pandas can interpret and display without padding:
DATASET "testdata" {
   DATATYPE H5T_STRING {
      STRSIZE 36;
      STRPAD H5T_STR_NULLTERM;
      CSET H5T_CSET_ASCII;
      CTYPE H5T_C_S1;
   }
   DATASPACE SIMPLE { ( 1 ) / ( 1 ) }
   DATA {
   (0): "test", "one", "two", "three"
   }
}
Does anyone know how I can set up my H5 structure to mimic this? It would appear that the string type is its own datatype, rather than a member of a compound. How would I do that with the library?
All the best
/P