Writing compound datatype with internal vector

Hello!

I’m working on writing a compound datatype to an HDF5 dataset using C++. My struct includes an internal std::vector whose length is consistent across all instances, but can vary depending on user input.

My struct looks something like this:
struct CompDT
{
    std::vector<uint64_t> fieldA;
    uint64_t fieldB;
    uint64_t fieldC;
};

I have seen that HDF5 doesn’t support variable-length arrays inside compound types directly, so I am wondering how others approach this. For context, I am using the H5Cpp API.

HDF5 supports variable-length datatypes within compounds, but the HDF5 C++ API does not support using std::vector in this way. This is because the data is handled internally by the C library, and the C++ API does not attempt to convert the raw data in the write buffer into a format the C library recognizes. Putting it all together, I think something like this will work (I am not familiar with the C++ API, so it’s possible I bungled something):

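#include <H5Cpp.h>
#include <cstddef>  // offsetof
#include <cstdint>  // uint64_t

struct CompDT
{
    hvl_t fieldA;  // variable-length member: { size_t len; void *p; }
    uint64_t fieldB;
    uint64_t fieldC;
};

// In the C++ API the predefined types are named H5::PredType::NATIVE_*,
// and compound members are added with CompType::insertMember.
// Some older HDF5 releases may only accept a pointer here: vltype(&H5::PredType::NATIVE_UINT64).
H5::VarLenType vltype(H5::PredType::NATIVE_UINT64);
H5::CompType comptype(sizeof(CompDT));
comptype.insertMember("fieldA", offsetof(CompDT, fieldA), vltype);
comptype.insertMember("fieldB", offsetof(CompDT, fieldB), H5::PredType::NATIVE_UINT64);
comptype.insertMember("fieldC", offsetof(CompDT, fieldC), H5::PredType::NATIVE_UINT64);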

Then when initializing the data:

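struct CompDT wbuf[100];
std::vector<uint64_t> mystdvector ...;
...
// hvl_t only stores a length and a pointer, so it borrows the vector's storage:
// mystdvector must stay alive and unmodified until the write has completed.
wbuf[i].fieldA.len = (size_t)mystdvector.size();
wbuf[i].fieldA.p = mystdvector.data();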
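
For the case where every record has its own vector, the buffer-filling step above might be sketched as follows. This is only an illustration under some assumptions: VlenRef is a stand-in that mirrors the two members of hvl_t (len and p) so the snippet compiles without HDF5 installed, and make_write_buffer and the placeholder values for fieldB and fieldC are hypothetical names and data, not something from the posts above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in with the same two members as HDF5's hvl_t (assumption made so the
// sketch builds without the library). With HDF5, include <H5Cpp.h> and use
// hvl_t directly.
struct VlenRef
{
    std::size_t len;  // number of elements
    void *p;          // pointer to the first element
};

struct CompDT
{
    VlenRef fieldA;
    uint64_t fieldB;
    uint64_t fieldC;
};

// Hypothetical helper: builds one write-buffer record per source vector.
// The records only borrow the vectors' storage, so `source` must stay alive
// and unmodified until the write has finished.
std::vector<CompDT> make_write_buffer(std::vector<std::vector<uint64_t>> &source)
{
    std::vector<CompDT> wbuf(source.size());
    for (std::size_t i = 0; i < source.size(); ++i)
    {
        // The question states that all vectors share one length; check it.
        assert(source[i].size() == source[0].size());

        wbuf[i].fieldA.len = source[i].size();
        wbuf[i].fieldA.p = source[i].data();
        wbuf[i].fieldB = 0;  // placeholder value
        wbuf[i].fieldC = 0;  // placeholder value
    }
    return wbuf;
}
```

With the real library, the records could then presumably be written with something like dataset.write(wbuf.data(), comptype), assuming dataset is an H5::DataSet created with that compound type. The source vectors have to outlive that call, since the buffer holds pointers into them rather than copies of the elements.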