Need clarification of how to read variable length strings in C code

Looking at the H5Dread API docs, it isn’t clear to me whether the caller should allocate the memory for the buffer. The docs show only one example, for an integer data type, and there the caller does create the memory buffer before making the call. To see the example I am referring to, open the following link to the API docs and scroll to H5Dread().

HDF5: Datasets (H5D)

Looking at other examples I have found, and asking AI tools to generate code snippets, I’ve seen evidence suggesting that I need to preallocate memory for variable length strings, as this example shows.

hdf5/HDF5Examples/C/H5T/h5ex_t_string.c at develop · HDFGroup/hdf5 · GitHub

However, in practice it seems like the HDF5 library is allocating memory when reading variable length strings. I’ve just confirmed in some of our existing code, while debugging, that the memory is in fact created within the library code. Why does that h5ex example code show that memory needs to be allocated when the library seems to do that?

The statement “H5Dread() reads a dataset, specified by its identifier dset_id, from the file into an application memory buffer buf.” seems to imply that the application creates the memory buffer, but I don’t think it is that simple. The behavior seems to vary depending on the data type.

This is just a crude example, but it seems to work and I can see in the debugger that memory was allocated by the library. So what is the correct way to deallocate the memory when the library allocated it?

hid_t datatype_id = H5Dget_type(dataset_id);
H5T_class_t class_type = H5Tget_class(datatype_id);
if (H5T_STRING != class_type)
{
   std::cout << "dataset is not H5T_STRING."; 
   return -1;
}
else
{
   hsize_t dataset_size = H5Dget_storage_size(dataset_id);
}

htri_t is_variable = H5Tis_variable_str(datatype_id);

if (is_variable)
{
   // H5T_STRING is variable length string
   char* buffer[1] = { NULL };

   herr_t h5d_read_err = H5Dread(dataset_id, datatype_id,
      H5S_ALL, H5S_ALL, H5P_DEFAULT, buffer);
   // at this point, buffer[0] appears to point to a valid object, so the library created the memory and filled it
   if (0 > h5d_read_err)
   {
      return -1;
   }
   else if (NULL == buffer[0])  // seems to work because this is not null and HDF5 library does seem to allocate it
   {
      return -1;
   }
   else
   {
      // do something with data or copy into another object
   }
   delete [] (buffer[0]);  // Is there an API call to properly release the memory? How do I know what method the API used to create memory?
   return 0;
}

Two cases:

  1. Variable-length strings: the library handles (allocation and) deallocation.
  2. Arbitrary variable-length datatypes: (the library handles allocation) call H5Treclaim

OK?