Write data to variable length string attribute

Hi,

I can create a valid variable-length string attribute, and writing data to it produces no errors or exceptions, but when I open the file in HDFView I can see that the attribute contains a Null value.

Could someone explain why I can’t write my data to a var-len-str attribute?

#include "hdf5.h"

#define H5FILE_NAME "Attributes.h5"

#define DATASET_RANK  1   /* Rank and size of the dataset  */
#define DATASET_SIZE  7

#define ANAMES "Character attribute" /* Name of the string attribute */

int main(int argc, char *argv[])
{
  hid_t   file, dataset, attr;        /* File and dataset identifiers */
  hid_t   dataset_space, attr_space;  /* Dataset's and Attribute's dataspace id */
  hid_t   attr_type;                  /* Attribute type */
  herr_t  status;                     /* Return value */

  hsize_t fdim[] = {DATASET_SIZE};
  unsigned char data[] = {'A', 'S', 'D', '\0'};     /* Data to be written to the attribute */


  file = H5Fcreate(H5FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

  dataset_space = H5Screate(H5S_SIMPLE);
  status = H5Sset_extent_simple(dataset_space, DATASET_RANK, fdim, NULL);

  dataset = H5Dcreate2(file, "my_dataset", H5T_NATIVE_INT, dataset_space, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

  attr_space  = H5Screate(H5S_NULL);
  attr_type = H5Tcopy(H5T_C_S1);
  status = H5Tset_size(attr_type, H5T_VARIABLE);
  status = H5Tset_strpad(attr_type, H5T_STR_NULLTERM);
  status = H5Tset_cset(attr_type, H5T_CSET_UTF8);

  attr = H5Acreate2(dataset, ANAMES, attr_type, attr_space, H5P_DEFAULT, H5P_DEFAULT);

  status = H5Awrite(attr, attr_type, data);

  H5Fclose(file);
}

I’ve found that I should have declared the data to write as char* data[1] = {"ASD"};. Now I have working code, but my main question is the following.

If I create a variable-length string attribute and write data to it, then open the file again, delete the attribute, create it again and write the same data to it, the file grows by 4 kilobytes (Windows 10). Here is the code:

#include <filesystem>
#include <iostream>

#include "hdf5.h"

#define H5FILE_NAME "Attributes.h5"
#define ATTR_NAME "VarLenAttr" /* Name of the string attribute */

hid_t createAttr(hid_t& file, hid_t& attr_type){
  herr_t status = H5Tset_size(attr_type, H5T_VARIABLE);

  hid_t attr_space  = H5Screate(H5S_SCALAR);
  return H5Acreate(file, ATTR_NAME, attr_type, attr_space, H5P_DEFAULT, H5P_DEFAULT);
}

int main(int argc, char *argv[])
{
  const char* data[1] = {"ASD"};  // string literals are const in C++

  hid_t file, attr;
  hid_t attr_type = H5Tcopy(H5T_C_S1);

  if( !std::filesystem::exists(H5FILE_NAME) ) {
    // if file is not created we create it and create attribute to the file
    std::cout << "Creating file..." << std::endl;
    
    file = H5Fcreate(H5FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    attr = createAttr(file, attr_type);
  } else {
    // if file already exists then we delete attribute and create it again
    std::cout << "Opening file..." << std::endl;
    
    file = H5Fopen(H5FILE_NAME, H5F_ACC_RDWR, H5P_DEFAULT);
//    attr = H5Aopen(file, ATTR_NAME, H5P_DEFAULT);
    H5Adelete(file, ATTR_NAME);
    attr = createAttr(file, attr_type);
  }

  H5Awrite(attr, attr_type, data);

  H5Fclose(file);
}

To test it, launch the app, check the file size, then run the app again.
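The manual size check can also be scripted. Below is a small sketch that uses only std::filesystem; the names fileSizeOrZero and growthAfter are hypothetical helpers, not part of HDF5 or of the program above.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <system_error>

// Hypothetical helper: size of the file in bytes, or 0 if it does not exist.
std::uintmax_t fileSizeOrZero(const std::filesystem::path& p) {
  std::error_code ec;
  const std::uintmax_t size = std::filesystem::file_size(p, ec);
  return ec ? 0 : size;
}

// Hypothetical helper: runs `step` and returns by how many bytes the file
// grew (negative if it shrank).
template <typename Step>
std::intmax_t growthAfter(const std::filesystem::path& p, Step step) {
  const auto before = static_cast<std::intmax_t>(fileSizeOrZero(p));
  step();
  const auto after = static_cast<std::intmax_t>(fileSizeOrZero(p));
  return after - before;
}
```

The step passed to growthAfter would be the open, delete, create, write and close sequence from the program above; the file should be closed inside the step so that the size on disk is final when it is measured.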

Is this a bug, or does it behave as it should?

I am also interested in doing the same without deleting the attribute, by opening it and overwriting it instead. But I get an error telling me that the data can’t be written because the source and destination datatypes are different.

I have to refresh my memory, but this is the correct behavior for non-fixed-size datatypes. If your attribute were an integer or a fixed-size string, it would end up in the object header and be freed “as expected.” (Can you try that?) Fixed-size attribute values live in the object header (here, that of the root group); values of any other type are stored in a heap.

The H5Awrite should work. Maybe you forgot to call H5Tset_size(attr_type, H5T_VARIABLE)? But even in that case, the file will keep growing.

In either case, you can reclaim that space with h5repack.

Which version of HDF5 are you using?

G.

@gheber thank you for response,

I’ve just tested the same procedure with fixed-size string and in this case the filesize is not growing.

Yes, I should have called H5Tset_size before overwriting the attribute.

And yes again, h5repack frees up the file.

I use HDF5 1.12.0.

By the way, if I do the same with a variable-length string dataset, the file size doesn’t grow. So overwriting a variable-length attribute affects the file size, while overwriting a dataset doesn’t. Do you think this file size growth caused by overwriting attributes will be fixed in the future?

Here is my updated test:

#include <filesystem>
#include <iostream>

#include "hdf5.h"

#define H5FILE_NAME "Attributes.h5"
#define ATTR_NAME "VarLenAttr" /* Name of the string attribute */

hid_t createAttr(hid_t& file, hid_t& attr_type){
  hid_t attr_space  = H5Screate(H5S_SCALAR);
  return H5Acreate(file, ATTR_NAME, attr_type, attr_space, H5P_DEFAULT, H5P_DEFAULT);
}

int main(int argc, char *argv[])
{
  const char* data[1] = {"ASD"};  // string literals are const in C++

  hid_t file, attr;
  hid_t attr_type = H5Tcopy(H5T_C_S1);
  herr_t status = H5Tset_size(attr_type, H5T_VARIABLE);   // for fixed-size string I just replace H5T_VARIABLE with 100

  if( !std::filesystem::exists(H5FILE_NAME) ) {
    // if file is not created we create it and create attribute to the file
    std::cout << "Creating file..." << std::endl;

    file = H5Fcreate(H5FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    attr = createAttr(file, attr_type);
  } else {
    // if file already exists then we overwrite the attribute (or delete and create it again)
    std::cout << "Opening file..." << std::endl;

    file = H5Fopen(H5FILE_NAME, H5F_ACC_RDWR, H5P_DEFAULT);
    attr = H5Aopen(file, ATTR_NAME, H5P_DEFAULT);
    // uncomment this to delete and create attribute again
//    H5Adelete(file, ATTR_NAME);
//    attr = createAttr(file, attr_type);
  }

  H5Awrite(attr, attr_type, data);

  H5Fclose(file);
}

The 4K price tag looks a little hefty indeed. Let me check on that and get back to you. G.

Keep an eye on https://jira.hdfgroup.org/browse/HDFFV-11215 !

Thank you for reporting!

Unfortunately, this behavior is due to the current implementation. I will leave the issue open since some optimization can be done. The issue was updated with this information.
