Write variable length UTF8 string attribute with Fortran API


#1

As the title says, I am trying to write a variable-length UTF-8 string attribute using only the HDF5 module, with no C pointers. I would like to end up with the following HDF5 file:

$> h5dump test.hdf5
HDF5 "test.hdf5" {
GROUP "/" {
   ATTRIBUTE "schema_name" {
      DATATYPE  H5T_STRING {
         STRSIZE H5T_VARIABLE;
         STRPAD H5T_STR_NULLTERM;
         CSET H5T_CSET_UTF8;
         CTYPE H5T_C_S1;
      }
      DATASPACE  SCALAR
      DATA {
      (0): "schema1"
      }
   }
}
}

This is my attempt:

program to_hdf5
  use hdf5
  implicit none
  character(len=100) :: filename
  integer :: error
  INTEGER(KIND=hid_t) :: file_id, space_id, attr_id, group_id, str_type
  integer(hsize_t), dimension(1) :: dims = (/10/) ! this is what I am not sure about

  call h5open_f(error)

  filename = 'test.hdf5'
  CALL h5fcreate_f(filename, h5f_acc_trunc_f, file_id, error)

  ! define the data type
  CALL H5Tcopy_f(H5T_STRING, str_type, error)
  CALL H5Tset_strpad_f(str_type, H5T_STR_NULLTERM_F, error)

  ! create a scalar dataspace for attribute
  CALL h5screate_f(h5s_scalar_f, space_id, error)

  ! create and write the attribute
  CALL h5acreate_f(file_id, 'schema_name', str_type, space_id, attr_id, error)    
  CALL h5awrite_f(attr_id, str_type, 'schema1', dims, error)    
    
  CALL h5aclose_f(attr_id, error)
  CALL h5sclose_f(space_id, error)
                      
  CALL h5fclose_f(file_id, error)
  CALL h5close_f(error)

end program to_hdf5   

This compiles but crashes with a segmentation fault. I think the problem is either the dims array or the H5T_STRING type. If I pass H5T_FORTRAN_S1 to H5Tcopy_f instead, it works, but it writes only the first character of the string. The result is also fixed-size and ASCII, whereas I would like it to be UTF-8 and variable length.

Any suggestions?
Thanks


#2

Not sure if this is going to help or confuse, but I’ve recently had the same question for C#. The question with an example solution is here: How to create scalar variable length string


#3

First, if you plan on using variable length, you need to use the vlen type (h5tvlen_create_f). There are plenty of examples of how to do this here: https://github.com/HDFGroup/hdf5-examples/tree/master/FORTRAN/H5T.

The issue is that we don’t have an h5awrite_vl_f equivalent of h5dwrite_vl_f, because the way to do this for attributes is to call h5awrite_f and pass the C address of the buffer.

If you can’t use C_LOC, then you will need to use a dataset instead of an attribute.


#4

Hi @battaglia1803,

Not sure how to solve the use-case you described with the HDF5 module but with HDFql it could be solved as follows in Fortran:

PROGRAM Example
    USE HDFql
    INTEGER :: state
    state = hdfql_execute("CREATE FILE test.hdf5")
    state = hdfql_execute("CREATE ATTRIBUTE test.hdf5 schema_name AS UTF8 VARCHAR VALUES(schema1)")
END PROGRAM

Hope it helps!


#5

OK, so it seems that this is not possible without ISO_C_BINDING. Here is my working solution:

PROGRAM to_hdf5    
    
  USE hdf5    
  USE iso_c_binding    
    
  implicit none    
      
  character(len=100) :: filename    
  integer :: error    
  integer(kind=hid_t) :: file_id, space_id, attr_id, group_id, str_type    
  integer(kind=size_t) :: attr_size    
  type(c_ptr) :: buffer      ! what is passed to h5awrite    
  ! required "insulation layer" for vlen string to scalar dataspace    
  type(c_ptr), target :: vlen_to_scalar     
  character(len=8), target :: string = 'schema1' // C_NULL_CHAR  ! NUL-terminated, as the C library expects

  CALL h5open_f(error)
  
  filename = 'test.hdf5'
  CALL h5fcreate_f(filename, h5f_acc_trunc_f, file_id, error)                                             

  ! define the data type
  CALL h5tcopy_f(H5T_STRING, str_type, error)
  CALL h5tset_cset_f(str_type, H5T_CSET_UTF8_F, error)
  CALL h5tset_strpad_f(str_type, H5T_STR_NULLTERM_F, error)

  ! create a scalar dataspace for the attribute
  CALL h5screate_f(h5s_scalar_f, space_id, error)

  vlen_to_scalar = C_LOC(string)
  buffer = C_LOC(vlen_to_scalar) 
  CALL h5acreate_f(file_id, 'schema_name', str_type, space_id, attr_id, error)
  CALL h5awrite_f(attr_id, str_type, buffer, error)

  CALL h5aclose_f(attr_id, error)
  CALL h5sclose_f(space_id, error)
  CALL h5fclose_f(file_id, error)

  CALL h5close_f(error)

END PROGRAM to_hdf5 

I think the same thing happens here as for @philip.lee, with the explanation given in the other post. Somehow I need to call C_LOC twice, which looks odd but works.
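
One way to read the double C_LOC: in memory, each element of a variable-length string type is a C pointer to NUL-terminated characters (a `char *`), and the write buffer is the address of the first element. For a scalar dataspace there is a single element, so the buffer ends up being the address of a pointer. The sketch below tries to show just that indirection with ISO_C_BINDING and no HDF5 calls at all. The module name `vlen_buffer_sketch` and the function `roundtrip_text` are made up for illustration and are not part of the HDF5 API.

```fortran
module vlen_buffer_sketch
  use iso_c_binding
  implicit none
contains

  ! Hypothetical helper, not part of the HDF5 API: builds a two-level
  ! buffer with two C_LOC calls, then follows it back the way a C reader
  ! of a "char **" would.
  function roundtrip_text() result(text)
    character(kind=c_char, len=:), allocatable :: text
    character(kind=c_char, len=8), target :: string
    type(c_ptr), target  :: element       ! plays the role of one "char *"
    type(c_ptr)          :: buffer        ! plays the role of the "char **"
    type(c_ptr), pointer :: element_view
    character(kind=c_char), pointer :: chars(:)
    integer :: i, n

    string  = 'schema1' // c_null_char    ! NUL marks the end for C code
    element = c_loc(string)               ! first C_LOC: address of the characters
    buffer  = c_loc(element)              ! second C_LOC: address of that pointer

    ! Undo the two levels of indirection
    call c_f_pointer(buffer, element_view)
    call c_f_pointer(element_view, chars, [len(string)])

    ! Count the characters that precede the NUL terminator
    n = 0
    do i = 1, size(chars)
      if (chars(i) == c_null_char) exit
      n = n + 1
    end do

    ! Copy them into an ordinary Fortran string
    allocate(character(kind=c_char, len=n) :: text)
    do i = 1, n
      text(i:i) = chars(i)
    end do
  end function roundtrip_text

end module vlen_buffer_sketch
```

Called from any program that uses the module, `print '(a)', roundtrip_text()` should give back the original text. A C consumer finds the end of each string by looking for the NUL byte, which is why the sketch appends `c_null_char` to the characters.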