Reading variable-length strings into fixed-size char array

Hi,

I'd like to be able to read variable-length strings from an HDF5 file
into a fixed-size array but could not find a way to do it. What I have
is this dataset in the file:

            DATASET "eventcodes" {
               DATATYPE H5T_COMPOUND {
                  H5T_STD_U16LE "code";
                  H5T_STRING {
                     STRSIZE H5T_VARIABLE;
                     STRPAD H5T_STR_NULLTERM;
                     CSET H5T_CSET_ASCII;
                     CTYPE H5T_C_S1;
                  } "desc";
               }

I'd like very much to read this dataset into this in-memory structure:

    struct eventcodes {
        uint16_t code;
        char desc[32];
    };

I know that all strings in the file are shorter than 32 bytes, so this should
be safe to do. When I define this in-memory data type for the structure:

    hid_t tid = H5Tcreate ( H5T_COMPOUND, sizeof(eventcodes) ) ;
    H5Tinsert(tid, "code", 0, H5T_NATIVE_UINT16);
    hid_t strType = H5Tcopy(H5T_C_S1);
    H5Tset_size(strType, 32);
    H5Tinsert(tid, "desc", 2, strType);

and try to read the dataset with it, I get errors from HDF5 about a conversion
problem:

  #000: H5Dio.c line 174 in H5Dread(): can't read data
    major: Dataset
    minor: Read failed
  #001: H5Dio.c line 337 in H5D_read(): unable to set up type info
    major: Dataset
    minor: Unable to initialize object
  #002: H5Dio.c line 836 in H5D_typeinfo_init(): unable to convert between src and dest datatype
    major: Dataset
    minor: Feature is unsupported
  #003: H5T.c line 4449 in H5T_path_find(): no appropriate function for conversion path
    major: Datatype
    minor: Unable to initialize object

I thought that this kind of conversion would be pretty straightforward
and easy to implement, but apparently it's not done. Did I do something
wrong here or is my assumption just incorrect?

I'm using 1.8.6 but would be happy to switch to a newer version if this
conversion is supported in later versions.

Thanks,
Andy

Andrei, knowing that the lengths of all the strings in the file
don't exceed 32 doesn't help. The string type in the file is
variable-length, not fixed-length, and the library doesn't do any kind of
automatic conversion between the two for you. The call

    H5Tset_size(strType, 32);

is well intended but wrong under the circumstances. It ought to be

    H5Tset_size(strType, H5T_VARIABLE);

Have a look at

http://www.hdfgroup.org/ftp/HDF5/examples/examples-by-api/hdf5-examples/1_8/C/H5T/h5ex_t_vlstring.c

Best, G.
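
A note on what switching to H5T_VARIABLE implies for the in-memory structure: the "desc" member then has to be a char * that receives a pointer to library-allocated storage, not an inline char[32]. That changes both the record size and the member offset, so the hard-coded offset 2 from the original snippet would most likely no longer be right. The sketch below is plain C and makes no HDF5 calls; the struct and function names are hypothetical and only illustrate the layout difference. The values it computes are the ones that would be passed to H5Tcreate and H5Tinsert (HDF5's HOFFSET macro serves the same purpose as offsetof).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Fixed-size layout from the original post: the string lives inline. */
struct eventcodes_fixed {
    uint16_t code;
    char     desc[32];
};

/* Layout needed when the memory string type is H5T_VARIABLE:
 * the record holds a pointer, and the string lives elsewhere. */
struct eventcodes_vl {
    uint16_t code;
    char    *desc;
};

/* Offset of "desc" in the fixed-size layout. */
size_t fixed_desc_offset(void)
{
    return offsetof(struct eventcodes_fixed, desc);
}

/* Offset of "desc" in the pointer-based layout; pointer alignment
 * usually inserts padding after the 2-byte "code" member. */
size_t vl_desc_offset(void)
{
    return offsetof(struct eventcodes_vl, desc);
}

/* Record size to use for the compound memory type in the VL case. */
size_t vl_record_size(void)
{
    return sizeof(struct eventcodes_vl);
}

/* Print both layouts so the difference is visible on this platform. */
void print_layouts(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "fixed: size=%zu desc offset=%zu\n",
            sizeof(struct eventcodes_fixed), fixed_desc_offset());
    fprintf(out, "vl:    size=%zu desc offset=%zu\n",
            vl_record_size(), vl_desc_offset());
}
```

Calling print_layouts(stdout) shows the actual numbers for a given compiler and platform; using offsetof (or HOFFSET) instead of literal offsets keeps the memory type in step with the struct either way.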


-----Original Message-----
From: hdf-forum-bounces@hdfgroup.org [mailto:hdf-forum-bounces@hdfgroup.org]
On Behalf Of Salnikov, Andrei A.
Sent: Thursday, February 02, 2012 2:18 PM
To: HDF Users Discussion List
Subject: [Hdf-forum] Reading variable-length strings into fixed-size char
array

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

Hi Gerd,

thanks for the reply. I guess my question was exactly this: why doesn't the
library provide automatic conversion? Is it too difficult or impossible
to implement? I do not think the argument that the user may be unable
to guess the maximum size is a good one. If the user provides a size that is
smaller than the actual string, the library can always generate an error and
let the user correct the situation.

For me, being able to use fixed-size strings on input would be a big
improvement. Handling variable-size strings is more complicated
than handling fixed-size ones: one has to remember to call H5Dvlen_reclaim,
which may not be a trivial thing, as the code which reads the data and
the code that uses/destroys the data may live in separate modules.

Cheers,
Andy
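
Without conversion support in the library, one possible workaround is to read with the variable-length memory type, copy each record into the fixed-size layout right at the read site, and call H5Dvlen_reclaim there too, so that downstream modules only ever see fixed-size records. The sketch below shows only the copy step, in plain C with no HDF5 calls. The names eventcodes_vl, DESC_SIZE and eventcodes_from_vl are hypothetical, and treating a NULL pointer as an empty string is a defensive assumption, not documented library behaviour. The length check gives the "report an error if the string does not fit" behaviour described above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Capacity of the fixed buffer, including the terminating NUL. */
#define DESC_SIZE 32

/* Layout filled by a read that uses a variable-length string member:
 * "desc" points at storage owned by the library until it is reclaimed. */
struct eventcodes_vl {
    uint16_t code;
    char    *desc;
};

/* Layout the rest of the program wants to work with. */
struct eventcodes {
    uint16_t code;
    char     desc[DESC_SIZE];
};

/* Copy one record from the pointer-based layout into the fixed layout.
 * Returns 0 on success, or -1 (leaving *dst untouched) if the string,
 * plus its terminating NUL, does not fit into DESC_SIZE bytes. */
int eventcodes_from_vl(const struct eventcodes_vl *src, struct eventcodes *dst)
{
    assert(src != NULL && dst != NULL);

    /* Assumption: a NULL pointer is treated as an empty string. */
    const char *s = src->desc ? src->desc : "";
    size_t len = strlen(s);

    if (len >= DESC_SIZE)
        return -1;                      /* too long: report, do not truncate */

    dst->code = src->code;
    memset(dst->desc, 0, DESC_SIZE);    /* zero-fill so the result is NUL-padded */
    memcpy(dst->desc, s, len);
    return 0;
}
```

In a reader function this would run in a loop over the buffer returned by H5Dread, followed by the reclaim call on that same buffer, which keeps allocation and cleanup in one module.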


Gerd Heber wrote on 2012-02-05:
