String and attribute lengths

Hello all,

     A couple of questions (using hdf5 1.8.4 on linux with g++).
First, I'm wondering if you can create an unlimited array of strings.
I've modified the hdf5 example codes to generate a file which
contains something like:

      ATTRIBUTE "A1" {
         DATATYPE H5T_STRING {
               STRSIZE 7;
               STRPAD H5T_STR_SPACEPAD;
               CSET H5T_CSET_ASCII;
               CTYPE H5T_C_S1;
            }
         DATASPACE SIMPLE { ( 4 ) / ( H5S_UNLIMITED ) }
         DATA {
         (0): "Parting", "is such", "sweet ", "sorrow."
         }
      }

However, this isn't exactly what I want, since the datatype
specifies 7-character strings, and I'd prefer that neither the
length of the string array nor the length of any individual
string be limited. I have a feeling that the answer to my
question is that trying to store an unlimited array of
unlimited-length strings is a bad idea, because one would have
to enable chunking for each element in the string array
separately, which would be pretty wasteful in terms of hdf file
storage and access. Let me know if this suspicion is correct.

     Additionally, I'd like to work with unlimited length attributes.
I've successfully created an unlimited length attribute with
something of the form:

    // Create the attribute space
    hsize_t adims=s.size();
    hsize_t amax=H5S_UNLIMITED;
    hid_t attr_space=H5Screate_simple(1,&adims,&amax);

    // Set attribute chunk with size determined by def_chunk()
    hid_t dcpl2=H5Pcreate(H5P_DATASET_CREATE);
    hsize_t chunk2=def_chunk(s.size());
    int status3=H5Pset_chunk(dcpl2,1,&chunk2);

    // Create the attribute
    hid_t attr=H5Acreate(dset,"len",H5T_STD_I8LE,attr_space,H5P_DEFAULT,dcpl2);

then, in order to read the attribute later, I can do

    // Get attribute dimensions
    attr_space=H5Aget_space(attr);
    hsize_t adims;
    int nadims=H5Sget_simple_extent_dims(attr_space,&adims,0);

This looks a little unusual to me: I initially expected to need a
function named H5Aget_simple_extent_dims(), but no such function
exists. Is this approach still OK? Separately, I'm having trouble
resizing the attribute after the fact. I've tried

      hsize_t new_adims=s.size();
      int status3=H5Dset_extent(attr,&new_adims);

but this gives me an error:

HDF5-DIAG: Error detected in HDF5 (1.8.4) thread 0:
  #000: H5D.c line 1077 in H5Dset_extent(): not a dataset
    major: Invalid arguments to routine
    minor: Inappropriate type

I've tried using H5Aset_extent(), but no function exists with that
name. So the final question is what I'm doing wrong with
set_extent().

Thank you,
Andrew

Hi Andrew,

  I think you want a dataset with unlimited dimensions and a datatype that involves variable-length strings. You are correct though, there's a lot of indirection involved for that combination and it may not perform especially well.

  Ah, sorry, HDF5 doesn't currently support attributes with unlimited dimensions. (The H5Acreate call should actually fail in that case; I'll file a bug report for it.)

  Quincey

On Mar 5, 2010, at 2:48 PM, Andrew W. Steiner wrote:

Thank you again, Quincey. In case it helps, an h5dump of the associated
file (see below) shows that the HDF library thinks it is indeed creating an
attribute with unlimited dimensions:

   DATASET "testsa2" {
      DATATYPE H5T_STD_I8LE
      DATASPACE SIMPLE { ( 24 ) / ( H5S_UNLIMITED ) }
      DATA {
      (0): 84, 104, 105, 115, 105, 115, 97, 116, 101, 115, 116, 46, 65, 110,
      (14): 111, 116, 104, 101, 114, 116, 101, 115, 116, 46
      }
      ATTRIBUTE "len" {
         DATATYPE H5T_STD_I8LE
         DATASPACE SIMPLE { ( 4 ) / ( H5S_UNLIMITED ) }
         DATA {
         (0): 4, 2, 1, 5
         }
      }
   }

On Fri, Mar 5, 2010 at 4:50 PM, Quincey Koziol <koziol@hdfgroup.org> wrote:

   Ah, sorry, HDF5 doesn't currently support attributes with unlimited dimensions. (The H5Acreate call should actually fail in that case, I'll file a bug report for it).
