hdf-java: problem reading a dataset of strings

Hi,

I am having problems reading a dataset of strings with hdf-java. I
can create a new two-dimensional dataset of strings with dimensions 3 x 2 and
length 100, both in code and in HDFView. The h5dump utility shows it as
expected:

$ h5dump test.h5
HDF5 "test.h5" {
GROUP "/" {
   DATASET "test" {
      DATATYPE H5T_STRING {
            STRSIZE 5;
            STRPAD H5T_STR_NULLPAD;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
      DATASPACE SIMPLE { ( 3, 2 ) / ( 3, 2 ) }
      DATA {
      (0,0): "\000\000\000\000\000", "\000\000\000\000\000",
      (1,0): "\000\000\000\000\000", "\000\000\000\000\000",
      (2,0): "\000\000\000\000\000", "\000\000\000\000\000"
      }
   }
}
}

When I read this dataset in HDFView or in my code, the array of data is
not read correctly.
In HDFView, the table is displayed with only one column, although the
metadata indicates 3 x 2.
In my code using hdf-java, I cannot read the complete dataset. The
array returned has only 2 values.

H5ScalarDS dataset = ....
....
long[] dims = dataset.getDims();
The values of dims are [3, 2], which is correct.

Object data = dataset.read();
Array.getLength(data) returns 2, but I expected 6, one entry per string.

There is no problem with datasets of other types, integers for
example.

I'm using hdf-java-2.7 and HDFView Version 2.7.

Thanks

Guillaume

Guillaume,

Text strings can only be viewed one dimension at a time in HDFView. There is no good
way to show a multidimensional array of strings. See the details in the User's Guide at
http://www.hdfgroup.org/hdf-java-html/hdfview/UsersGuide/ug07textview.html.

By default, hdf-java only loads strings one dimension at a time. However, you can
select any subset of your array. For example, the following code will select the
whole dataset.
         // set this before dataset.read()
         int rank = dataset.getRank();
         long[] dims = dataset.getDims();
         long[] selected = dataset.getSelectedDims();
         System.arraycopy(dims, 0, selected, 0, rank);
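
With the whole dataset selected, read() should hand back the strings as one
flat array (six entries for the 3 x 2 case). Below is a minimal sketch of
turning that flat array into a 3 x 2 grid. It assumes the result is a String[]
in row-major (C) order; StringGrid and reshape are hypothetical names used
only for illustration and are not part of hdf-java.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of hdf-java. */
final class StringGrid {
    private StringGrid() {}

    /** Reshapes a flat, row-major array into a dims[0] x dims[1] grid. */
    static String[][] reshape(String[] flat, long[] dims) {
        if (dims.length != 2) {
            throw new IllegalArgumentException("expected a rank-2 dataset");
        }
        int rows = (int) dims[0];
        int cols = (int) dims[1];
        if (flat.length != rows * cols) {
            throw new IllegalArgumentException(
                    "expected " + (rows * cols) + " elements, got " + flat.length);
        }
        String[][] grid = new String[rows][cols];
        for (int r = 0; r < rows; r++) {
            // In row-major order, element (r, c) sits at index r * cols + c.
            System.arraycopy(flat, r * cols, grid[r], 0, cols);
        }
        return grid;
    }

    public static void main(String[] args) {
        // Stand-in for the array returned by dataset.read().
        String[] flat = {"a", "b", "c", "d", "e", "f"};
        String[][] grid = reshape(flat, new long[] {3, 2});
        System.out.println(Arrays.deepToString(grid)); // [[a, b], [c, d], [e, f]]
    }
}
```

In real code the flat array would presumably come from a cast of
dataset.read() and the dimensions from dataset.getDims(). The length check
fails fast if only part of the dataset was selected.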

Thanks
--pc

On 3/29/2011 3:46 AM, Guillaume PRIN wrote:

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

Here is a method that returns the data as a two-dimensional string array.

     /**
      * Gets a 2d string array from the given hdf5 file and dataset.
      *
      * @param hdfFile the hdf5 file object
      * @param path the full path to the dataset
      *
      * @return the dataset's data
      *
      * @throws java.lang.Exception
      */
     public static String[][] get2dStringArray(FileFormat hdfFile, String path) throws Exception
     {
         String[][] stringArray = null;
         Dataset dataset = (Dataset) hdfFile.get(path);

         if (dataset != null)
         {
             dataset.init();

             long[] dims = dataset.getDims();
             long[] start = dataset.getStartDims();
             stringArray = new String[(int) dims[0]][(int) dims[1]];

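             for (int i = 0; i < dims[0]; i++)
             {
                 // Move the selection to row i (this assumes the default
                 // selection covers a single row of strings).
                 start[0] = i;

                 // Read the selected row as raw bytes, then split the buffer
                 // into fixed-length strings.
                 byte[] dataRead = dataset.readBytes();
                 stringArray[i] = Dataset.byteToString(dataRead,
                         dataset.getDatatype().getDatatypeSize());
             }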

             dataset.clear();
         }

         return stringArray;
     }
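
To make the byte-to-string step above concrete, here is a minimal sketch of
splitting a raw buffer of fixed-length, null-padded ASCII strings (as in the
H5T_STR_NULLPAD type from the dump) into separate strings. It illustrates the
idea only and is not the library's implementation; FixedStrings and split are
hypothetical names.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of hdf-java. */
final class FixedStrings {
    private FixedStrings() {}

    /** Splits buf into strings of strSize bytes each, cutting each at the first NUL. */
    static String[] split(byte[] buf, int strSize) {
        if (strSize <= 0 || buf.length % strSize != 0) {
            throw new IllegalArgumentException("buffer length must be a multiple of strSize");
        }
        int count = buf.length / strSize;
        String[] out = new String[count];
        for (int i = 0; i < count; i++) {
            int offset = i * strSize;
            // Find the used length of this slot: stop at the first NUL pad byte.
            int len = 0;
            while (len < strSize && buf[offset + len] != 0) {
                len++;
            }
            out[i] = new String(buf, offset, len, StandardCharsets.US_ASCII);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two 5-byte slots: "ab" padded with NULs, then "hello".
        byte[] raw = {'a', 'b', 0, 0, 0, 'h', 'e', 'l', 'l', 'o'};
        String[] strings = split(raw, 5);
        System.out.println(strings.length + " " + strings[0] + " " + strings[1]); // 2 ab hello
    }
}
```

For the dataset in the dump, where every slot is five NUL bytes, a split like
this would produce empty strings.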

--Christian

On 3/29/2011 9:23 AM, Peter Cao wrote:
