HDF5FunctionArgumentException: Inappropriate type for Datatype.CLASS_ARRAY


#1

I am currently trying to get acquainted with the creation of compound datasets and ran into a question. In the end I need to write something like the following:

image

To get things started, I tried creating a compound dataset of primitives (based on Link), and this worked fine:

@Test
public void testCompoundDataSet() throws Exception{
    
    String fileName = "./test.h5";
    FileFormat fileFormat = FileFormat.getFileFormat(FileFormat.FILE_TYPE_HDF5);
    H5File file = (H5File) fileFormat.createFile(fileName, FileFormat.FILE_CREATE_DELETE);
    
    // This is the total amount of entries for each type in the compound
    int DIM_SIZE = 500;
    
    // Arbitrary string length
    int STR_LEN = 20;
    
    String message = "";
    Group pgroup = null;
    
    // We create dummy arrays with default values
    int[] DATA_INT = new int[DIM_SIZE];
    float[] DATA_FLOAT = new float[DIM_SIZE];
    String[] DATA_STR = new String[DIM_SIZE];
    
    // We set the dimensions we want to store the data in - the product of all values has to be the dimension size
    long[] DIMs = { 500, 1 };
    
    // Chunks the data is stored in
    long[] CHUNKs = { 50, 1 };
    
    // Set up the data object
    Vector<Object> data = new Vector<>();
    data.add(DATA_INT);
    data.add(DATA_FLOAT);
    data.add(DATA_STR);

    // Define the compound member names and datatypes
    String[] mnames = { "int", "float", "string" };
    Datatype[] mdtypes = new H5Datatype[3];
    
    mdtypes[0] = new H5Datatype(Datatype.CLASS_INTEGER, 4, Datatype.NATIVE, Datatype.NATIVE);
    mdtypes[1] = new H5Datatype(Datatype.CLASS_FLOAT, 4, Datatype.NATIVE, Datatype.NATIVE);
    mdtypes[2] = new H5Datatype(Datatype.CLASS_STRING, STR_LEN, Datatype.NATIVE, Datatype.NATIVE);
    Dataset dset = file.createCompoundDS("/CompoundDS", pgroup, DIMs, null, CHUNKs, 9, mnames, mdtypes, null, data);
}
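
The comment about dimensions in the test above comes down to simple arithmetic. As a minimal sketch in plain Java (no HDF library involved; the names `ChunkMath`, `elementCount`, and `chunkCount` are hypothetical and exist only for this illustration), the following checks that the dataspace dimensions multiply out to the number of records and counts how many chunks a given chunk shape produces:

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the HDF object library.
class ChunkMath {

    // Total number of elements in a dataspace: the product of all dimensions.
    static long elementCount(long[] dims) {
        long count = 1;
        for (long d : dims) {
            count *= d;
        }
        return count;
    }

    // Number of chunks needed to cover the dataspace. Each dimension is
    // rounded up because a partially filled edge chunk is still a whole chunk.
    static long chunkCount(long[] dims, long[] chunks) {
        if (dims.length != chunks.length) {
            throw new IllegalArgumentException("dims and chunks must have the same rank");
        }
        long count = 1;
        for (int i = 0; i < dims.length; i++) {
            count *= (dims[i] + chunks[i] - 1) / chunks[i];
        }
        return count;
    }

    public static void main(String[] args) {
        long[] dims = { 500, 1 };   // same values as DIMs in the test
        long[] chunks = { 50, 1 };  // same values as CHUNKs in the test
        System.out.println(Arrays.toString(dims) + " holds " + elementCount(dims) + " elements");
        System.out.println("covered by " + chunkCount(dims, chunks) + " chunks");
    }
}
```

For the values used in the test, this gives 500 elements spread over 10 chunks.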

In the next step, I tried to create a compound dataset with a single integer array column. Unfortunately, I get an HDF5FunctionArgumentException

hdf.hdf5lib.exceptions.HDF5FunctionArgumentException: Inappropriate type
  at hdf.hdf5lib.H5.H5Tget_size(Native Method)
  at hdf.object.h5.H5CompoundDS.create(H5CompoundDS.java:1972)
  at hdf.object.h5.H5File.createCompoundDS(H5File.java:1660)

for this code

@Test
public void testCompoundDataSetArray() throws Exception{
    
    String fileName = "./test.h5";
    FileFormat fileFormat = FileFormat.getFileFormat(FileFormat.FILE_TYPE_HDF5);
    H5File file = (H5File) fileFormat.createFile(fileName, FileFormat.FILE_CREATE_DELETE);
    
    // This is the total amount of entries for each type in the compound
    int DIM_SIZE = 2;        
    
    String message = "";
    Group pgroup = null;
    
    // We create dummy arrays with default values
    int dim2 = 2;
    int[][] DATA_INT = new int[DIM_SIZE][dim2];// { new int[]{1,2}, new int[]{3,4} };
    for (int ii = 0; ii < DIM_SIZE; ii++){
        for (int jj = 0; jj < dim2; jj++){
            DATA_INT[ii][jj] = RandomUtils.randomInt(0,10);
        }
    }
    
    // We set the dimensions we want to store the data in - the product of all values has to be the dimension size
    long[] DIMs = { DIM_SIZE, 1 };
    
    // Chunks the data is stored in
    long[] CHUNKs = { 1, 1 };
    
    // Set up the data object
    Vector<Object> data = new Vector<>();
    data.add(DATA_INT);

    // Define the compound member names and datatypes
    String[] mnames = { "int" };
    Datatype[] mdtypes = new H5Datatype[1];
    
    H5Datatype intValueDataType = (H5Datatype)file.createDatatype(Datatype.CLASS_INTEGER, 4, Datatype.NATIVE, Datatype.NATIVE);
    H5Datatype intArrayDataType = (H5Datatype)file.createDatatype(Datatype.CLASS_ARRAY, Datatype.NATIVE, Datatype.NATIVE, Datatype.NATIVE, intValueDataType);
    
    mdtypes[0] = intArrayDataType;
    
    int[] msizes = new int[]{2};
    
    Dataset dset = file.createCompoundDS("/CompoundDS", pgroup, DIMs, null, CHUNKs, 9, mnames, mdtypes, msizes, data);
}

So I guess there is something wrong with the way I define H5Datatype intArrayDataType. Can anyone please explain what I am missing?

Environment: HDFView 3.1.3, jarhdf5-1.10.7.jar, jarhdf-4.2.15.jar and the respective hdfobject.jar


#2

Hi @martin.raedel,

Would you mind telling me the names, data types, and dimensions of all the members that compose your compound dataset? This information may help me understand what you are aiming for and eventually provide you with a solution.


#3

Sure. For the simple test of writing a compound dataset with a single integer array “column”, the values are 32-bit integers. In the test method I aimed to create 2 entries of int[2], so the compound dataset should finally have looked something like this:

To be honest, I did not quite understand how to create the array datatype, as the example in the API description does not really mention it and I could not find another decent example that does not rely on JNI. My thought process was to create a 32-bit integer type (intValueDataType in my code) as the tbase Datatype for the second call to createDatatype (for intArrayDataType in my code). I’d like to understand the right approach for creating an array datatype with arbitrary base types.

From there the next step for me would be the more complex example from the first figure in the original question with the following definition:


#4

OK, so I got it working without even requiring Datatype.CLASS_ARRAY. For the sake of completeness, this is what I came up with:

@Test
public void testCompoundDataSetArray() throws Exception{
    
    String fileName = "./test.h5";
    FileFormat fileFormat = FileFormat.getFileFormat(FileFormat.FILE_TYPE_HDF5);
    H5File file = (H5File) fileFormat.createFile(fileName, FileFormat.FILE_CREATE_DELETE);
    
    // This is the total amount of entries for each type in the compound
    int DIM_SIZE = 2;        
    
    String message = "";
    Group pgroup = null;
    
    // We create dummy arrays with default values
    int dim2 = 2;
    int[][] DATA_INT = new int[DIM_SIZE][dim2];
    for (int ii = 0; ii < DIM_SIZE; ii++){
        for (int jj = 0; jj < dim2; jj++){
            DATA_INT[ii][jj] = RandomUtils.randomInt(0,10);
        }
    }
    
    // We set the dimensions we want to store the data in - the product of all values has to be the dimension size
    long[] DIMs = { DIM_SIZE, 1 };
    
    // Chunks the data is stored in
    long[] CHUNKs = { 1, 1 };
    
    // Set up the data object
    Vector<Object> data = new Vector<>();
    data.add(DATA_INT);

    // Define the compound member names and datatypes
    String[] mnames = { "int" };
    Datatype[] mdtypes = new H5Datatype[1];
    
    H5Datatype intValueDataType = (H5Datatype)file.createDatatype(Datatype.CLASS_INTEGER, 4, Datatype.NATIVE, Datatype.NATIVE);
    
    mdtypes[0] = intValueDataType;
    
    int[] msizes = new int[]{2};
    
    Dataset dset = file.createCompoundDS("/CompoundDS", pgroup, DIMs, null, CHUNKs, 9, mnames, mdtypes, msizes, data);
}

which results in

However, I would still be interested in understanding what the Datatype.CLASS_ARRAY is for, if anyone wants to elaborate :wink:


#5

Martin, how are you? The purpose of H5T_ARRAY datatypes is to represent dataset elements whose values are fixed-size, multi-dimensional arrays. Let’s say I want to record positions and speeds in a 3D cartesian reference system, then it might be convenient to represent them as two datasets whose element type is of class H5T_ARRAY. Obviously, there are many other possible representations. Instead of having a 1D dataset of an H5T_ARRAY-type, you could have 2D floating-point datasets of 3xN or Nx3 where the rows or columns (3) represent the different coordinate axes. You could also create elaborate compound datatypes, groups of columns, etc.
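
To make the alternatives in the position example concrete, here is a minimal sketch in plain Java, with no HDF library calls. It assumes HDF5's row-major (C order) convention and uses hypothetical names (`PositionLayouts`, `recordMajor`, `axisMajor`). It flattens the same N positions once as N records of a 3-vector, which is how a 1D dataset of array-typed elements or an Nx3 dataset would be ordered, and once as a 3xN dataset:

```java
import java.util.Arrays;

// Hypothetical illustration in plain Java; no HDF library calls are made.
class PositionLayouts {

    // N records, each a 3-vector: x, y, z of one position stay together.
    // This is the ordering of a 1D dataset of 3-element array values,
    // and equally of an N x 3 row-major dataset.
    static double[] recordMajor(double[][] positions) {
        int n = positions.length;
        double[] flat = new double[n * 3];
        for (int i = 0; i < n; i++) {
            for (int axis = 0; axis < 3; axis++) {
                flat[i * 3 + axis] = positions[i][axis];
            }
        }
        return flat;
    }

    // 3 x N row-major: all x values first, then all y, then all z.
    static double[] axisMajor(double[][] positions) {
        int n = positions.length;
        double[] flat = new double[3 * n];
        for (int axis = 0; axis < 3; axis++) {
            for (int i = 0; i < n; i++) {
                flat[axis * n + i] = positions[i][axis];
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        double[][] positions = { { 1, 2, 3 }, { 4, 5, 6 } };
        System.out.println(Arrays.toString(recordMajor(positions)));
        System.out.println(Arrays.toString(axisMajor(positions)));
    }
}
```

The two buffers hold the same numbers in a different order, which is one reason the choice of representation can affect how efficiently a single position or a single axis is read back.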

All of that is your choice, and, in some cases, there are profound performance implications. As always, apply the KISS principle! Be practical and try to make your data products accessible from most tools in the wider ecosystem. (… and the creation of complex or exotic user-defined datatypes is one of the surest ways to paint yourself into a corner.)

OK?
G.


#6

Got it, thx. I did miss the fact that it is solely for multi-dimensional arrays.


#7

Yes, but one-dimensional arrays are also included in my definition of multi-dimensional arrays. In the position example, it’s a 3-vector. If you want to emphasize that it’s a row or column, you can make it 3x1 or 1x3, but that’s cosmetic. G.
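
The point about 3, 3x1, and 1x3 being a cosmetic difference can be checked with a small sketch in plain Java, assuming row-major (C order) storage. The names `ShapeOffsets` and `offset` are hypothetical and not part of any HDF API:

```java
// Hypothetical illustration in plain Java; no HDF library calls are made.
class ShapeOffsets {

    // Row-major (C order) linear offset of a multi-dimensional index.
    static long offset(long[] shape, long[] index) {
        if (shape.length != index.length) {
            throw new IllegalArgumentException("shape and index must have the same rank");
        }
        long off = 0;
        for (int d = 0; d < shape.length; d++) {
            if (index[d] < 0 || index[d] >= shape[d]) {
                throw new IndexOutOfBoundsException("index out of range in dimension " + d);
            }
            off = off * shape[d] + index[d];
        }
        return off;
    }

    public static void main(String[] args) {
        // Component k of a 3-vector, addressed through three different shapes.
        for (long k = 0; k < 3; k++) {
            long plain = offset(new long[] { 3 }, new long[] { k });
            long column = offset(new long[] { 3, 1 }, new long[] { k, 0 });
            long row = offset(new long[] { 1, 3 }, new long[] { 0, k });
            System.out.println(k + ": " + plain + " " + column + " " + row);
        }
    }
}
```

Each component lands at the same linear offset under all three shapes, so the stored values are laid out identically and only the declared shape differs.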