How to write variable length UTF8 attributes using jhdf5?

Hi,

I've seen the Java example for variable-length string datasets, but I only need to write some attributes. What I'm trying to do is create files with UTF-8 string attributes that can be easily read by h5py. But the only way h5py will work with UTF-8 attributes is if they're variable length.

The code below works fine for a fixed-length attribute, but crashes the JVM when I set the dtype size to H5T_VARIABLE.

import java.io.File;
import static ncsa.hdf.hdf5lib.H5.*;
import static ncsa.hdf.hdf5lib.HDF5Constants.*;
import org.apache.log4j.BasicConfigurator;

public class WriteVariableUTF8 {

    public static void main(String[] args) throws Exception {
        BasicConfigurator.configure(); // log4j.

        byte[] bytes = "world".getBytes("UTF-8");

        File path = new File("C:/Users/dan/variableutf8.h5");
        int file = H5Fcreate(path.getAbsolutePath(), H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

        int dtype = H5Tcopy(H5T_C_S1);
        H5Tset_cset(dtype, H5T_CSET_UTF8);
        H5Tset_size(dtype, H5T_VARIABLE); // works if I set size to bytes.length.
        H5Tset_strpad(dtype, H5T_STR_NULLTERM);

        int space = H5Screate(H5S_SCALAR);
        int attr = H5Acreate(file, "hello", dtype, space, H5P_DEFAULT, H5P_DEFAULT);
        H5Awrite(attr, dtype, bytes);

        H5Aclose(attr);
        H5Sclose(space);
        H5Fclose(file);
    }
}

Thanks for your help,

Dan Tetlow

Writing variable-length (VL) strings is supported for datasets, but not for
attributes. Writing VL non-string data is not supported for either datasets
or attributes. An existing issue (JAVA-1833) in our bug database covers these
limitations, and I will add this request for attribute support to it.

Allen
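
Until VL string attributes are supported, the fixed-length route mentioned in the question (setting the size to bytes.length) looks like the available workaround. One detail that matters there is that the size has to be the encoded UTF-8 byte count, not String.length(). Below is a minimal sketch in plain Java with no HDF5 calls. The class and method names (Utf8AttrBytes, encodeNullTerminated) are made up for illustration, and reserving one extra byte for a trailing NUL is an assumption on my part, not something confirmed in this thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of jhdf5.
final class Utf8AttrBytes {

    // Encode a string as UTF-8 and append a single zero byte.
    // Assumption: the attribute uses H5T_STR_NULLTERM and should leave
    // room for the terminator inside the fixed size.
    static byte[] encodeNullTerminated(String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        // Arrays.copyOf pads the extra position with 0.
        return Arrays.copyOf(utf8, utf8.length + 1);
    }

    public static void main(String[] args) {
        String value = "w\u00f6rld"; // 5 chars, but 6 bytes in UTF-8
        byte[] bytes = encodeNullTerminated(value);
        System.out.println("chars=" + value.length() + " bytes=" + bytes.length);
    }
}
```

The length of the returned array would be the value passed to H5Tset_size in place of H5T_VARIABLE, and the array itself the buffer passed to H5Awrite, as in the code from the question. Whether h5py then reads the attribute the way you need is something to verify on your side.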

On Wednesday, April 16, 2014 12:06:41 PM Daniel Tetlow wrote:

Hi,

I've seen the Java example for variable length string datasets, but I just
want to write some attributes. What I'm trying to do is create files with
UTF8 string attributes that can be easily read by h5py. But the only way
h5py will work with UTF8 attributes is if they're variable length.

The code below works fine for a fixed length attribute, but crashes the JVM
when I set the dtype size to H5T_VARIABLE.

import java.io.File;
import static ncsa.hdf.hdf5lib.H5.*;
import static ncsa.hdf.hdf5lib.HDF5Constants.*;
import org.apache.log4j.BasicConfigurator;

public class WriteVariableUTF8 {

    public static void main(String[] args) throws Exception {
        BasicConfigurator.configure(); // log4j.

        byte[] bytes = "world".getBytes("UTF-8");

        File path = new File("C:/Users/dan/variableutf8.h5");
        int file = H5Fcreate(path.getAbsolutePath(), H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

        int dtype = H5Tcopy(H5T_C_S1);
        H5Tset_cset(dtype, H5T_CSET_UTF8);
        H5Tset_size(dtype, H5T_VARIABLE); // works if I set size to bytes.length.
        H5Tset_strpad(dtype, H5T_STR_NULLTERM);

        int space = H5Screate(H5S_SCALAR);
        int attr = H5Acreate(file, "hello", dtype, space, H5P_DEFAULT, H5P_DEFAULT);
        H5Awrite(attr, dtype, bytes);

        H5Aclose(attr);
        H5Sclose(space);
        H5Fclose(file);
    }
}

Thanks for your help,

Dan Tetlow