Expanding compound dataset - Java

I created a compound dataset using the following API.

        file = new H5File(FILENAME, FileFormat.CREATE);
        file.open();

        long[] dims = {10000, 10};
        long[] maxdims = {-1, -1};
        long[] chunks = {100, 10};
        int gzip = 4;

        Dataset d = file.createCompoundDS(FILENAME, null, dims, maxdims,
                chunks, gzip, memberNames, memberDatatypes, memberSizes, data);

Now, a dataset "d" of size 10000x10 is created. How do I extend the created
dataset?
(i.e., I want to add 10000 more rows to it.)

I tried using,

        long[] start = dataset.getStartDims();
        long[] sizes = dataset.getSelectedDims();

        start[0] = 10001;
        start[1] = 0;

        sizes[0] = 20001;
        sizes[1] = 10;

        dataset.write(new_data);

I expected the dataset to be extended (from 10000 rows to 20000 rows).
But instead the already-written data is overwritten with new_data. :frowning:

May I know what I am missing here? How do I extend a compound dataset?

Thanks in advance.

Kalpa

Hi Kalpa,
Being "compound" doesn't really matter; that's just the datatype and size needed per record. Here is how I extend datasets, compound or otherwise.

1) Call H5D.setExtent to enlarge the extents. (http://www.hdfgroup.org/HDF5/doc/RM/RM_H5D.html#Dataset-SetExtent)

2) Close your dataspace and get it again to make sure you have the new extents

3) Use a hyperslab (H5S.selectHyperslab) to specify the destination (may be other ways, but this works for me)

4) Call H5D.write, using the overload that takes your hyperslab dataspace. It is this dataspace that tells write how and where to write the data into the dataset.
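In Java, the object API the original post uses sits on top of the JNI wrapper, so the same four steps might look roughly like the sketch below. This is a sketch only: the `ncsa.hdf.hdf5lib` method names match the 2014-era Java bindings, the dataset path "/mydata" is a placeholder, `typeId` is assumed to be a valid compound type id obtained elsewhere, and `newData` is assumed to be a byte buffer already packed to match that compound layout.

```java
import ncsa.hdf.hdf5lib.H5;
import ncsa.hdf.hdf5lib.HDF5Constants;

public class AppendSketch {
    // Appends newRows rows (cols columns each) to a 2-D extendible dataset.
    static void appendRows(int fileId, int typeId, byte[] newData,
                           long newRows, long cols) throws Exception {
        int dsId = H5.H5Dopen(fileId, "/mydata", HDF5Constants.H5P_DEFAULT);

        // 1) Enlarge the extents along the first (unlimited) dimension.
        int spaceId = H5.H5Dget_space(dsId);
        long[] dims = new long[2];
        H5.H5Sget_simple_extent_dims(spaceId, dims, null);
        long[] newDims = {dims[0] + newRows, cols};
        H5.H5Dset_extent(dsId, newDims);

        // 2) Close the dataspace and reopen it so it reflects the new extents.
        H5.H5Sclose(spaceId);
        spaceId = H5.H5Dget_space(dsId);

        // 3) Select only the newly added rows as a hyperslab in the file space.
        long[] start = {dims[0], 0};      // first new row
        long[] count = {newRows, cols};
        H5.H5Sselect_hyperslab(spaceId, HDF5Constants.H5S_SELECT_SET,
                               start, null, count, null);

        // 4) Write with a memory dataspace shaped like the new data and the
        //    hyperslab selection as the file dataspace.
        int memSpaceId = H5.H5Screate_simple(2, count, null);
        H5.H5Dwrite(dsId, typeId, memSpaceId, spaceId,
                    HDF5Constants.H5P_DEFAULT, newData);

        H5.H5Sclose(memSpaceId);
        H5.H5Sclose(spaceId);
        H5.H5Dclose(dsId);
    }
}
```

The key point for the original question is the pair of dataspace ids passed to H5Dwrite: writing through the high-level Dataset.write with a full selection is what causes the existing rows to be overwritten instead of appended.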

Below is a generic function I wrote (C#, but it should look similar in Java) to "append" data by extending the first dimension of an n-dimensional array (particular to the data I work with).
The other thing I did when using compound types was to write a static method that returns an H5DataTypeId for the structure. You could also store the type in the HDF5 file, but I haven't done that.
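In Java terms, such a static type-building helper might look like the following sketch. The record layout here (an int field plus a double field, packed into 12 bytes) and the field names are invented purely for illustration; the H5T calls are from the same JNI layer as above.

```java
import ncsa.hdf.hdf5lib.H5;
import ncsa.hdf.hdf5lib.HDF5Constants;

public class RecordType {
    // Builds a compound datatype id for a hypothetical {int id; double value}
    // record. The caller is responsible for H5.H5Tclose on the returned id.
    static int createCompoundType() throws Exception {
        int typeId = H5.H5Tcreate(HDF5Constants.H5T_COMPOUND, 12);
        H5.H5Tinsert(typeId, "id", 0, HDF5Constants.H5T_NATIVE_INT);
        H5.H5Tinsert(typeId, "value", 4, HDF5Constants.H5T_NATIVE_DOUBLE);
        return typeId;
    }
}
```

The same id can then be passed as the mem_type_id argument of H5Dwrite in the append sketch.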

Warm Regards,
Jim

private void AppendDataAny<T>(string dataSetPath, Array theData)
        {
#if !HDFMT
            lock (_staticObjectLock)
            {
#endif
                CheckOpenFile();
                //Nothing to append, or rank unsupported (HDF5 allows at most 32 dimensions).
                if (theData.Length == 0 || theData.GetUpperBound(0) == 0 || theData.Rank > 32) return;

                int rank = theData.Rank;
                H5DataSetId dataSetId = H5D.open(_H5FileId, dataSetPath); //Get the dataSetID based off of the data name.
                H5DataSpaceId dataSpaceId = H5D.getSpace(dataSetId); //get the dataSpaceID of the dataSetID
                long[] existingDataExtents = H5S.getSimpleExtentDims(dataSpaceId);
                long[] newDataDims = theData.GetSize();
                long[] newDataExtents = theData.GetSize();
                newDataExtents[0] = existingDataExtents[0] + newDataDims[0];

                H5D.setExtent(dataSetId, newDataExtents);

                H5S.close(dataSpaceId);
                dataSpaceId = H5D.getSpace(dataSetId);

                //Allocate some memory space to hold the new data in memory during I/O
                H5DataSpaceId memSpaceId = H5S.create_simple(rank, newDataDims);

                //Select the area of data that we'll want to write to in the data file using a 'hyperslab'
                long[] offset = new long[rank];
                offset[0] = existingDataExtents[0];
                long[] count = newDataDims;
                H5S.selectHyperslab(dataSpaceId, H5S.SelectOperator.SET, offset, count);

                //Now that we have all the pre-work figured out, we can actually write some data out.
                H5DataTypeId dataTypeId = GetH5NativeType(typeof (T));
                H5D.write(dataSetId, dataTypeId, memSpaceId, dataSpaceId, _defaultPropsId, new H5Array<T>(theData));

                _log.DebugFormatIfEnabled("Appended data to: {0}. Offset: {1}. Data dims: {2}.", dataSetPath, existingDataExtents[0], newDataDims.PrintValues());

                //Cleanup
                CloseStandardTypeIds<T>(dataTypeId, memSpaceId, dataSpaceId, dataSetId);
#if !HDFMT
            }
#endif
        }


From: Hdf-forum [mailto:hdf-forum-bounces@lists.hdfgroup.org] On Behalf Of Karpaga Rajadurai
Sent: Friday, May 2, 2014 4:31 AM
To: hdf-forum@lists.hdfgroup.org
Subject: [Hdf-forum] Expanding compound dataset - Java
