H5.H5Pset_deflate results in larger file size

Hi,

I'm trying to write several arrays of Strings with compression using the
lower-level H5 functions. However, I'm finding that my files are larger
with compression enabled. Has anyone experienced this? I wasn't sure how to
chunk an array of length 8, so I tried cutting it in half.

I copied and pasted parts of my code below:

String[] array = new String[8];
....
long[] dims = { array.length };
long[] chunk_dims = { array.length / 2 };
...
// Create the type id.
try {
    type_id = H5.H5Tcopy(HDF5Constants.H5T_C_S1);
    H5.H5Tset_size(type_id, HDF5Constants.H5T_VARIABLE);
}
catch (Exception e) {
    e.printStackTrace();
}

// Create the dataset creation property list and add the gzip compression
// filter.
try {
    dcpl_id = H5.H5Pcreate(HDF5Constants.H5P_DATASET_CREATE);
    if (dcpl_id >= 0) {
        H5.H5Pset_deflate(dcpl_id, 9);
        // Set the chunk size.
        H5.H5Pset_chunk(dcpl_id, 1, chunk_dims);
    }
}
catch (Exception e) {
    e.printStackTrace();
}

try {
    if (dataspace_id >= 0) {
        dataset_id = H5.H5Dcreate(file_id, NAME, type_id, dataspace_id,
                HDF5Constants.H5P_DEFAULT, dcpl_id,
                HDF5Constants.H5P_DEFAULT);
    }
}
catch (Exception e) {
    e.printStackTrace();
}

···

--
View this message in context: http://hdf-forum.184993.n3.nabble.com/H5-H5Pset-deflate-results-in-larger-file-size-tp4025851.html
Sent from the hdf-forum mailing list archive at Nabble.com.

Compression does not work well on variable-length data. The file comes out
larger because of the overhead of storing the compression and chunking
metadata.

···

On 2/5/2013 4:02 PM, heatherk wrote:


_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

Indeed,

Variable-length data does not get compressed by HDF5 at this point. Some
earlier posts in this forum already mention this.

The variable-length data itself is actually stored elsewhere in the file;
only the references to it pass through the filter pipeline, so only those
references get compressed. You have two solutions, as explained in this
previous post:

http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/2011-October/005136.html

Good luck!
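One common workaround is the first option in the linked post: store the strings as a fixed-length string type instead, since fixed-size data does go through the deflate filter. A minimal sketch of the data-preparation side, assuming you pad every string with NULs to a common element size and would then call `H5.H5Tset_size(type_id, maxLen)` with that size instead of `H5T_VARIABLE` (method names here are illustrative, not from the original post):

```java
import java.nio.charset.StandardCharsets;

public class FixedLengthStrings {

    // Pack an array of strings into one contiguous, NUL-padded byte buffer
    // with maxLen bytes per element -- the layout a fixed-size H5T_C_S1
    // datatype expects when writing with H5Dwrite.
    static byte[] pack(String[] array, int maxLen) {
        byte[] buf = new byte[array.length * maxLen]; // zero-filled = NUL padding
        for (int i = 0; i < array.length; i++) {
            byte[] s = array[i].getBytes(StandardCharsets.US_ASCII);
            if (s.length > maxLen) {
                throw new IllegalArgumentException("string longer than element size");
            }
            System.arraycopy(s, 0, buf, i * maxLen, s.length);
        }
        return buf;
    }

    // Longest encoded string length in the array, used as the fixed element size.
    static int maxLength(String[] array) {
        int max = 0;
        for (String s : array) {
            max = Math.max(max, s.getBytes(StandardCharsets.US_ASCII).length);
        }
        return max;
    }

    public static void main(String[] args) {
        String[] array = { "alpha", "beta", "gamma", "delta" };
        int maxLen = maxLength(array);   // 5
        byte[] buf = pack(array, maxLen);
        System.out.println(maxLen);      // prints 5
        System.out.println(buf.length);  // prints 20 (4 elements x 5 bytes)
    }
}
```

The trade-off is wasted space when string lengths vary widely, but for arrays of similar-length strings the fixed-size buffer compresses well, which variable-length references never do.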

···
