Endian issue with fractal heap ID for managed objects

Hello,

I have a strange error where HDF5 files written with NetCDF4 and HDF5-1.8.4 on a big-endian machine (Power6) cannot be read on little-endian machines, and vice versa. This isn't a NetCDF issue, since pure HDF5 files written with the attached program show the same behaviour. It turns out that the "Fractal Heap ID for Managed Objects" in the "Version 2 B-tree, Type 8 Record Layout" isn't written correctly on big endian. The ID is constructed from version, offset, and length with the macro

#define H5HF_MAN_ID_ENCODE(i, h, o, l) \
    *(i) = H5HF_ID_VERS_CURR | H5HF_ID_TYPE_MAN; \
    (i)++; \
    UINT64ENCODE_VAR((i), (o), (h)->heap_off_size); \
    UINT64ENCODE_VAR((i), (l), (h)->heap_len_size)

so offset and length are already encoded in the file's byte order. Prior to actually writing the ID, however, it is encoded a second time:

In H5A_dense_btree2_name_encode():

UINT64ENCODE(raw, nrecord->id);

And this breaks on big endian, because the already-encoded bytes get byte-swapped a second time. Just copying nrecord->id to raw works for me, but I don't know whether this covers all use cases of H5A_dense_btree2_name_encode().

Cheers,
Mathis

endian_error.c (1.17 KB)


--
Mathis Rosenhauer
Application support
German Climate Computing Center

Mathis,

Thank you for your report. We are aware of the issue and working on it. Your program example is very helpful.

Thank you!

Elena


On Feb 17, 2010, at 6:55 AM, Mathis Rosenhauer wrote:

#include "hdf5.h"

int main(void)
{
  hid_t file, dataset, fid, str_id, attribute, stype;
  hsize_t dimsf[1];
  herr_t ret;
  float time[] = {0.0};
  hid_t fapl_id;
  int i;
  char string[6];

  fapl_id = H5Pcreate(H5P_FILE_ACCESS);
  /* only happens with V2 file format structures */
  H5Pset_libver_bounds(fapl_id, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);

  file = H5Fcreate("out.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl_id);

  dimsf[0] = 1;
  fid = H5Screate_simple(1, dimsf, NULL);
  dataset = H5Dcreate2(file, "data", H5T_NATIVE_FLOAT, fid, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
  ret = H5Dwrite(dataset, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL, H5P_DEFAULT, time);

  /* We need at least 9 attributes to trigger the error (dense attribute storage) */
  for (i = 0; i < 9; i++)
  {
    str_id = H5Screate(H5S_SCALAR);
    stype = H5Tcopy(H5T_C_S1);
    H5Tset_size(stype, 4);
    H5Tset_strpad(stype, H5T_STR_NULLTERM);
    sprintf(string, "attr%i", i);
    attribute = H5Acreate2(dataset, string, stype, str_id, H5P_DEFAULT, H5P_DEFAULT);
    ret = H5Awrite(attribute, stype, "1234");
    ret = H5Sclose(str_id);
    ret = H5Aclose(attribute);
    H5Tclose(stype);
  }

  H5Sclose(fid);
  H5Dclose(dataset);
  H5Fclose(file);
  H5Pclose(fapl_id);

  return 0;
}
_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

Hi Mathis,


And this breaks on big endian. Just copying nrecord->id to raw works for me but I don't know if this covers all use cases of H5A_dense_btree2_name_encode().

  Hmm, this may be related, but probably isn't the root cause of this problem, since there's a corresponding decode macro for the ID in H5A_dense_btree2_name_decode(). (And the H5HF_MAN_ID_DECODE() macro looks correct also).

  Quincey


Hi Mathis,


  Hmm, this may be related, but probably isn't the root cause of this problem, since there's a corresponding decode macro for the ID in H5A_dense_btree2_name_decode(). (And the H5HF_MAN_ID_DECODE() macro looks correct also).

  On second thought, I think you are correct, this is the problem. Sorry for my earlier response - we are working on putting a fix into the library right now.

  Quincey


On Feb 18, 2010, at 10:13 AM, Quincey Koziol wrote:


Hi Quincey,

Quincey Koziol wrote:


  On second thought, I think you are correct, this is the problem. Sorry for my earlier response - we are working on putting a fix into the library right now.

It works perfectly right now if you stay on the same architecture, because the encoding and decoding are symmetric ;). One problem with a fix will be that you won't be able to read old files from BE machines with newer HDF5 releases. People here who are preparing the CMIP5 runs (they discovered the problem) tell me that they don't yet have HDF5 files they want to keep, but that's probably an exception. So it might be good to have a tool to fix the old files.

Cheers,
Mathis


Hi Mathis,


It works perfectly right now if you stay on the same architecture, because the encoding and decoding are symmetric ;).

  Yes, that's why we didn't catch it earlier. :(

One problem with a fix will be that you won't be able to read old files from BE machines with newer hdf5 releases. People here who are preparing the CMIP5 runs (they discovered the problem) tell me that they don't yet have hdf5 files they want to keep. But that's probably an exception. So it might be good to have a tool to fix the old files.

  Yes, we're working on how to create a tool to reverse the problem (it's easy to fix, if you know that the problem exists) and also adding another layer of tests to prevent this sort of problem in the future.

  Thanks for your work in tracking it down,
    Quincey


On Feb 18, 2010, at 12:18 PM, Mathis Rosenhauer wrote:
