Strict format checks and compound/array types

Hi,

I'm having trouble with one of the files in my test suite. It was
originally created under HDF5 1.6 and consists of a single dataset
with a compound type containing (among other things) an array type.
When this file is opened with HDF5 1.8.3 compiled with strict format
checks, an error ("Incorrect array datatype version") occurs and the
open fails. However, opening it with 1.6.9 in strict format mode
succeeds, as does opening it with either version without strict
format checking. I don't believe the file is corrupted; it originated
with the PyTables project and is distributed with both PyTables and
h5py. Any advice would be appreciated.
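For reference, here's roughly how a compound type containing an array
member is built with the C API. This is a minimal, hypothetical sketch
using the 1.8 API, not how PyTables actually created the file; the
member names and array shape mirror the h5ls output below, but the
offsets and overall size are simplified:

#include "hdf5.h"

int main(void)
{
    /* [5,10] array of 16-bit big-endian integers, as in "d_name" */
    hsize_t dims[2] = {5, 10};
    hid_t arr = H5Tarray_create2(H5T_STD_I16BE, 2, dims);

    /* Compound holding a 32-bit integer followed by the array member;
       5*10 elements at 2 bytes each is 100 bytes for the array */
    hid_t cmp = H5Tcreate(H5T_COMPOUND, 4 + 100);
    H5Tinsert(cmp, "a_name", 0, H5T_STD_I32BE);
    H5Tinsert(cmp, "d_name", 4, arr);

    H5Tclose(arr);
    H5Tclose(cmp);
    return 0;
}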

Here's a synopsis of the output of h5ls -vlr [file]:

With 1.6.9 (strict format check):
/CompoundChunked Dataset {6/6}
    Location: 0:1:0:800
    Links: 1
    Modified: 2009-09-30 11:41:27 PDT
    Storage: 1170 logical bytes, 1170 allocated bytes, 100.00% utilization
    Type: struct {
                   "a_name" +0 32-bit big-endian integer
                   "c_name" +4 6-byte null-padded ASCII string
                   "d_name" +10 [5,10] 16-bit big-endian integer
                   "e_name" +110 IEEE 32-bit big-endian float
                   "f_name" +114 [10] IEEE 64-bit big-endian float
                   "g_name" +194 native unsigned char
               } 195 bytes

With 1.8.3 (strict format check):
/ Group
    Location: 1:928
    Links: 1
/CompoundChunked Dataset *ERROR*

The (reformatted) HDF5 error stack is:
    0: "Incorrect array datatype version" at H5O_dtype_decode_helper
        Datatype :: Wrong version number
    1: "Unable to decode member type" at H5O_dtype_decode_helper
        Datatype :: Unable to decode value
    2: "Can't decode type" at H5O_dtype_decode
        Datatype :: Unable to decode value
    3: "Unable to decode native message" at H5O_dtype_shared_decode
        Object header :: Unable to decode value
    4: "Unable to decode message" at H5O_msg_read_oh
        Object header :: Unable to decode value
    5: "Unable to load object header" at H5O_msg_read
        Object header :: Read failed
    6: "Unable to load type info from dataset header" at H5D_open_oid
        Dataset :: Unable to initialize object
    7: "Not found" at H5D_open
        Dataset :: Object not found
    8: "Can't open dataset" at H5Dopen1
        Dataset :: Unable to initialize object

Andrew Collette

smpl_compound_chunked.h5 (7.59 KB)

Andrew,

This was caused by a bug in HDF5 before version 1.6.8 that wrote an incorrect version number for array types nested inside a compound type. Opening the file in read-write mode with 1.8.2 or later (with strict format checks disabled) and then opening the problematic dataset should fix the problem permanently for that dataset. There should be no backward-compatibility issues with the resulting file, other than that the library must be new enough to understand array types (1.4.0, I believe).
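
In other words, something like the following should do it (an untested
sketch; the filename and dataset path are taken from your h5ls output):

#include <stdio.h>
#include "hdf5.h"

int main(void)
{
    /* Open read-write with HDF5 >= 1.8.2 built without strict
       format checks */
    hid_t file = H5Fopen("smpl_compound_chunked.h5", H5F_ACC_RDWR,
                         H5P_DEFAULT);
    if (file < 0) { fprintf(stderr, "H5Fopen failed\n"); return 1; }

    /* Opening the affected dataset should be enough for the library
       to rewrite the datatype message with the correct version */
    hid_t dset = H5Dopen2(file, "/CompoundChunked", H5P_DEFAULT);
    if (dset >= 0) H5Dclose(dset);

    H5Fclose(file);  /* the repaired header is flushed on close */
    return 0;
}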

-Neil
