'addr overflow' in H5FD_sec2_read(); what are my options?

I've been using HDF5 for a while with reasonable success, but lately with full production runs I've been getting an 'addr overflow' error, which seems to be linked to the amount of allocated memory. Since these are files that I just created, and there were no problems reported during creation, I'm not sure what I need to do. The problems seem to occur with datasets created later in the file. The files are reasonably large (>30 GB), but not outrageous by HDF5 standards, and can reasonably be expected to grow to 0.5 TB in the future, so I need to find a solution. Any insights you can give me into this problem and its resolution will be appreciated.

I'm using a precompiled Linux build of HDF5 1.8.2, via JHDF5. The OS on this machine recently changed from Fedora to CentOS, if that matters. A representative stack trace reported by the HDF5 native code follows:

Exception while opening file /hdd2/arads/repo7/AtticaFDA_CI_COA1_262307.h5
HDF5-DIAG: Error detected in HDF5 (1.8.2) thread 0:
  #000: H5O.c line 242 in H5Oopen(): unable to open object
    major: Symbol table
    minor: Can't open object
  #001: H5O.c line 1290 in H5O_open_name(): unable to open object
    major: Symbol table
    minor: Can't open object
  #002: H5O.c line 1326 in H5O_open_by_loc(): unable to determine object class
    major: Object header
    minor: Can't get value
  #003: H5O.c line 1990 in H5O_obj_class(): unable to load object header
    major: Object header
    minor: Unable to load metadata into cache
  #004: H5AC.c line 1978 in H5AC_protect(): H5C_protect() failed.
    major: Object cache
    minor: Unable to protect metadata
  #005: H5C.c line 5942 in H5C_protect(): can't load entry
    major: Object cache
    minor: Unable to load metadata into cache
  #006: H5C.c line 10626 in H5C_load_entry(): unable to load entry
    major: Object cache
    minor: Unable to load metadata into cache
  #007: H5Ocache.c line 340 in H5O_load(): unable to read object header data
    major: Object header
    minor: Read failed
  #008: H5Fio.c line 109 in H5F_block_read(): read from metadata accumulator failed
    major: Low-level I/O
    minor: Read failed
  #009: H5Faccum.c line 227 in H5F_accum_read(): driver read request failed
    major: Low-level I/O
    minor: Read failed
  #010: H5FDint.c line 142 in H5FD_read(): driver read request failed
    major: Virtual File Layer
    minor: Read failed
  #011: H5FDsec2.c line 719 in H5FD_sec2_read(): addr overflow
    major: Invalid arguments to routine
    minor: Address overflowed
ncsa.hdf.hdf5lib.exceptions.HDF5FunctionArgumentException: Invalid arguments to routine:Address overflowed
        at ncsa.hdf.hdf5lib.H5.H5Oopen(Native Method)
        at ch.systemsx.cisd.hdf5.HDF5.openObject(HDF5.java:292)
        at ch.systemsx.cisd.hdf5.HDF5Reader$2.call(HDF5Reader.java:284)
        at ch.systemsx.cisd.hdf5.HDF5Reader$2.call(HDF5Reader.java:281)
        at ch.systemsx.cisd.hdf5.cleanup.CleanUpCallable.call(CleanUpCallable.java:40)
        at ch.systemsx.cisd.hdf5.HDF5Reader.getAllAttributeNames(HDF5Reader.java:290)
        at dynadrain.data.ExtractionProcess.getSchema(ExtractionProcess.java:234)
        at dynadrain.data.ExtractionProcess.relayTabularDataset(ExtractionProcess.java:374)
        at dynadrain.data.ExtractionProcess.process(ExtractionProcess.java:175)
        at dynadrain.data.ExtractionProcess.<init>(ExtractionProcess.java:137)
        at dynadrain.data.ExtractionProcess.main(ExtractionProcess.java:126)


--
Carl Burke
cburke@mitre.org

Hi Carl,

I've been using HDF5 for a while with reasonable success, but lately with full production runs I've been getting an 'addr overflow' error, which seems to be linked to the amount of allocated memory. Since these are files that I just created, and there were no problems reported during creation, I'm not sure what I need to do. The problems seem to occur with datasets created later in the file. The files are reasonably large (>30 GB), but not outrageous by HDF5 standards, and can reasonably be expected to grow to 0.5 TB in the future, so I need to find a solution. Any insights you can give me into this problem and its resolution will be appreciated.

  Hmm, what's the value of the address that you are seeing for objects at your level? Can you insert a printf() in H5Oopen() and H5FD_sec2_read() to check whether that value is making it down into the C library correctly? I don't think it's likely to be the cause, but upgrading to the 1.8.3 release may also be worth trying...
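
  For what it's worth, the 'addr overflow' message comes from a sanity check in the sec2 driver that, roughly speaking, refuses a read whose address plus size runs past the file's end-of-allocation (EOA). A sketch of the kind of check and diagnostic involved (illustrative only -- not the actual 1.8.2 source, and the type and function names here are just stand-ins):

#include <stdio.h>
#include <stdint.h>

typedef uint64_t sketch_haddr_t;    /* stand-in for HDF5's internal haddr_t */

/* Illustrative only -- not the actual H5FDsec2.c code.  The real driver
 * reports "addr overflow" when the requested read range is not sane with
 * respect to the file's end-of-allocation. */
int check_read_range(sketch_haddr_t addr, size_t size, sketch_haddr_t eoa)
{
    /* A printf() at this level shows what actually reached the driver. */
    fprintf(stderr, "sec2 read: addr=%llu size=%lu eoa=%llu\n",
            (unsigned long long)addr, (unsigned long)size,
            (unsigned long long)eoa);

    if (addr + size > eoa)      /* the condition reported as "addr overflow" */
        return -1;              /* read refused */
    return 0;                   /* read would proceed */
}

int main(void)
{
    /* Example: ask for 4 KB starting at 32 GiB in a file whose EOA is 30 GiB. */
    sketch_haddr_t eoa = 30ULL * 1024 * 1024 * 1024;
    return check_read_range((sketch_haddr_t)1 << 35, 4096, eoa) ? 1 : 0;
}

  If the addresses you see at the Java level look reasonable, that points at the file contents (or the state of the file when it was opened) rather than at the bindings.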

  Quincey

On Oct 5, 2009, at 2:14 PM, Burke, Carl D. wrote:

[original message and stack trace quoted in full; snipped here for length]

Hi Quincey,

Unfortunately I'm using a pre-compiled library, so I don't have direct access to the native code in order to step through it. However, it might be an error on my end -- it looks like there are cases where I have one process trying to read the file before another process has flushed it. I'm working on a fix to correct that and will evaluate the results. If that fixes it, great!
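
In C-API terms, the ordering I need to enforce is roughly the following (just a sketch using the standard HDF5 calls -- my actual code goes through JHDF5, so the real fix is making the writing process flush or close its writer before the extraction process opens the file):

#include "hdf5.h"

/* Writer side: make sure everything is actually on disk before any other
 * process is told the file is ready to read. */
void writer_finish(hid_t file_id)
{
    H5Fflush(file_id, H5F_SCOPE_GLOBAL);  /* push buffered metadata and raw data to disk */
    H5Fclose(file_id);                    /* better still, close before signalling the reader */
}

/* Reader side: only open the file once the writer has finished with it. */
void reader_start(const char *path)
{
    hid_t file_id = H5Fopen(path, H5F_ACC_RDONLY, H5P_DEFAULT);
    /* ... read datasets and attributes ... */
    H5Fclose(file_id);
}

If the reader gets in before the writer's metadata has hit the disk, object headers can point at addresses beyond what is actually allocated in the on-disk file, which would be consistent with the 'addr overflow' coming out of the sec2 driver.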

In the longer term, I'll look into upgrading to a locally built copy of 1.8.3 and making sure the native bindings are compiled for JDK 1.6, so that I can get access to those bits of information.

Thanks,
Carl
