assertion in H5Tconv.c failing in 1.8.4

Hi All,

I am running into the following assertion failure in version 1.8.4.
Anyone else running into this?

I assume that since it's an assertion that's failing, something is wonky in
HDF5 proper and NOT my client code?

TestReadMask: H5Tconv.c:4451: H5T_conv_s_s: Assertion `(dp<sp && dp+dst->shared->size<=sp) || (sp<dp && sp+src->shared->size<=dp)' failed.
Abort (core dumped)

It is this code that is failing...

   4442 #ifndef NDEBUG
   4443 /* I don't quite trust the overlap calculations yet --rpm */
   4444 if (src->shared->size==dst->shared->size || buf_stride) {
   4445 assert(s==d);
   4446 } else if (d==dbuf) {
   4447 assert((dp>=sp && dp<sp+src->shared->size) ||
   4448 (sp>=dp && sp<dp+dst->shared->size));
   4449 } else {
   4450 assert((dp<sp && dp+dst->shared->size<=sp) ||
   4451 (sp<dp && sp+src->shared->size<=dp));
   4452 }
   4453 #endif
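
For anyone reading along, the last two checks in the excerpt above can be restated as standalone predicates. This is only a sketch: the function names are made up for illustration and are not HDF5 symbols, and it assumes both pointers refer into the same conversion buffer, which is what makes the pointer comparisons well defined in C.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the final assert: true when the destination element
 * [dp, dp+dst_size) and the source element [sp, sp+src_size)
 * share no bytes. Hypothetical name, not part of HDF5. */
static int elements_disjoint(const unsigned char *dp, size_t dst_size,
                             const unsigned char *sp, size_t src_size)
{
    return (dp < sp && dp + dst_size <= sp) ||
           (sp < dp && sp + src_size <= dp);
}

/* Mirrors the middle assert: true when one element starts inside
 * the other. Hypothetical name, not part of HDF5. */
static int elements_overlap(const unsigned char *dp, size_t dst_size,
                            const unsigned char *sp, size_t src_size)
{
    return (dp >= sp && dp < sp + src_size) ||
           (sp >= dp && sp < dp + dst_size);
}
```

The assertion that fired is the "disjoint" one, so at that point the library expected the source and destination elements not to share any bytes, and the pointers it computed said otherwise.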

--
Mark C. Miller, Lawrence Livermore National Laboratory
email: mailto:miller86@llnl.gov
(M/T/W) (925)-423-5901 (!!LLNL BUSINESS ONLY!!)
(Th/F) (530)-753-8511 (!!LLNL BUSINESS ONLY!!)

A little follow-up of my own here...

I am compiling an --enable-debug=all configuration. But, if I simply turn off
this block of code by replacing #ifndef NDEBUG with #if 0, then the
client that was hitting this assertion runs to completion without error.
Also, running under valgrind 3.3.1 reports no memory errors.

Mark

On Wed, 2009-12-02 at 12:48, Mark Miller wrote:
> I am running into the following assertion failure in version 1.8.4.
> Anyone else running into this?

Mark,

Could you run a debugger and send us the error stack? Thanks.

Ray

Alas, I am unable to duplicate the bug now. I re-compiled and
re-installed HDF5 with --enable-debug=all and then re-compiled and
re-linked my client code. I will try again maybe next week but am
currently knee deep in other issues.

Mark

On Thu, 2009-12-03 at 07:47, Raymond Lu wrote:
> Could you run a debugger and send us the error stack? Thanks.

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

I am hitting another assertion failure in 1.8.4...

compression: H5Dint.c:1353: H5D_close: Assertion `dataset->shared->fo_count >0' failed.
Abort (core dumped)

Here is a stack...
#0 0x0027dcef in raise () from /lib/tls/libc.so.6
#1 0x0027f4f5 in abort () from /lib/tls/libc.so.6
#2 0x00277619 in __assert_fail () from /lib/tls/libc.so.6
#3 0x00fab0db in H5D_close (dataset=0x93e7ed8) at H5Dint.c:1409
#4 0x01045e36 in H5I_dec_ref (id=83886080, app_ref=0) at H5I.c:1384
#5 0x00fc47c7 in H5F_try_close (f=0x93bec48) at H5F.c:1862
#6 0x00fc4533 in H5F_close (f=0x93bec48) at H5F.c:1753
#7 0x01045e36 in H5I_dec_ref (id=16777216, app_ref=1) at H5I.c:1384
#8 0x00fc4b43 in H5Fclose (file_id=16777216) at H5F.c:1953
#9 0x080c2b16 in db_hdf5_Close (_dbfile=0x93c0988) at silo_hdf5.c:4763
#10 0x08056b27 in DBClose (dbfile=0x93c0988) at silo.c:3988
#11 0x0804b930 in main (argc=2, argv=0xbfff91d4) at compression.c:219

This is happening following a filter failure. I am testing the behavior of
Silo's compression features when the filters fail. My compression code
is aiming to compress to a given ratio. If that ratio cannot be
achieved, it should fall back to uncompressed writes. In the case that is
failing, above, I set an absurd ratio of 1000:1, expecting that it cannot be
achieved. And we hit the assertion when we close the file. Also, my
client does report that it wrote '0 bytes/second', so it may be related to some
kind of funky empty dataset.

Still, I don't think I should be hitting assertion failures in HDF5,
right?
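
The fall-back logic described above might be sketched roughly as follows. The helper names are hypothetical (this is not Silo's or HDF5's code); the only library convention assumed is that an HDF5 filter callback reports failure by returning 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the compressed size if the target ratio
 * was reached, or 0 to signal failure (the same convention an HDF5
 * filter callback uses for its return value). */
static size_t accept_if_ratio_met(size_t raw_bytes, size_t compressed_bytes,
                                  double target_ratio)
{
    if (raw_bytes == 0 || compressed_bytes == 0 || target_ratio <= 0.0)
        return 0;
    return ((double)raw_bytes / (double)compressed_bytes >= target_ratio)
               ? compressed_bytes
               : 0;
}

/* Hypothetical caller: fall back to the uncompressed size on failure. */
static size_t bytes_to_store(size_t raw_bytes, size_t compressed_bytes,
                             double target_ratio)
{
    size_t accepted = accept_if_ratio_met(raw_bytes, compressed_bytes,
                                          target_ratio);
    return accepted != 0 ? accepted : raw_bytes;
}
```

With a 1000:1 target, a 4096-byte chunk that compresses to 1024 bytes is rejected, so bytes_to_store() returns the raw 4096 bytes, which is the path being exercised when the assertion fires.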

--
Mark C. Miller, Lawrence Livermore National Laboratory
email: mailto:miller86@llnl.gov
(M/T/W) (925)-423-5901 (!!LLNL BUSINESS ONLY!!)
(Th/F) (530)-753-8511 (!!LLNL BUSINESS ONLY!!)

Hi Mark,

On Dec 2, 2009, at 3:01 PM, Mark Miller wrote:
> I am hitting another assertion failure in 1.8.4...
>
> compression: H5Dint.c:1353: H5D_close: Assertion `dataset->shared->fo_count >0' failed.

  Hmm, I just checked and the "--enable-debug=all" setting isn't being included in our 50+ configurations for our daily testing. I'll work on correcting that and getting one of the daily test configurations to verify it. Until then, try with just "--enable-debug", which is tested in our daily regression tests.

  Quincey

This is a known problem and should be in the RELEASE.txt file (was last seen in 1.8.1 RELEASE.txt file). I apologize for the mistake.
My recollection is that fixing it will require substantial effort, and we had to lower the priority of the task because of other urgent issues.

Elena

On Dec 2, 2009, at 3:45 PM, Quincey Koziol wrote:
> Until then, try with just "--enable-debug", which is tested in our daily regression tests.

I think this may be covered by a patch I submitted recently...

2009/12/3 Elena Pourmal <epourmal@hdfgroup.org>
> This is a known problem and should be in the RELEASE.txt file (was last
> seen in 1.8.1 RELEASE.txt file). I apologize for the mistake.