HDF5 1.8.12 release candidate (pre1) is available for testing

Good evening,

Several of us including Quincey and Elena looked at the issue.

Hmmm… my role was to make sure that developers look at the code :slight_smile: But I take full responsibility for this answer to the forum since our decision and how we arrived at it is not explained at all!

Sean,

Thank you for looking into the issue and for holding our feet to the fire! I hope we will be able to resolve the problem very soon.

Elena

···

On Oct 17, 2013, at 4:01 PM, Sean McBride <sean@rogue-research.com> wrote:

On Thu, 17 Oct 2013 14:47:59 -0500, Raymond Lu said:

We
decided that since the algorithm is trying to detect the alignment of
integers, ideally the flag -fcatch-undefined-behavior and other
optimization flags shouldn't to be used for H5detect.c. In the future,
we can separate flags for H5detect.c from the rest of the library.

(For those who don't know what Issue 8147 is: CLANG compiler on mac
machines with the options -fcatch-undefined-behavior and -ftrapv catches
some undefined behavior in the alignment algorithm of the macro DETECT_I
in H5detect.c.)

Thanks for your followup!

Please let us know your thoughts.

I must respectfully disagree. :slight_smile: I'll try to convince you... :slight_smile:

Perhaps you misunderstand the purpose and meaning of -fcatch-undefined-behavior: it is not an optimization flag, it is a bug finding tool. It has found a bug in HDF5. Ignoring it is not helpful. :slight_smile:

For a good read on undefined behaviour (really worth your time!) see:

"What Every C Programmer Should Know About Undefined Behavior"
<http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html>

C compilers are getting very smart these days, and performing optimizations based on the (valid!) assumption that undefined behaviour *must not* occur. For a great example see "A Fun Case Analysis" here:
<http://blog.regehr.org/archives/213>

HDF's own HISTORY-1.8.txt even discusses problems at higher optimization levels:

----------
* For gcc v4.3 and v4.4, with production mode, if -O3 is used, H5Tinit.c
would fail to compile. Actually bad H5Tinit.c is produced. If -O (same
as -O1) is used, H5Tinit.c compiled okay but test/dt_arith would fail.
When -O0 (no optimizatio) is used, H5Tinit.c compilete okay and all
tests passed. Therefore, -O0 is imposed for v4.3 and v4.4 of gcc.
AKC - 2009/04/20
----------

And lo and behold, dt_arith is one of the tests that fails when -fcatch-undefined-behavior is enabled:
<http://cdash.hdfgroup.uiuc.edu/testDetails.php?test=299376&build=9516>

I'd wager that's exactly what I'm describing: gcc's optimizer is in fact doing nothing wrong, rather, HDF5's code invokes undefined behaviour in several places, and so the compiler can generate whatever garbage it wants to.

-fcatch-undefined-behavior is a tool to catch (some of) these problems in debug mode, before the optimizer screws you.

Consider this obvious example:

----------
int main (void)
{
float big = 1e20;
unsigned char small = big;

printf ("small is %u \n", small);

return 0;
}
----------

unsigned char (on most platforms at least) has too little range to hold that big number. What do you expect this code to do? It's undefined behaviour, so the compiler can do whatever it wants! Let's see:

$ clang -O0 test.c
$ ./a.out
small is 0

$ clang -O3 test.c
$ ./a.out
small is 255

Now, let's use modern tools:

$ clang -fsanitize=undefined-trap test.c
$ ./a.out
test.c:8:13: runtime error: value 1e+20 is outside the range of representable values of type 'unsigned char'
small is 0

Bug identified! Line number included!

IIRC, the DETECT_I macro is violating alignment rules, doing something like this:

----------
int main (void)
{
char big[] = {0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66};
int* foo = (int*) &(big[1]);
int bar = *foo;

printf ("bar is %x \n", bar);

return 0;
}
----------

$ clang -fsanitize=undefined-trap test.c
$ ./a.out
test.c:9:13: runtime error: load of misaligned address 0x7fff50ab69a6 for type 'int', which requires 4 byte alignment
0x7fff50ab69a6: note: pointer points here
ab 50 ff 00 11 22 33 44 55 66 00 00 00 00 c8 69 ab 50 ff 7f 00 00 e1 47 39 89 ff 7f 00 00 e1 47
            ^
bar is 44332211

You can't do that, see:

<https://www.securecoding.cert.org/confluence/display/seccode/EXP36-C.+Do+not+convert+pointers+into+more+strictly+aligned+pointer+types>

In other words, DETECT_I will *detect the wrong thing* depending how the compiler decides to react to the invalid code.

Phew! I hope I've convinced you now... :slight_smile:

Cheers,

--
____________________________________________________________
Sean McBride, B. Eng sean@rogue-research.com
Rogue Research www.rogue-research.com
Mac Software Developer Montréal, Québec, Canada

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org

I second Sean's opinion in this matter. We should truly address the issue and not paper over it by ignoring flags designed to suss it out.

Dana

···

-----Original Message-----
From: Hdf-forum [mailto:hdf-forum-bounces@lists.hdfgroup.org] On Behalf
Of Sean McBride
Sent: Thursday, October 17, 2013 5:02 PM
To: Raymond Lu; HDF Users Discussion List
Subject: Re: [Hdf-forum] HDF5 1.8.12 release candidate (pre1) is available for
testing

On Thu, 17 Oct 2013 14:47:59 -0500, Raymond Lu said:

>Several of us including Quincey and Elena looked at the issue. We
>decided that since the algorithm is trying to detect the alignment of
>integers, ideally the flag -fcatch-undefined-behavior and other
>optimization flags shouldn't to be used for H5detect.c. In the future,
>we can separate flags for H5detect.c from the rest of the library.
>
>(For those who don't know what Issue 8147 is: CLANG compiler on mac
>machines with the options -fcatch-undefined-behavior and -ftrapv
>catches some undefined behavior in the alignment algorithm of the macro
>DETECT_I in H5detect.c.)

Thanks for your followup!

>Please let us know your thoughts.

I must respectfully disagree. :slight_smile: I'll try to convince you... :slight_smile:

Perhaps you misunderstand the purpose and meaning of -fcatch-undefined-
behavior: it is not an optimization flag, it is a bug finding tool. It has found a
bug in HDF5. Ignoring it is not helpful. :slight_smile:

For a good read on undefined behaviour (really worth your time!) see:

"What Every C Programmer Should Know About Undefined Behavior"
<http://blog.llvm.org/2011/05/what-every-c-programmer-should-
know.html>

C compilers are getting very smart these days, and performing optimizations
based on the (valid!) assumption that undefined behaviour *must not*
occur. For a great example see "A Fun Case Analysis" here:
<http://blog.regehr.org/archives/213>

HDF's own HISTORY-1.8.txt even discusses problems at higher optimization
levels:

----------
* For gcc v4.3 and v4.4, with production mode, if -O3 is used, H5Tinit.c
  would fail to compile. Actually bad H5Tinit.c is produced. If -O (same
  as -O1) is used, H5Tinit.c compiled okay but test/dt_arith would fail.
  When -O0 (no optimizatio) is used, H5Tinit.c compilete okay and all
  tests passed. Therefore, -O0 is imposed for v4.3 and v4.4 of gcc.
  AKC - 2009/04/20
----------

And lo and behold, dt_arith is one of the tests that fails when -fcatch-
undefined-behavior is enabled:
<http://cdash.hdfgroup.uiuc.edu/testDetails.php?test=299376&build=9516>

I'd wager that's exactly what I'm describing: gcc's optimizer is in fact doing
nothing wrong, rather, HDF5's code invokes undefined behaviour in several
places, and so the compiler can generate whatever garbage it wants to.

-fcatch-undefined-behavior is a tool to catch (some of) these problems in
debug mode, before the optimizer screws you.

Consider this obvious example:

----------
int main (void)
{
  float big = 1e20;
  unsigned char small = big;

  printf ("small is %u \n", small);

  return 0;
}
----------

unsigned char (on most platforms at least) has too little range to hold that big
number. What do you expect this code to do? It's undefined behaviour, so
the compiler can do whatever it wants! Let's see:

$ clang -O0 test.c
$ ./a.out
small is 0

$ clang -O3 test.c
$ ./a.out
small is 255

Now, let's use modern tools:

$ clang -fsanitize=undefined-trap test.c $ ./a.out
test.c:8:13: runtime error: value 1e+20 is outside the range of representable
values of type 'unsigned char'
small is 0

Bug identified! Line number included!

IIRC, the DETECT_I macro is violating alignment rules, doing something like
this:

----------
int main (void)
{
  char big[] = {0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66};
  int* foo = (int*) &(big[1]);
  int bar = *foo;

  printf ("bar is %x \n", bar);

  return 0;
}
----------

$ clang -fsanitize=undefined-trap test.c $ ./a.out
test.c:9:13: runtime error: load of misaligned address 0x7fff50ab69a6 for type
'int', which requires 4 byte alignment
0x7fff50ab69a6: note: pointer points here ab 50 ff 00 11 22 33 44 55 66 00 00
00 00 c8 69 ab 50 ff 7f 00 00 e1 47 39 89 ff 7f 00 00 e1 47
             ^
bar is 44332211

You can't do that, see:

<https://www.securecoding.cert.org/confluence/display/seccode/EXP36-
C.+Do+not+convert+pointers+into+more+strictly+aligned+pointer+types>

In other words, DETECT_I will *detect the wrong thing* depending how the
compiler decides to react to the invalid code.

Phew! I hope I've convinced you now... :slight_smile:

Cheers,

--
__________________________________________________________
__
Sean McBride, B. Eng sean@rogue-research.com
Rogue Research www.rogue-research.com
Mac Software Developer Montréal, Québec, Canada

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org

For entertainment see the config/gnu-flags file that is used by configure with the gcc compilers :slight_smile:

···

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Elena Pourmal The HDF Group http://hdfgroup.org
1800 So. Oak St., Suite 203, Champaign IL 61820
217.531.6112
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

On Oct 15, 2013, at 11:16 AM, Michael Jackson mike.jackson@bluequartz.net wrote:

Sean,
  Where is that setting done? It may be a left over from when I helped port HDF5 to CMake. Can you just paste the few lines surrounding the cmake code?
Thx
--
Mike Jackson <www.bluequartz.net>

On Oct 15, 2013, at 12:13 PM, "Sean McBride" <sean@rogue-research.com> wrote:

On Fri, 11 Oct 2013 14:12:46 -0500, Elena Pourmal said:

Yes, it will be helpful if there are less warnings on CDash.

I tried to disable more warnings, but it didn't seem to stick. On a hunch, I searched the HDF5 codebase for "Wall" and found that HDF5 itself is enabling -Wall and other warnings.

As I hack, I removed them, and now have something with fewer warning:
<http://cdash.hdfgroup.uiuc.edu/viewBuildError.php?type=1&buildid=9503>

It's the same warnings repeated (a missing #include), should be easy to fix.

Maybe you should remove the hardcoded enabling of warnings in HDF, until things get cleaned up more?

Cheers,

--
____________________________________________________________
Sean McBride, B. Eng sean@rogue-research.com
Rogue Research www.rogue-research.com
Mac Software Developer Montréal, Québec, Canada

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org

Hi there,

I second Sean's opinion too. Actually it is the same point I was trying to
make in http://article.gmane.org/gmane.comp.programming.hdf/4805 and
relative thread. No doubt Sean did a much better job explaining!

Andrea

···

On 18 October 2013 08:44, Dana Robinson <derobins@hdfgroup.org> wrote:

I second Sean's opinion in this matter. We should truly address the issue
and not paper over it by ignoring flags designed to suss it out.

Dana

> -----Original Message-----
> From: Hdf-forum [mailto:hdf-forum-bounces@lists.hdfgroup.org] On Behalf
> Of Sean McBride
> Sent: Thursday, October 17, 2013 5:02 PM
> To: Raymond Lu; HDF Users Discussion List
> Subject: Re: [Hdf-forum] HDF5 1.8.12 release candidate (pre1) is
available for
> testing
>
> On Thu, 17 Oct 2013 14:47:59 -0500, Raymond Lu said:
>
> >Several of us including Quincey and Elena looked at the issue. We
> >decided that since the algorithm is trying to detect the alignment of
> >integers, ideally the flag -fcatch-undefined-behavior and other
> >optimization flags shouldn't to be used for H5detect.c. In the future,
> >we can separate flags for H5detect.c from the rest of the library.
> >
> >(For those who don't know what Issue 8147 is: CLANG compiler on mac
> >machines with the options -fcatch-undefined-behavior and -ftrapv
> >catches some undefined behavior in the alignment algorithm of the macro
> >DETECT_I in H5detect.c.)
>
> Thanks for your followup!
>
> >Please let us know your thoughts.
>
> I must respectfully disagree. :slight_smile: I'll try to convince you... :slight_smile:
>
> Perhaps you misunderstand the purpose and meaning of -fcatch-undefined-
> behavior: it is not an optimization flag, it is a bug finding tool. It
has found a
> bug in HDF5. Ignoring it is not helpful. :slight_smile:
>
> For a good read on undefined behaviour (really worth your time!) see:
>
> "What Every C Programmer Should Know About Undefined Behavior"
> <http://blog.llvm.org/2011/05/what-every-c-programmer-should-
> know.html>
>
> C compilers are getting very smart these days, and performing
optimizations
> based on the (valid!) assumption that undefined behaviour *must not*
> occur. For a great example see "A Fun Case Analysis" here:
> <http://blog.regehr.org/archives/213>
>
> HDF's own HISTORY-1.8.txt even discusses problems at higher optimization
> levels:
>
> ----------
> * For gcc v4.3 and v4.4, with production mode, if -O3 is used, H5Tinit.c
> would fail to compile. Actually bad H5Tinit.c is produced. If -O (same
> as -O1) is used, H5Tinit.c compiled okay but test/dt_arith would fail.
> When -O0 (no optimizatio) is used, H5Tinit.c compilete okay and all
> tests passed. Therefore, -O0 is imposed for v4.3 and v4.4 of gcc.
> AKC - 2009/04/20
> ----------
>
> And lo and behold, dt_arith is one of the tests that fails when -fcatch-
> undefined-behavior is enabled:
> <http://cdash.hdfgroup.uiuc.edu/testDetails.php?test=299376&build=9516>
>
> I'd wager that's exactly what I'm describing: gcc's optimizer is in fact
doing
> nothing wrong, rather, HDF5's code invokes undefined behaviour in several
> places, and so the compiler can generate whatever garbage it wants to.
>
> -fcatch-undefined-behavior is a tool to catch (some of) these problems in
> debug mode, before the optimizer screws you.
>
> Consider this obvious example:
>
> ----------
> int main (void)
> {
> float big = 1e20;
> unsigned char small = big;
>
> printf ("small is %u \n", small);
>
> return 0;
> }
> ----------
>
> unsigned char (on most platforms at least) has too little range to hold
that big
> number. What do you expect this code to do? It's undefined behaviour,
so
> the compiler can do whatever it wants! Let's see:
>
> $ clang -O0 test.c
> $ ./a.out
> small is 0
>
> $ clang -O3 test.c
> $ ./a.out
> small is 255
>
> Now, let's use modern tools:
>
> $ clang -fsanitize=undefined-trap test.c $ ./a.out
> test.c:8:13: runtime error: value 1e+20 is outside the range of
representable
> values of type 'unsigned char'
> small is 0
>
> Bug identified! Line number included!
>
> IIRC, the DETECT_I macro is violating alignment rules, doing something
like
> this:
>
> ----------
> int main (void)
> {
> char big[] = {0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66};
> int* foo = (int*) &(big[1]);
> int bar = *foo;
>
> printf ("bar is %x \n", bar);
>
> return 0;
> }
> ----------
>
> $ clang -fsanitize=undefined-trap test.c $ ./a.out
> test.c:9:13: runtime error: load of misaligned address 0x7fff50ab69a6
for type
> 'int', which requires 4 byte alignment
> 0x7fff50ab69a6: note: pointer points here ab 50 ff 00 11 22 33 44 55
66 00 00
> 00 00 c8 69 ab 50 ff 7f 00 00 e1 47 39 89 ff 7f 00 00 e1 47
> ^
> bar is 44332211
>
> You can't do that, see:
>
> <https://www.securecoding.cert.org/confluence/display/seccode/EXP36-
> C.+Do+not+convert+pointers+into+more+strictly+aligned+pointer+types>
>
> In other words, DETECT_I will *detect the wrong thing* depending how the
> compiler decides to react to the invalid code.
>
> Phew! I hope I've convinced you now... :slight_smile:
>
> Cheers,
>
> --
> __________________________________________________________
> __
> Sean McBride, B. Eng sean@rogue-research.com
> Rogue Research www.rogue-research.com
> Mac Software Developer Montréal, Québec, Canada
>
>
>
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> Hdf-forum@lists.hdfgroup.org
>
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org

http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org

--
Andrea Bedini <andrea.bedini@gmail.com>

You might want to see what MPICH does for "--enable-strict": MPICH
gives a huge list of options, and the configure script sees which ones
the compiler knows about.

It's not a perfect match for what you've done, since I see a few
"gcc-X.Y does not handle this one well", but it might take a bite out
of your 815 line (!) shell sub-script.

==rob

···

On Tue, Oct 15, 2013 at 11:35:10PM -0500, Elena Pourmal wrote:

For entertainment see the config/gnu-flags file that is used by configure with the gcc compilers :slight_smile:

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

Rob,

Thank you for the hint! We will take a look.

Elena

···

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Elena Pourmal The HDF Group http://hdfgroup.org
1800 So. Oak St., Suite 203, Champaign IL 61820
217.531.6112
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

On Oct 16, 2013, at 9:12 AM, Rob Latham robl@mcs.anl.gov wrote:

On Tue, Oct 15, 2013 at 11:35:10PM -0500, Elena Pourmal wrote:

For entertainment see the config/gnu-flags file that is used by configure with the gcc compilers :slight_smile:

You might want to see what MPICH does for "--enable-strict": MPICH
gives a huge list of options, and the configure script sees which ones
the compiler knows about.

It's not a perfect match for what you've done, since I see a few
"gcc-X.Y does not handle this one well", but it might take a bite out
of your 815 line (!) shell sub-script.

==rob

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org