Request for Suggestion: Alignment Detection Algorithm

This is Ray from the HDF Group. I'd like to ask for your suggestions on improving our alignment detection algorithm.

Alignment restriction can be described in this manner: “Some computers allow data objects to reside in storage at any address regardless of the data’s type. Others impose alignment restrictions on certain data types, requiring that objects of those types occupy only certain addresses. It is not unusual for a byte-addressed computer, for example, to require that 32-bit integers be located on addresses that are a multiple of four. In this case, we say that the ‘alignment modulus’ of those integers is four.” (Harbison and Steele, C: A Reference Manual, 6.1.3 Alignment Restrictions)

The HDF5 library uses a datatype's alignment in its data conversion operations. Imagine a user has a memory data buffer of H5T_NATIVE_UCHAR. On most machines, the data alignment in memory is 1 for char, meaning no alignment restriction. If the user tries to convert the data in place to H5T_NATIVE_INT using H5Tconvert, our library has to know whether the compiler requires us to align data of type "int" in the memory buffer.

To show it in pseudo code: a memory buffer contains 2 elements of unsigned char. The values are 1 and 2:
  unsigned char buf[8] = {1, 2, 0, 0, 0, 0, 0, 0};
      :
  H5Tconvert(H5T_NATIVE_UCHAR, H5T_NATIVE_INT, 2, buf, NULL, H5P_DEFAULT);

After the conversion to int, the bytes of the two data elements become {0, 0, 0, 1, 0, 0, 0, 2} on a big-endian machine. When a compiler doesn't require alignment, the library casts the data directly and puts it in the buffer. But when a compiler requires alignment, the library has to memcpy the data to an aligned location, cast it there, and then memcpy it back to the user's original buffer. These extra steps can be expensive because every data element has to be treated this way.
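The memcpy-based path described above can be sketched as follows. This is only an illustration of the idea, not the HDF5 implementation: the helper name `uchar_to_int_inplace` is hypothetical, and it handles just the unsigned char to int case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the HDF5 API: widen `nelmts` unsigned
 * char values stored at the start of `buf` into native ints, in place.
 * `buf` must hold at least nelmts * sizeof(int) bytes. */
void uchar_to_int_inplace(unsigned char *buf, size_t nelmts)
{
    /* Walk backwards so that writing element i never overwrites a source
     * byte that has not been read yet. */
    for (size_t i = nelmts; i-- > 0; ) {
        int value = buf[i];  /* read the source byte and widen it */

        /* memcpy is valid for any destination alignment, so no
         * (int *) cast of the user's buffer is needed. */
        memcpy(buf + i * sizeof(int), &value, sizeof value);
    }
}
```

Calling `uchar_to_int_inplace(buf, 2)` on a buffer that starts with the bytes 1 and 2 leaves two native ints with the values 1 and 2 in the buffer, whatever the alignment of `buf`.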

The HDF5 library has its own alignment detection algorithm. I attached an excerpt of this algorithm as a standalone C program. There is a problem in this algorithm: in Line 30, the casting of a pointer to an integer may cause undefined behavior for some compilers. For example, a user pointed out that the program, when built with the Clang compiler (Version 4.2) on Mac OS Darwin 12.5 using the -fcatch-undefined-behavior flag, fails with an "Illegal Instruction" error message. The C manual states: "C does give the programmer the ability to violate the alignment restrictions by casting pointers to different types." (Harbison and Steele, C: A Reference Manual, 6.1.3 Alignment Restrictions) As the HDF5 library is probing the memory alignment for a datatype, it's no surprise that it violates the restriction. However, in order to improve the quality of our software, we want to find out if anybody in the community knows a better algorithm to detect an integer datatype's alignment in memory. It should NOT invoke undefined behavior.

We're NOT interested in the alignment of a datatype in a structure. It refers to the value expressed in COMP_ALIGN in the following pseudo code:

  struct {
    char c;
    TYPE x;
  } s;

  COMP_ALIGN = (char*)(&(s.x)) - (char*)(&s);

On Linux, for "int" type, the value of COMP_ALIGN is 4. The C keyword __alignof__ returns the alignment of the type in a structure, not the alignment of the type in memory. Our library's algorithm of memory alignment finds that the alignment for "int" is 1, meaning no alignment restriction, on Linux.

We'd appreciate your comments and suggestions. Thanks in advance.

Ray

align.c (912 Bytes)

In fact, you invoke undefined behaviour even earlier, at line 25. Nicely explained with this example:

<https://www.securecoding.cert.org/confluence/display/seccode/EXP36-C.+Do+not+convert+pointers+into+more+strictly+aligned+pointer+types>

align2.c (924 Bytes)

On Tue, 19 Nov 2013 17:21:00 -0600, Raymond Lu said:

The HDF5 library has its own alignment detection algorithm. I attached
an excerpt of this algorithm as a standalone C program. There is a
problem in this algorithm: in Line 30, the casting of a pointer to an
integer may cause undefined behavior for some compilers.

---------
#include <assert.h>

void func(void) {
  char c = 'x';
  int *ip = (int *)&c; /* This can lose information */
  char *cp = (char *)ip;

  /* Will fail on some conforming implementations */
  assert(cp == &c);
}
---------

Notice that that assert can fail! Evil, isn't it? But that's C, like it or not! :)

Note that the compiler dutifully warns us:

align.c:25:13: warning: cast from 'unsigned char *' to 'int *' increases required alignment from 1 to 4 [-Wcast-align]
    p_int = (int *)p;
            ^~~~~~~~

We're NOT interested in the alignment of a datatype in a structure. It
refers to the value expressed in COMP_ALIGN in the following pseudo code:

struct {
  char c;
  TYPE x;
} s;

COMP_ALIGN = (char*)(&(s.x)) - (char*)(&s);

I understand that you are *not* interested in struct offset/alignment, but *if* you were, it would be best to use offsetof() instead of the above pseudo code:
<http://en.wikipedia.org/wiki/Offsetof>
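As a minimal sketch of that suggestion, the COMP_ALIGN pseudo code could be written with offsetof() for the "int" case. The struct tag `char_then_int` and the function name are made up for this example.

```c
#include <stddef.h>  /* offsetof, size_t */

/* Illustrative struct: a char followed by the type of interest. */
struct char_then_int {
    char c;
    int  x;
};

/* Offset of x inside the struct, i.e. what COMP_ALIGN computes with
 * pointer arithmetic, but without needing an object or any casts. */
size_t int_offset_after_char(void)
{
    return offsetof(struct char_then_int, x);
}
```

offsetof() is a constant expression, so the value is available at compile time.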

On Linux, for "int" type, the value of COMP_ALIGN is 4. The C keyword
__alignof__ returns the alignment of the type in a structure, not the
alignment of the type in memory.

I believe your description of __alignof__ is incorrect, see:
<http://gcc.gnu.org/onlinedocs/gcc/Alignment.html>

__alignof__ has nothing to do with structs. They even give an example showing how it ignores padding added to structs. In other words, __alignof__ is NOT the same as offsetof().

Our library's algorithm of memory
alignment finds that the alignment for "int" is 1, meaning no alignment
restriction, on Linux.

What does __alignof__(int) give on that same system? 4 I bet.

Your attached align.c also gives "1" when run on my x86_64 Mac. It's only giving us that result because we are lucky.

I have eliminated the undefined behaviour from your example (align2.c attached) by removing the invalid cast and using memcpy() instead. You'll notice that it will now output "alignment=1" all the time, since memcpy() is guaranteed to be able to copy to/from memory of any alignment.
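The attachment itself is not reproduced here; the following is only a sketch of the general technique, with a hypothetical helper name. It stores and reloads an int at an arbitrary byte offset without ever forming a misaligned `int *`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write `value` at buf + offset and read it back,
 * whatever the alignment of that address. The caller must make sure
 * that offset + sizeof(int) bytes are available in buf. */
int roundtrip_int_at(unsigned char *buf, size_t offset, int value)
{
    int out;

    memcpy(buf + offset, &value, sizeof value);  /* store, any alignment */
    memcpy(&out, buf + offset, sizeof out);      /* load, any alignment */
    return out;
}
```

Because this succeeds at every offset, a probe built on it cannot distinguish aligned from unaligned addresses, which is why such a test reports 1.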

The conclusion, I'm afraid to say, is that your test is fundamentally flawed. :( It's not answering the question "what is the minimum alignment of int?" (__alignof__(int) will give you that answer!) It's answering the question: "if I ignore the real minimum alignment, and try less and less, when will I crash?"

The fact that your current alignment detection is returning 1 for 'int' on Linux is a bug; it should return (probably) 4 (or whatever __alignof__ gives).

Back to your use case then:

---------
a memory buffer contains 2 elements of unsigned char. The values are 1 and 2:
  unsigned char buf[8] = {1, 2, 0, 0, 0, 0, 0, 0};
  H5Tconvert(H5T_NATIVE_UCHAR, H5T_NATIVE_INT, 2, buf, NULL, H5P_DEFAULT);

After the conversion to int, the bytes of the two data elements become {0, 0, 0, 1, 0, 0, 0, 2} on a big-endian machine.
---------

With today's HDF5 code, on a system where your current alignment test gives 4 instead of 1, what would this do?

Cheers,

--
____________________________________________________________
Sean McBride, B. Eng sean@rogue-research.com
Rogue Research www.rogue-research.com
Mac Software Developer Montréal, Québec, Canada

Hi Ray,

thanks for this.

On Linux, for "int" type, the value of COMP_ALIGN is 4. The C
keyword __alignof__ returns the alignment of the type in a structure, not
the alignment of the type in memory. Our library's algorithm of memory
alignment finds that the alignment for "int" is 1, meaning no alignment
restriction, on Linux.

I hope I am understanding the problem correctly. Reading
http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Alignment.html#Alignment it
seems to me that __alignof__(type) returns the minimum required alignment
for type as you need it. It's different if you do __alignof__(lvalue)
because in that case the alignment of lvalue might be influenced in other
ways.

__alignof__(int) correctly returns 4 on my machine (x86_64).

In general, I think it is a better approach to try to get all this
information from the compiler. This can be done in many different ways:
macro definitions, extensions, or even pattern matching based
on compiler/platform.
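One possible shape for such a compiler-driven query is sketched below. The macro name `ALIGN_OF` and the function `int_alignment` are invented for illustration; `_Alignof` is standard C11 and `__alignof__` is the GCC/Clang extension discussed above. The last branch is the traditional struct-offset idiom and is only meant for compilers that offer neither.

```c
#include <stddef.h>  /* offsetof, size_t */

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
   /* C11 and later: ask the compiler directly. */
#  define ALIGN_OF(T) _Alignof(T)
#elif defined(__GNUC__)
   /* GCC-compatible compilers in older language modes. */
#  define ALIGN_OF(T) __alignof__(T)
#else
   /* Fallback for older compilers: offset of T after a single char. */
#  define ALIGN_OF(T) offsetof(struct { char c; T x; }, x)
#endif

/* Example query: required alignment of int on this platform. */
size_t int_alignment(void)
{
    return ALIGN_OF(int);
}
```

None of the branches dereferences a misaligned pointer, so the query itself cannot trip an undefined-behaviour checker.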

Best wishes,
Andrea

On 20 November 2013 10:21, Raymond Lu <songyulu@hdfgroup.org> wrote:

--
Andrea Bedini <andrea.bedini@gmail.com>