A problem when saving NATIVE_LDOUBLE variables


#21

Thanks! C++ N4296 doesn’t support FP16. However, like the 80-bit extended double, FP16 is a popular choice, since there is hardware support for it in GPGPU cards and some CPUs. Unlike the 80-bit extended double, FP16 is standardized in IEEE 754-2008 (as binary16); see below.

H5CPP maps two user libraries, Christian Rau’s Half Float and Industrial Light & Magic’s OpenEXR half float, to a 16-bit native float type. In both cases h5dump prints the correct values on the AMD64 architecture; see the examples provided in the h5cpp source tree.

The IEEE 754 standard specifies a binary16 as having the following format:

  • Sign bit: 1 bit
  • Exponent width: 5 bits
  • Significand precision: 11 bits (10 explicitly stored)

Layout:

15 (msb)
| 
| 14  10
| |   |
| |   | 9        0 (lsb)
| |   | |        |
X XXXXX XXXXXXXXXX
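As a rough illustration of the layout above, the three fields can be pulled out of a raw 16-bit word with shifts and masks. This is only a sketch: `Half_fields`, `split_binary16` and `check_split` are hypothetical names made up for this post and are not part of H5CPP or HDF5.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: the three raw binary16 fields.
struct Half_fields {
    unsigned sign;      // bit 15
    unsigned exponent;  // bits 14..10, still biased
    unsigned mantissa;  // bits 9..0, stored fraction without the implicit leading 1
};

// Split a raw 16-bit pattern along the layout shown above.
inline Half_fields split_binary16(std::uint16_t bits) {
    Half_fields f;
    f.sign     = (bits >> 15) & 0x1u;
    f.exponent = (bits >> 10) & 0x1Fu;
    f.mantissa = bits & 0x3FFu;
    return f;
}

// Small self-check on the pattern 0 01111 0000000000 (0x3C00).
inline void check_split() {
    const Half_fields one = split_binary16(0x3C00);
    assert(one.sign == 0 && one.exponent == 15 && one.mantissa == 0);
}
```

The shift amounts 15, 10 and 0 and the widths 5 and 10 are simply the bit positions and field sizes from the diagram.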

Examples:

0 00000 0000000000 = 0.0
0 01110 0000000000 = 0.5
0 01111 0000000000 = 1.0
0 10000 0000000000 = 2.0
0 10000 1000000000 = 3.0
1 10101 1111000001 = -124.0625
0 11111 0000000000 = +infinity
1 11111 0000000000 = -infinity
0 11111 1000000000 = NAN
1 11111 1111111111 = NAN

The format is laid out as follows:
[1 bit sign | 5 bit exponent | 10 bit mantissa], with the exponent encoded using an offset-binary representation, the zero offset being 15. In other words, a stored exponent e stands for a scale factor of 2^(e - 15).
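To make the role of the offset concrete, here is a minimal decoder sketch that turns a raw binary16 bit pattern into a float. It is not how H5CPP or the half-float libraries do the conversion; `decode_binary16` is a hypothetical name, and the handling of zero and subnormals (stored exponent 0) follows the general IEEE 754 rules rather than anything stated in this thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Decode a raw binary16 pattern: [1 bit sign | 5 bit exponent | 10 bit mantissa].
inline float decode_binary16(std::uint16_t bits) {
    const int sign     = (bits >> 15) & 0x1;
    const int exponent = (bits >> 10) & 0x1F;  // biased by 15
    const int mantissa = bits & 0x3FF;         // stored 10 bits
    const float s = sign ? -1.0f : 1.0f;

    if (exponent == 0x1F) {                    // all ones: infinity or NaN
        if (mantissa != 0)
            return std::numeric_limits<float>::quiet_NaN();
        return s * std::numeric_limits<float>::infinity();
    }
    if (exponent == 0)                         // zero and subnormals: no implicit 1
        return s * std::ldexp(static_cast<float>(mantissa), -24);

    // Normal numbers: (1024 + mantissa) / 1024 * 2^(exponent - 15).
    return s * std::ldexp(static_cast<float>(mantissa + 1024), exponent - 15 - 10);
}
```

Applied to the examples listed earlier, the pattern 0 01111 0000000000 (0x3C00) decodes to 1.0 and 1 10101 1111000001 (0xD7C1) decodes to -124.0625.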

HDF5 "example.h5" {
GROUP "/" {
   DATASET "type" {
      DATATYPE  16-bit little-endian floating-point
      DATASPACE  SIMPLE { ( 20 ) / ( 20 ) }
      DATA {
      (0): 55.7812, 89.375, 90.125, -29.3906, -90.3125, -48.4688, -1.17871,
      (7): 55.1875, -21.6719, 12.7578, -63.7188, 9.22656, 50.625, 37.0938,
      (14): -81.8125, -32.7812, 89.5, -28.3906, -44.125, -41.125
      }
   }
}
}

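The above can be replicated with the following C API calls (here handle is assumed to be a modifiable floating-point datatype handle, for example a copy of H5T_NATIVE_FLOAT):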

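H5Tset_fields(handle, 15, 10, 5, 0, 10); /* sign at bit 15; exponent at bit 10, 5 bits; mantissa at bit 0, 10 bits */
H5Tset_precision(handle, 16);            /* 16 significant bits */
H5Tset_ebias(handle, 15);                /* exponent bias of 15 */
H5Tset_size(handle, 2);                  /* total size: 2 bytes */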

steve