Somewhat related to my last question: the examples I gave before were run
on an Intel Xeon, a little-endian machine. Consider once again the Example
11 code at http://www.hdfgroup.org/HDF5/doc/UG/10_Datasets.html, which I
have trimmed of extraneous material. I am leaving the comment block in
because it gives the memory layout, which determines the parameters for the
function calls:
/* Define single-precision floating-point type for dataset
···
*-------------------------------------------------------------------
* size=4 byte, precision=20 bits, offset=7 bits,
* mantissa size=13 bits, mantissa position=7,
* exponent size=6 bits, exponent position=20,
* exponent bias=31.
* It can be illustrated in little-endian order as:
* (S - sign bit, E - exponent bit, M - mantissa bit,
* ? - padding bit)
*
*     3        2        1        0
* ?????SEE EEEEMMMM MMMMMMMM M???????
*
* To create a new floating-point type, the following
* properties must be set in the order of
* set fields -> set offset -> set precision -> set size.
* All these properties must be set before the type can function.
* Other properties can be set anytime. Derived type size cannot
* be expanded bigger than original size but can be decreased.
* There should be no holes among the significant bits. Exponent
* bias usually is set 2^(n-1)-1, where n is the exponent size.
*-------------------------------------------------------------------*/
/* I removed variable declarations */
msize = 13;
spos = 26;
epos = 20;
esize = 6;
mpos = 7;
precision = 20;
offset = 7;
datatype = H5Tcopy(H5T_IEEE_F32BE);
H5Tset_fields(datatype, spos, epos, esize, mpos, msize);
H5Tset_offset(datatype, offset);
H5Tset_precision(datatype, precision);
H5Tset_size(datatype, 4);
H5Tset_ebias(datatype, 31);
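The constraints stated in the comment block (no holes among the significant
bits, bias usually 2^(n-1)-1) can be checked with plain arithmetic before
any HDF5 call is made. The following is a minimal sketch, not part of the
HDF5 API: the float_layout struct and both function names are hypothetical,
and it assumes bit positions count up from bit 0 as the least significant
bit, with the fields ordered mantissa, exponent, sign as in Example 11.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper type (not an HDF5 type). It mirrors the values
 * passed to H5Tset_fields, H5Tset_offset, H5Tset_precision, H5Tset_size
 * and H5Tset_ebias. Assumption: bit 0 is the least significant bit. */
typedef struct {
    size_t size;       /* total size in bytes                          */
    size_t offset;     /* padding bits below the lowest significant bit */
    size_t precision;  /* number of significant bits                   */
    size_t spos;       /* sign bit position                            */
    size_t epos;       /* lowest exponent bit position                 */
    size_t esize;      /* exponent width in bits                       */
    size_t mpos;       /* lowest mantissa bit position                 */
    size_t msize;      /* mantissa width in bits                       */
    size_t ebias;      /* exponent bias                                */
} float_layout;

/* Returns true when mantissa, exponent and sign sit back to back with no
 * holes, starting at `offset`, and the whole thing fits in `size` bytes.
 * Assumes the mantissa/exponent/sign ordering used in Example 11. */
bool layout_is_consistent(const float_layout *f)
{
    if (f->esize == 0 || f->msize == 0) return false;
    if (f->mpos != f->offset) return false;              /* mantissa lowest */
    if (f->epos != f->mpos + f->msize) return false;     /* then exponent   */
    if (f->spos != f->epos + f->esize) return false;     /* then sign       */
    if (f->precision != f->msize + f->esize + 1) return false;
    if (f->offset + f->precision > 8 * f->size) return false;
    return true;
}

/* The usual bias from the comment block: 2^(n-1) - 1 for an n-bit exponent. */
size_t conventional_ebias(size_t esize)
{
    return ((size_t)1 << (esize - 1)) - 1;
}
```

With the values above (size 4, offset 7, precision 20, spos 26, epos 20,
esize 6, mpos 7, msize 13) the check passes, and conventional_ebias(6)
gives 31, matching the H5Tset_ebias call.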
On a little-endian machine, I get the expected behavior. If I want to
reduce precision further (and hence the compressed file size), I can do this:
msize -= 4;
spos -= 4;
epos -= 4;
precision -= 4;
Each time I decrement these, I get less precision and, after gzip
compression, a smaller file. Am I doing this right? I haven't changed
offset, and it occurs to me that I probably should.
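To get a feel for what dropping mantissa bits does to the stored values,
the effect can be emulated on an ordinary IEEE single-precision float. This
is a rough sketch only: truncate_mantissa is a hypothetical name, it
truncates rather than rounds (the library's own conversion may behave
differently), it assumes float is a 32-bit IEEE type, and it ignores the
reduced exponent range of the 6-bit, bias-31 exponent.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Keep only the top `msize` of the 23 IEEE single-precision mantissa
 * bits by zeroing the lower ones. Truncation is an assumption made for
 * illustration; it is not claimed to match HDF5's conversion exactly. */
float truncate_mantissa(float x, unsigned msize)
{
    uint32_t bits;
    assert(msize <= 23);
    memcpy(&bits, &x, sizeof bits);        /* reinterpret without aliasing UB */
    uint32_t drop = 23u - msize;           /* number of low bits to discard   */
    uint32_t mask = ~((UINT32_C(1) << drop) - 1u);
    bits &= mask;                          /* sign and exponent are untouched */
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For example, 1 + 2^-10 (1.0009765625f) survives unchanged with msize = 13
but collapses to 1.0f with msize = 9, since the relative spacing of
representable values grows from 2^-13 to 2^-9.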
Questions:
1. Why H5Tcopy(H5T_IEEE_F32BE) and not H5Tcopy(H5T_IEEE_F32LE)? After all,
this is a little-endian machine, and the example is for a little-endian
memory layout.
2. When I apply the above code on a big-endian machine (IBM Power5) I get
garbled data. Do I have to adjust spos, epos, and offset for a big-endian
machine?
3. Why H5Tset_size(datatype, 4) and not H5Tset_size(datatype, 2)? After
all, haven't we reduced the precision to 16 bits, i.e., 2 bytes?
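The arithmetic behind question 3 can be sketched as follows. The function
name min_bytes is hypothetical, and the sketch only counts bits; it does
not establish which sizes H5Tset_size will accept for a type derived from a
4-byte float.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest whole number of bytes that can hold `precision` significant
 * bits when the lowest of them sits `offset` bits above bit 0. */
size_t min_bytes(size_t offset, size_t precision)
{
    return (offset + precision + 7) / 8;   /* ceiling of bits / 8 */
}
```

With offset = 7, a precision of 20 occupies bits 7 through 26 and needs 4
bytes; a precision of 16 occupies bits 7 through 22 and still needs 3 bytes.
Only with offset = 0 would 16 significant bits fit in exactly 2 bytes.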
My ultimate goal here is to get the proper behavior on a big-endian machine
since that's what I'm running my model on. I want to have fine-grained
control over the lossiness of the final compressed data. Perhaps if someone
could redo Example 11 for a big-endian machine, things would become clearer
to me. I'm also still puzzled about why a pure n-bit filter doesn't reduce
file size (see my previous email).
Leigh
--
Leigh Orf
Associate Professor of Atmospheric Science
Department of Geology and Meteorology
Central Michigan University
Currently on sabbatical at the National Center for Atmospheric Research
in Boulder, CO
NCAR office phone: (303) 497-8200