Odd caching behavior with scale-offset filter

andrew.collette · January 11, 2013, 5:45pm

Hi,

A user recently contributed a patch adding support for the
scale-offset filter in h5py, and we are seeing some odd behavior which
seems to be related to data caching. When the filter is set up for
lossy encoding (e.g. storing 32-bit ints with 1 bit of precision) and
a small dataset is written, subsequent reads return the original,
uncompressed data. Closing and reopening the file, or writing larger
datasets, seems to produce the expected lossily-compressed data.

Is this expected behavior? Is there any way to get HDF5 not to cache
chunks when a lossy filter is used, or, preferably, to only cache
chunks after the transformation has been applied?

Thanks,
Andrew Collette

andrew.collette · January 11, 2013, 8:04pm

Here's a C file demonstrating this behavior for floating-point data.
The output of the program is (on my machine):

Data written:
1.129840 5.123983 2.129993 -2.199330
After immediate read:
1.129840 5.123983 2.129993 -2.199330
After reopening:
1.130670 5.120670 2.130670 -2.199330

Andrew

so_test.c (1.28 KB)
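
For readers without the attachment, here is a minimal sketch of the arithmetic that would explain the "After reopening" values. It does not call HDF5 at all. It only mimics a decimal-scale round trip: subtract the minimum, scale by 10^d, round to an integer, then reverse the steps. The function name dscale_roundtrip is hypothetical, and the decimal scale factor of 2 is an assumption inferred from the printed output, not taken from so_test.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of HDF5: emulates the lossy round trip
 * of a decimal-scale quantizer on an array of doubles.
 *   q[i]   = round((in[i] - min) * 10^decimals)
 *   out[i] = min + q[i] / 10^decimals
 * The real filter also packs the integers into a minimal number of
 * bits; that step is lossless and is omitted here.
 */
void dscale_roundtrip(const double *in, double *out, size_t n, int decimals)
{
    assert(decimals >= 0);
    if (n == 0)
        return;

    /* Find the minimum value, which acts as the offset. */
    double min = in[0];
    for (size_t i = 1; i < n; i++) {
        if (in[i] < min)
            min = in[i];
    }

    /* Compute 10^decimals without needing libm. */
    double scale = 1.0;
    for (int d = 0; d < decimals; d++)
        scale *= 10.0;

    for (size_t i = 0; i < n; i++) {
        /* (in[i] - min) is never negative, so adding 0.5 and
           truncating rounds to the nearest integer. */
        long q = (long)((in[i] - min) * scale + 0.5);
        out[i] = min + (double)q / scale;
    }
}
```

Calling dscale_roundtrip on the four values from "Data written" with decimals set to 2 gives 1.13067, 5.12067, 2.13067 and -2.19933, which matches the "After reopening" line. The minimum comes back unchanged because its offset from itself is zero.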
