[BUG] Compression doesn't compress data on SWMR


#1

Hello!

To reproduce:

  1. Build and run the attached
    test980.c (1.4 KB).
    Observe that the size of test.h5 is 5,705 bytes (correct).

  2. Uncomment

H5Fstart_swmr_write(file);

rebuild and run.
Now the file size explodes to 1,913,763 bytes!

Originally reported in https://github.com/h5py/h5py/issues/980
(see the more complete example there for the motivation behind calling H5Fflush repeatedly).

Best wishes,
Andrey Paramonov


SWMR - extend dataset by 1000 records and write records one by one
#2

Hi Andrey,

It is not a bug but a feature: file space recycling is disabled in SWMR mode.

Every time the program flushes a chunk after adding a new element, the chunk is compressed and written to the file. When another element is written to the chunk and the chunk is compressed and flushed again, it is written to new space in the file, while the old copy's space is not reclaimed.

If you move the H5Fflush call outside the "for loop", the file size is as expected because the chunk stays in the cache and is flushed only once it is fully written.

This SWMR limitation is documented in the SWMR User's Guide https://portal.hdfgroup.org/display/HDF5/HDF5+Single-Writer+Multiple-Reader+User’s+Guide . See section 3, "SWMR Scope and Limitations", 5th bullet. I guess we need to explain this better, and your example is very helpful.

Thank you!

Elena