Good day everyone,
I'm a bit confused about how file space is handled in an HDF5 file when writing and deleting datasets.
This is the situation: I wrote a small program to test how the file size changes after:
- writing data to datasets
- removing said datasets with H5Ldelete, and
- repacking the file.
Step 1. As expected, the file size increases after writing, so all good so far. For example, the file size is now 4972 KB.
(Between steps 1 and 2 I close and reopen the file to make sure the data has been written to disk.)
Step 2. After calling H5Ldelete on the datasets I would expect the file size NOT to change, since, as far as I understand, the datasets are only made unreachable and their storage is not actually released. Instead, I see that the file size is now 53 KB.
Step 3. After repacking, the new repacked file is 29 KB (some very small datasets are intentionally left untouched).
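In case it helps, here is a stripped-down sketch of what steps 1 and 2 look like in my test (the full code is in the attached codeSample.txt); the file name, dataset name and data size below are just placeholders, not my actual values:

```c
#include "hdf5.h"
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    hsize_t dims[1] = {600000};   /* roughly 4.8 MB of doubles (placeholder size) */
    double *data = malloc(dims[0] * sizeof(double));
    for (hsize_t i = 0; i < dims[0]; i++)
        data[i] = (double)i;

    /* step 1: create the file and write a large dataset */
    hid_t file  = H5Fcreate("test.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(1, dims, NULL);
    hid_t dset  = H5Dcreate2(file, "/big_dataset", H5T_NATIVE_DOUBLE, space,
                             H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);
    H5Dclose(dset);
    H5Sclose(space);

    /* close and reopen the file so everything is flushed to disk */
    H5Fclose(file);
    file = H5Fopen("test.h5", H5F_ACC_RDWR, H5P_DEFAULT);

    /* step 2: unlink the dataset */
    H5Ldelete(file, "/big_dataset", H5P_DEFAULT);

    /* check the size reported by the library */
    hsize_t fsize = 0;
    H5Fget_filesize(file, &fsize);
    printf("file size after H5Ldelete: %llu bytes\n", (unsigned long long)fsize);

    H5Fclose(file);
    free(data);
    return 0;
}
```

For step 3 I then repack from the command line with something like `h5repack test.h5 test_repacked.h5` (file names again placeholders).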
Steps 1 and 3 make sense to me, but I don't understand what is happening in step 2, and I hope someone can enlighten me about it.
My (wild) guess is that in step 2 the dataset's raw data is freed while its header remains part of the file.
This would be consistent with a similar result I got when shrinking a dataset of length X to 0 with H5Dset_extent(): in that case too I observed a decrease in the size of the file containing the dataset.
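For reference, that H5Dset_extent() test looked roughly like this (the dataset has to be chunked for its extent to be changeable; names and sizes are again placeholders, and `file` is an already-open file identifier):

```c
/* create a chunked, resizable dataset */
hsize_t dims[1]    = {1000000};
hsize_t maxdims[1] = {H5S_UNLIMITED};
hsize_t chunk[1]   = {4096};

hid_t space = H5Screate_simple(1, dims, maxdims);
hid_t dcpl  = H5Pcreate(H5P_DATASET_CREATE);
H5Pset_chunk(dcpl, 1, chunk);

hid_t dset = H5Dcreate2(file, "/resizable_dataset", H5T_NATIVE_DOUBLE, space,
                        H5P_DEFAULT, dcpl, H5P_DEFAULT);
/* ... write data to the dataset ... */

/* shrink the dataset to length 0; after this I also saw the file size decrease */
hsize_t new_dims[1] = {0};
H5Dset_extent(dset, new_dims);

H5Dclose(dset);
H5Pclose(dcpl);
H5Sclose(space);
```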
Still, this seems to be in total contrast with what I read in the documentation, which states that no file space is released until a repack (or similar action) is performed.
I am using HDF5 version 1.8.18, and all the sizes above are what I read from Windows File Explorer (which might not be the most reliable source of information).
Edit: the final file sizes are also confirmed by H5Fget_filesize().
Edit 2: adding sample code that performs steps 1 and 2.
codeSample.txt (2.4 KB)