Hi Thomas!
03.07.2018 18:49, Thomas Kluyver wrote:
I’ve come across performance problems in a case where chunks are much
bigger than the default chunk cache size. The default of 1 MB cache per
dataset seems extremely small now that even laptops have multiple GB of
RAM, and HPC cluster nodes can have hundreds of GB.
I found a thread about this from a couple of years ago:
https://forum.hdfgroup.org/t/chuck-cache-size-proposal/3684. It looks
like having the library try to guess a good cache size is not an option,
and I’m OK with that: I’d rather have simple, predictable behaviour even
if it is wrong in some cases.
Please see also:
However, I’d like to be able to experiment with different cache sizes
without having to recompile the software in question. So I’d propose
adding an environment variable to be used like this:
|HDF5_CHUNK_CACHE_SIZE=128M|
This would override the default, but it could be overridden if the
application called |H5Pset_cache| or |H5Pset_chunk_cache|. Suffixes K, M
or G would multiply the number by the relevant power of 1024 to get a
size in bytes.
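To make the suffix rule concrete, here is a rough sketch of how such a value might be parsed. It is only an illustration of the proposal: HDF5 does not read |HDF5_CHUNK_CACHE_SIZE| today, and the helper names |parse_size| and |chunk_cache_size| are made up for this example.

```python
import os
import re

# Variable name taken from the proposal above; HDF5 itself does not read it.
ENV_VAR = "HDF5_CHUNK_CACHE_SIZE"

# The 1 MB per-dataset default mentioned in the proposal.
DEFAULT_BYTES = 1024 ** 2

# Suffixes multiply the number by a power of 1024; no suffix means bytes.
_MULTIPLIERS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a string such as '128M' into a number of bytes."""
    match = re.fullmatch(r"([0-9]+)([KMG]?)", text.strip())
    if match is None:
        raise ValueError(f"invalid size: {text!r}")
    number, suffix = match.groups()
    return int(number) * _MULTIPLIERS[suffix]


def chunk_cache_size(environ=os.environ):
    """Return the size from the environment, or the default if it is unset."""
    value = environ.get(ENV_VAR)
    return DEFAULT_BYTES if value is None else parse_size(value)
```

With this sketch, |parse_size("128M")| gives 134217728 bytes. An application that wanted to honour the variable could pass that number as the byte-size argument of |H5Pset_chunk_cache|; an explicit call with a different value would still take precedence, as proposed.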
That’s one neat idea! However, I’m pretty sure it will break some
existing applications relying on the defaults…
Best wishes,
Andrey Paramonov