I have a piece of code that writes very large files (>4TB) in parallel. I am worried about the behaviour of the library when the file size grows beyond the per-process limit set by ulimit -f. As far as I can tell, this is not specified in the documentation.
Will HDF5 produce some form of error message during a write? Or will it fail silently and produce a truncated file? Or is that limit somehow bypassed?
Thanks in advance for any clarification of the behaviour.
Good question! I set the file-size limit to a low value with ulimit -f 10000, then ran a profile example that writes a 12 GB HDF5 file, with the HDF5 error stack at its default setting (printing enabled). The run was killed with:

File size limit exceeded (core dumped)

Trying to dump the partially written file leads to:

h5dump error: unable to open file "tick.h5"
It got this far:

-rw-r--r-- 1 steven steven 102400000 Nov 18 09:07 tick.h5
As it appears, the message does not come from the HDF5 C API. Notice that the posted message is from the shell: the process was killed by a signal, and no HDF5 error stack was printed. On my system, one has to be certain there is enough room left on the device before starting I/O operations.
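As a hypothetical pre-flight check (these are standard shell utilities, not part of HDF5), one can inspect both the per-process file-size limit and the free space on the target filesystem before starting a large write:

```shell
# Per-process file-size limit, in 512-byte blocks ("unlimited" if no cap is set)
ulimit -f

# Free space on the filesystem holding the current directory
df -k .
```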
From the application’s perspective there must be enough room. Here is what the setrlimit(2) man page says about it:
RLIMIT_FSIZE
       This is the maximum size in bytes of files that the process
       may create. Attempts to extend a file beyond this limit
       result in delivery of a SIGXFSZ signal. By default, this
       signal terminates a process, but a process can catch this
       signal instead, in which case the relevant system call
       (e.g., write(2), truncate(2)) fails with the error EFBIG.
and the signal man page:
Standard signals:
...
SIGXFSZ P2001 Core File size limit exceeded (4.2BSD);
see setrlimit(2)