If a disk becomes full while writing datasets, you get an OSError with errno 28 (ENOSPC), which is expected. However, h5py.File.close() then also throws an OSError. If you terminate the application, you get further OSErrors with the same errno, presumably from still-open datasets.
If you remove the file at the OS level with 'rm', it is no longer listed, but the disk is still full (as shown by 'df'). The disk space is only freed once the application is terminated.
What is the proper way to close an HDF5 file if the disk is full?
I have put together a small example script to demonstrate the problem: diskfull.py (1.6 KB)
I used a fixed path inside the script pointing to the /tmp directory, which is rather small. When running it on my PC, I get this output:
xspadmin@xspdemo03:~/tmp$ ./diskfull.py
error at index 559: No space left on device
close failed: No space left on device
Segmentation fault
The segfault probably happens during destruction of the Writer object and the h5py.File object it holds.
The other problem is that as long as the script is running (i.e. if you add some code that waits for user input before deleting 'w'), the HDF5 file is not released: you still see it in 'lsof', and the disk stays full even though you already called File.close().
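For illustration, a minimal reproduction of the failure mode looks roughly like this (this is not the attached diskfull.py, just a sketch; the path, filename, and chunk size are made up):

import h5py
import numpy as np

# Append 1 MiB rows to a resizable dataset until the filesystem fills up.
chunk = np.zeros(256 * 1024, dtype=np.float32)  # 1 MiB per row

f = h5py.File('/tmp/diskfull_demo.h5', 'w')
dset = f.create_dataset('data', shape=(0, chunk.size),
                        maxshape=(None, chunk.size), dtype=chunk.dtype)
try:
    i = 0
    while True:
        dset.resize(i + 1, axis=0)
        dset[i] = chunk
        f.flush()  # push data to the OS so ENOSPC surfaces promptly
        i += 1
except OSError as e:
    print(f'error at index {i}: {e}')

try:
    f.close()  # on a full disk this raises OSError again
except OSError as e:
    print(f'close failed: {e}')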
Surround the critical parts of your code, especially file operations, with try-except blocks to catch and handle exceptions. When you encounter a disk-full condition, log the exception and make sure the file still gets closed. For example:
import h5py

file = None
try:
    # Your HDF5 file operations here
    file = h5py.File('your_file.hdf5', 'w')
    # Your dataset writing operations
    # ...
except OSError as e:
    if e.errno == 28:  # ENOSPC: disk full
        print("Disk full error. Closing the file.")
    else:
        # Handle other OSError cases
        print(f"Error: {e}")
finally:
    # Close the file if it was opened; on a full disk close()
    # can raise OSError itself, so guard it as well
    if file is not None:
        try:
            file.close()
        except OSError:
            pass
There is currently no "proper" way to close an HDF5 file when no storage is left on the device(s) you write to. Leaving aside the question of what "proper" could mean, the problem is that the state of an HDF5 file that was opened for writing consists of modified or new bytes in memory and bytes in storage. In a "disk full" situation, you ask the library to decide what part of the state to scrap to "save your skin." In the current implementation, the library has no concept of transactions or logic to make such decisions. Could this be implemented? Of course. Who wants to contribute and support the development?
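Since the library cannot recover for you, the pragmatic option is to avoid reaching ENOSPC in the first place, for example by checking free space before each write. A minimal sketch in plain Python (this is not an h5py feature; the has_room name and the 64 MiB margin are made up):

import shutil

SAFETY_MARGIN = 64 * 1024 * 1024  # arbitrary reserve; tune to your write size

def has_room(path, needed_bytes):
    """True if the filesystem holding 'path' can absorb the next write."""
    return shutil.disk_usage(path).free >= needed_bytes + SAFETY_MARGIN

# Before each write:
#     if not has_room('/tmp', chunk.nbytes):
#         file.close()  # close while close() can still succeed
#         raise RuntimeError('stopping before the disk fills up')

The check is inherently racy (another process can fill the disk between the check and the write), but it lets you stop and close the file while close() can still succeed.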
Unfortunately, file.close() also throws an OSError, so a single try-except does not cover all situations. In particular, if you have many datasets open, close() throws again for each one while trying to close them.
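A defensive pattern for that cascade is to treat close() as best-effort and swallow the follow-on errors, e.g. (a sketch; close_quietly is a made-up helper name):

import contextlib

def close_quietly(h5file):
    """Best-effort close: on a full disk, h5py.File.close() can raise
    OSError, possibly repeatedly while it closes open datasets."""
    with contextlib.suppress(OSError):
        h5file.close()

Note that, as observed above, even a close() that succeeds does not guarantee the space is freed while the process is still running.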