Crashproofing HDF5

Interested in crashproofing HDF5?

As part of our ongoing efforts to improve HDF5, 2024 marks the restart of earlier efforts to crashproof HDF5. When HDF5 crashes or otherwise fails, the result can be data loss or a corrupted file. The HDF Group is beginning work on solutions to these problems. We are currently investigating metadata journaling, write-ahead logging (WAL), non-VFD SWMR, VFD SWMR, checkpointing, and other potential approaches.
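
For readers less familiar with these terms, the core idea behind write-ahead logging can be shown with a toy sketch. This is only an illustration in Python, not the design under investigation for HDF5: the `TinyWalStore` class, its JSON record format, and the `store.wal` file name are all made up for the example.

```python
import json
import os


class TinyWalStore:
    """Toy key-value store that logs every update before applying it."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "store.wal")
        self.data = {}
        self._replay()
        self.log = open(self.log_path, "ab")

    def _replay(self):
        # Rebuild the in-memory state from complete log records only.
        if not os.path.exists(self.log_path):
            return
        good_bytes = 0
        with open(self.log_path, "rb") as f:
            for line in f:
                if not line.endswith(b"\n"):
                    break  # torn final record left by a crash
                try:
                    record = json.loads(line)
                except ValueError:
                    break  # unreadable record: stop replaying here
                self.data[record["key"]] = record["value"]
                good_bytes += len(line)
        # Cut off any torn tail so new records start on a clean boundary.
        with open(self.log_path, "r+b") as f:
            f.truncate(good_bytes)

    def put(self, key, value):
        # 1. Append the intended change to the log and force it to disk.
        record = json.dumps({"key": key, "value": value}).encode("utf-8")
        self.log.write(record + b"\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # 2. Only then apply the change to the live state.
        self.data[key] = value

    def close(self):
        self.log.close()
```

If the process dies between steps 1 and 2, the next `TinyWalStore(directory)` replays the log and recovers the update. If it dies in the middle of step 1, the incomplete record is discarded and the store returns to its last consistent state.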

We would love to hear from you! What might this work mean to you? Would your organization be interested in helping fund this work or collaborating with us on a proposal regarding this work?

If you have thoughts, ideas, concerns, or are simply interested in this work, please let us know below!

Exploring solutions such as metadata journaling, write-ahead logging (WAL), the various forms of SWMR, and checkpointing marks a significant evolution for HDF5, from a plain file format towards more database-like robustness.

I have been on a similar path, integrating the Parallax KV store with HDF5 as a VOL plugin (GitHub: gesalous/parallax_hdf5_vol_connector, an HDF5 VOL plugin for the Parallax key-value store). This approach provides crash recovery by leveraging the resilience features inherent in the KV store. However, it introduces challenges regarding backward compatibility.
I am very interested in where HDF5 is heading and in its potential to transform data storage practices. We could discuss potential collaboration plans if you are interested.

Sounds very good and is highly anticipated. We have an interactive application that lets users edit data mapped to HDF5, and in the unlikely but not impossible case that the application crashes, those edits should not be lost. The status quo is that catching the exception signal and closing the library from the handler prevents the worst cases, and the files usually remain fine. Sometimes the h5clear utility is required, which unfortunately is not available as a library call, so it cannot be built into the application itself. A more “official” solution would be very useful.
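
To make the workaround concrete, here is a rough sketch of the "catch the signal, then close" pattern. It is written in Python using only the standard library, and it is not our actual code: `EditSession` and `install_close_on_signal` are hypothetical names, and `EditSession.close()` only marks the place where a real application would flush and close its HDF5 file (for example with `H5Fclose`/`H5close` in C).

```python
import signal
import sys


class EditSession:
    """Hypothetical stand-in for an application's open HDF5 file."""

    def __init__(self):
        self.closed = False

    def close(self):
        # A real application would flush and close its HDF5 file here.
        self.closed = True


def install_close_on_signal(session,
                            signals=(signal.SIGINT, signal.SIGTERM),
                            exit_after=True):
    """Close the session when one of the given signals is delivered."""

    def handler(signum, frame):
        session.close()
        if exit_after:
            # Conventional "terminated by signal" exit status.
            sys.exit(128 + signum)

    for sig in signals:
        signal.signal(sig, handler)


session = EditSession()
install_close_on_signal(session)
```

This only covers signals that can be handled in an orderly way. A hard crash inside native code, such as a segmentation fault, gives no such guarantee, which is presumably why the workaround only works most of the time.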

Thank you for your response. I have sent you an email to further discuss collaboration plans.