Preventing file corruption on crash


I discovered HDF5 last week and spent the weekend reading the user
guide. Unfortunately I still have many unanswered questions; I will post
them as separate emails, so they can be answered individually.

My first and most critical question is: with the currently available version
of the HDF5 library, is there a way to prevent file corruption on
application crash? I have seen some threads from 2010 talking about a future
"metadata journaling" feature in version 1.10.x, but we are still at 1.8.x in
2012, so I have to assume that this feature is not available. It isn't
mentioned in the user guide either.

Firstly, am I correct to assume that file corruption, in the sense of a file
becoming unreadable, comes from the metadata only? In other words, if I
could make sure that the metadata was saved "transactionally" (all or
nothing), would I always be able to read the file after a crash, even if it
contained some partially written data chunks that the metadata doesn't know
about?

If this is the case, then it should be possible to prevent unreadable files
by doing the following: 1) split the file into data and metadata, 2) after a
successful flush, clone the metadata file. On crash recovery, check the
current metadata file, and if it is corrupt, go back to the previous
version. Being able to flush data and metadata separately would make this
safer, because you could flush the data first and flush the metadata only
once that had succeeded. Depending on the file system used and the amount of
metadata, cloning might be a fast operation. I could even store the metadata
file on a special file system just for that purpose.
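
To make the idea concrete, here is a rough sketch of the clone-and-roll-back
step in Python. It uses only the standard library and treats the metadata
file as an opaque blob; it does not call HDF5 at all. The function names, the
paths and the `is_valid` check are placeholders I made up for illustration; a
real check would presumably have to try opening the file with the HDF5
library.

```python
import os
import shutil


def fsync_path(path):
    """Ask the OS to push a file's contents to stable storage."""
    fd = os.open(path, os.O_RDWR)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def checkpoint_metadata(meta_path, backup_path):
    """Clone the metadata file; call this only after a successful flush.

    The clone is written under a temporary name and then renamed, so
    backup_path always holds either the old or the new complete clone.
    """
    tmp_path = backup_path + ".tmp"
    shutil.copyfile(meta_path, tmp_path)
    fsync_path(tmp_path)
    os.replace(tmp_path, backup_path)  # atomic rename on POSIX file systems


def recover_metadata(meta_path, backup_path, is_valid):
    """On crash recovery, go back to the last clone if the metadata is corrupt.

    is_valid is a hypothetical caller-supplied check taking a path.
    Returns True if a rollback was performed.
    """
    if is_valid(meta_path):
        return False
    shutil.copyfile(backup_path, meta_path)
    fsync_path(meta_path)
    return True
```

The application would call `checkpoint_metadata` after each flush it
considers successful, and `recover_metadata` once at startup before opening
the file. The sketch assumes the raw data file has already reached the disk
by the time the clone is taken.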

Using the split driver, is there any way to flush each file (data and
metadata) individually, or at least flush data first? Could I do this if I
kept the metadata file in RAM (core)? (I can’t judge yet how much RAM it
will need)

Regards,
Sebastien Diot
