SWMR problems - intermittent failures


#1

EDIT: I may have fixed it; I had missed a detail. (I don’t have permission to delete the post.)

But can I check that H5Fstart_swmr_write and “close and reopen with SWMR settings” both do the same thing?

Also, is SWMR robust even when reading and writing frequently? Robust like an RDBMS? Or are there limits or best practices?

OP: I’m writing a C/C# app to experiment with writing and reading an H5 file continuously. I believe I’ve followed the pattern in the documentation, and it basically works, but…


#2

SWMR is not a client/server arrangement. There is no inter-process communication. SWMR, via the HDF5 library, runs in the context of the writer and reader processes and will be as robust or fragile as those processes are.

G.


#3

For completeness, part of this particular problem was that I needed to know about the HDF5_USE_FILE_LOCKING="FALSE" environment variable. I had two readers working on one file. Without this set, they seemed to work, but with intermittent faults. I didn’t see this in the HDF5 Single-writer/Multiple-reader User’s Guide but came across it elsewhere. I think it’s rather relevant to using SWMR.
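
In case it helps anyone else, here is a minimal sketch of setting that variable from inside the process rather than in the shell. It assumes a POSIX system (setenv is not available on Windows), and it assumes the call happens before the first HDF5 call, on the guess that the library reads the variable when it initializes or opens a file. The two helper names are made up for illustration and are not part of HDF5.

```c
#define _POSIX_C_SOURCE 200112L  /* exposes setenv() under strict -std=c99/c11 */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: set HDF5_USE_FILE_LOCKING=FALSE for this process.
 * Returns 0 on success, -1 on failure (the return value of setenv).
 * Assumption: call this before any HDF5 function is used. */
int disable_hdf5_file_locking(void)
{
    return setenv("HDF5_USE_FILE_LOCKING", "FALSE", 1 /* overwrite */);
}

/* Hypothetical helper: returns 1 if the variable is currently "FALSE", else 0. */
int hdf5_file_locking_disabled(void)
{
    const char *value = getenv("HDF5_USE_FILE_LOCKING");
    return value != NULL && strcmp(value, "FALSE") == 0;
}
```

I would call disable_hdf5_file_locking() at the very top of main() in each reader, before H5Fopen. Exporting the variable in the shell that launches the readers should have the same effect.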

BTW my question about robustness is more about real-world usage. Yes, I understand it’s not a client/server system, but to be fair the documentation does say, “A beneficial side effect of using SWMR access is better fault tolerance. It is more difficult to corrupt a file when using SWMR.” So it’s a valid question and not a criticism. As a new user I’d be interested to understand whether people use the H5 format as a primary data-collection store, and find it generally reliable.

I’m weighing up options, to see whether we need just H5 or (e.g.) a SQL DB for data-collection solidity, and H5 for archiving & data exchange - for which it looks brilliant.


#4

Many people collect data into HDF5. The HDF5 library is reliable, but if a writer process crashes and doesn’t fully flush the metadata cache(s), you can wind up with corrupt files or lose data. SWMR is based on flush dependencies, which help ensure that incomplete metadata doesn’t get written out to the disk, making the file less likely to be invalid even with a crash.