I’m exploring how robust SWMR is for writing, reading, and displaying a near-real-time stream of data, using C/C# on Windows.
If I have one app writing and another reading, it works as expected.
If I have both operations in one app, it works for a while and then fails at a random time with, e.g., System.AccessViolationException: ‘Attempted to read or write protected memory’ when reading.
Got it, thanks. Key thing for me is being able to write and read from the same process.
Currently, I’m opening the H5 file for writing, and opening again for reading so I have two file IDs. It works but is that a valid pattern, or should I write and read using the one file ID?
When you open a file multiple times in HDF5, we try to determine whether the file has already been opened and, if so, simply return a new ID for the already-opened file. Under the hood, the file will only have been opened once, so the IDs share a metadata cache and SWMR is not involved. This is true within a single process, whether you are using one thread or many. Multiple processes do not share state and thus can’t use this mechanism. In that case, each process has its own metadata cache and its own sense of file state, hence SWMR is needed if one of those processes is a writer.
The underlying file structure that we maintain in the library is reference counted, by the way. You can create and close IDs for it in any order and not make a mess.
Ah, that’s interesting, thank you. Regarding the reference counting: I thought I was going to have to keep track of which files were open for writing and be sure not to close them after reading, so this is good functionality.
Does it mean that I can happily read and write a file from the one process and not have to open it in SWMR mode at all? If so, that would be great for what we need to begin with. I’ve just tried it and it appears to work.
Even so, I’m glad to have delved into SWMR; I bet we’ll have a good use for it.
Thank you for the help here. I’m very keen to go with the grain, to avoid gotchas that only emerge later. So these extra insights are super helpful.
Can I just ask one more thing: is this a legitimate design as far as HDF5 is concerned? A single app/process that…
Creates an H5 file in non-SWMR mode
Reads and writes from the same thread (with no need for interlocks between the two operations)
Writes new data to datasets
Creates groups and datasets on the fly
Currently this is working fine; I had it on a soak test all night. The resulting file opened fine in HDFView and in some Python scripts, with all the groups and datasets there as expected.
But if it’s only working by accident, so to speak, it would be helpful to understand that.
SWMR is not supported with parallel HDF5. It will work fine with the parallel-enabled library, but we don’t test it with the MPI-I/O VFD and parallel HDF5 so it’s officially unsupported.
I’m not a fan of configure options, as they make the library configuration and testing space exponentially more complicated. Using them as binary “build this or don’t” flags is fine (e.g., Fortran, Java wrappers), but when you change library behavior with configuration options, you get 2^n different libraries to test.