SWMR R/W in two processes: good, in same process: bad


#1

I'm exploring the robustness of using SWMR to write, read, and display a realtime-ish stream of data, in C/C# on Windows.

If I have one app writing and another reading, it’s working as expected.

If I have both operations in one app, it works for a while and then fails at a random time with, e.g., System.AccessViolationException: ‘Attempted to read or write protected memory’ when reading.

If there are any ideas, I’d be grateful.


#2

Oh I think it’s fixed - I had the read and write on different threads, mea culpa.

Still, writing down the question is a great way to see the solution. And also I read a lot more documentation along the way, so, good :slight_smile:


#3

SWMR is about multiple processes, btw - not multiple threads. Different state issues.


#4

Got it, thanks. Key thing for me is being able to write and read from the same process.

Currently, I’m opening the H5 file for writing, and opening again for reading so I have two file IDs. It works but is that a valid pattern, or should I write and read using the one file ID?


#5

When you open a file multiple times in HDF5, we try to determine whether the file has already been opened and, if so, simply return a new ID for the already-open file. Under the hood, the file will only have been opened once, so the IDs share a metadata cache and no SWMR is involved. This is true in a single process, whether you are using one thread or many. Multiple processes do not share state and thus can’t use this mechanism. In that case, each process has its own metadata cache and sense of file state, hence SWMR is needed if one of those processes is a writer.


#6

The underlying file structure that we maintain in the library is reference counted, by the way. You can create and close IDs for it in any order and not make a mess.
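To make the reference-counting idea concrete, here is a toy model in plain C. It is only a sketch of the general technique, not HDF5's actual internals: `toy_open`, `toy_close`, `toy_refcount`, and the fixed-size tables are hypothetical names invented for illustration. Opening the same name twice hands back two distinct IDs that point at one shared structure, and the structure is released only when the last ID is closed, in whatever order that happens.

```c
#include <string.h>

/* Toy model only: hypothetical names, not HDF5 internals. */
#define MAX_FILES 8
#define MAX_IDS   32

typedef struct {
    char name[64];
    int  refcount;  /* number of live IDs that point at this file */
    int  in_use;    /* 1 while at least one ID is open */
} shared_file_t;

static shared_file_t  files[MAX_FILES];
static shared_file_t *ids[MAX_IDS];  /* ID -> shared file, NULL if the slot is free */

/* Return a new ID for `name`, reusing the shared struct if it is already open.
 * Returns -1 on failure (name too long, or tables full). */
int toy_open(const char *name)
{
    shared_file_t *sf = NULL;
    int i;

    if (strlen(name) >= sizeof files[0].name)
        return -1;

    /* Is this file already open in this "process"? */
    for (i = 0; i < MAX_FILES; i++)
        if (files[i].in_use && strcmp(files[i].name, name) == 0) { sf = &files[i]; break; }

    /* Not open yet: claim a fresh shared struct. */
    if (sf == NULL) {
        for (i = 0; i < MAX_FILES; i++)
            if (!files[i].in_use) { sf = &files[i]; break; }
        if (sf == NULL)
            return -1;
        strcpy(sf->name, name);
        sf->refcount = 0;
        sf->in_use   = 1;
    }

    /* Hand out a new ID that refers to the shared struct. */
    for (i = 0; i < MAX_IDS; i++)
        if (ids[i] == NULL) { ids[i] = sf; sf->refcount++; return i; }

    if (sf->refcount == 0)
        sf->in_use = 0;  /* no ID could be issued; release the fresh struct */
    return -1;
}

/* Close one ID; the shared struct is released when the last ID goes away. */
int toy_close(int id)
{
    shared_file_t *sf;

    if (id < 0 || id >= MAX_IDS || ids[id] == NULL)
        return -1;
    sf = ids[id];
    ids[id] = NULL;
    if (--sf->refcount == 0)
        sf->in_use = 0;
    return 0;
}

/* Number of open IDs for `name`, or 0 if the file is not open. */
int toy_refcount(const char *name)
{
    int i;
    for (i = 0; i < MAX_FILES; i++)
        if (files[i].in_use && strcmp(files[i].name, name) == 0)
            return files[i].refcount;
    return 0;
}
```

In this model, calling `toy_open("data.h5")` twice yields two different IDs with a shared count of 2, and closing them in either order brings the count back to 0. With the real library the corresponding calls would be `H5Fopen` and `H5Fclose`.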


#7

Ah, that’s interesting, thank you. Re the reference counting: I thought I was going to have to keep track of which files were open for writing and be sure not to close them after reading, so this is good functionality.

Does it mean that I can happily read and write a file from the one process and not have to open it in SWMR at all? If so, that would be great for what we need to begin with. I’ve just tried it and it appears to work.

Even so, I’m glad to have delved into SWMR; I bet we’ll have a good use for it.


#8

Yes, from one process you do not need SWMR, even if you open the file multiple times and access it through multiple file IDs.


#9

Thank you for the help here. I’m very keen to go with the grain, to avoid gotchas that only emerge later. So these extra insights are super helpful.

So can I just ask one more thing, is this a legit design as far as H5 is concerned? A single app/process that…

  • Creates an H5 file in non-SWMR mode
  • Reads and writes from the same thread (with no need for interlocks between the two operations)
  • Writes new data to datasets
  • Creates groups and datasets on the fly

Currently this is working fine; I had it on a soak test all night. The resulting file opened fine in HDFView and some Python scripts, with all the groups and datasets there as expected.

But if it’s only working by accident, so to speak, it would be helpful to understand that.


#10

That sounds like “HDF5 in action,” helping to solve another problem and doing good things.

G.


#11

That’s a ‘yes’ I think :slight_smile: Fantastic - onwards and upwards.

I’m feeling very positive about HDF5 at the moment. Thank you to the team.


#12

Have you tested SWMR with parallel option? :blush:
Don’t blame me for hurting your feelings! :smile:


#13

Parallel - no I’ve not yet. I need to learn to walk before I can run :flushed:


#14

SWMR is not supported with parallel HDF5. It will work fine with the parallel-enabled library, but we don’t test it with the MPI-I/O VFD and parallel HDF5, so it’s officially unsupported.


#15

Would you please make configure or cmake fail automatically when a user enables them both?


#16

SWMR isn’t a configure option.


#17

Right! My mistake! I was confused with --enable-*-vfd options. I thought one of them was --enable-swmr-vfd.

BTW, why not make it a configuration option? Disable SWMR by default and enable it only when a user asks.


#18

I’m not a fan of configure options, as they make the library configuration and testing space exponentially more complicated. Using them as binary “build this or don’t” flags is fine (e.g., Fortran, Java wrappers), but when you change library behavior with configuration options, you get 2^n different libraries to test.
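As a quick illustration of the 2^n point, the arithmetic can be sketched in C. The function name `build_variants` is hypothetical, and the sketch assumes every option is an independent on/off switch that changes library behavior.

```c
#include <stdint.h>

/* Number of distinct library builds when n independent on/off
 * configure options each change library behavior: 2^n.
 * Returns 0 if the result would not fit in 64 bits. */
uint64_t build_variants(unsigned n_flags)
{
    return (n_flags < 64) ? (UINT64_C(1) << n_flags) : 0;
}
```

Under that assumption, three behavior-changing options already mean 8 library variants to test, and ten mean 1024.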