Multithreaded writing to a single file in C++


#1

Hi folks,

I am trying to write a C++ application that creates an HDF5 file with multiple datasets. I have multiple threads that are each fed by a queue (every thread gets a different type of data).

The idea is to create a dataset per thread and initialize it to size 0, and every time a new item appears in the respective queue, I extend the dataset and write the data to it.

I tried this first with only a single thread, and the code worked just fine, i.e. the single dataset appeared in the expected location. But as soon as the number of threads exceeds 1, I get file access errors:

HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) thread 140156561360128:
#000: ../../../src/H5F.c line 491 in H5Fcreate(): unable to create file
major: File accessibilty
minor: Unable to open file
#001: ../../../src/H5Fint.c line 1168 in H5F_open(): unable to lock the file or initialize file structure
major: File accessibilty
minor: Unable to open file
#002: ../../../src/H5FD.c line 1821 in H5FD_lock(): driver lock request failed
major: Virtual File Layer
minor: Can't update object
#003: ../../../src/H5FDsec2.c line 939 in H5FD_sec2_lock(): unable to flock file, errno = 11, error message = 'Resource temporarily unavailable'
major: File accessibilty
minor: Bad file ID accessed
terminate called after throwing an instance of 'H5::FileIException'
Aborted

If I read the docs correctly, the library has to be built in this case with the --enable-threadsafe option. After building it from source with the option, the app should not have any problems with multithreaded applications because it just executes file accesses sequentially.

To make it more clear for you, my app is structured like this:

main thread:
    m_file = new H5::H5File(FILENAME, H5F_ACC_TRUNC);
    H5::Group g_vehicles(m_file->createGroup(...));     //setup of file group structure
    ...

--> worker thread 1:
    H5::DataSet dset1(location, dtype, ...);
    while(true) {
        dset1.extend(dims);
        ...
        dset1.write(...);
    }

--> worker thread 2:
    H5::DataSet dset2(location, dtype, ...);
    while(true) {
        dset2.extend(dims);
        ...
        dset2.write(...);
    }

Can anybody give me a pointer or the link to some examples on how to do this correctly?

Thanks in advance,

Sebastian


#2

Can you tell us something about your OS and file system? --enable-threadsafe is a move in the right direction for the C-API. It looks like a file locking issue. Which C++ bindings are you using? Are they thread-safe?

G.


#3

Sure, here’s the output of lsb_release -a:

Distributor ID: Ubuntu
Description: Ubuntu 18.04.2 LTS
Release: 18.04
Codename: bionic

I am using an ext4 type file system.

I am not sure if I can answer your question regarding the C++ bindings. I built the lib from source using the latest package provided at https://www.hdfgroup.org/packages/hdf5-1120-cmake/. The build script uses a configuration file to set/unset options. In addition to the package’s default options, I uncommented the following lines:

### enable thread-safety builds

set (ADD_BUILD_OPTIONS "${ADD_BUILD_OPTIONS} -DHDF5_ENABLE_THREADSAFE:BOOL=ON")
set (ADD_BUILD_OPTIONS "${ADD_BUILD_OPTIONS} -DHDF5_ENABLE_PARALLEL:BOOL=OFF")
set (ADD_BUILD_OPTIONS "${ADD_BUILD_OPTIONS} -DHDF5_BUILD_CPP_LIB:BOOL=OFF")
set (ADD_BUILD_OPTIONS "${ADD_BUILD_OPTIONS} -DHDF5_BUILD_FORTRAN:BOOL=OFF")
set (ADD_BUILD_OPTIONS "${ADD_BUILD_OPTIONS} -DHDF5_BUILD_HL_LIB:BOOL=OFF")

If I understand this correctly, then building a C++ lib that is thread-safe is not supposed to work. I do not know if this answers your question, but that is all I configured regarding the C++ lib.


#4

Based on what I see here, I’m going to guess that this is probably a file locking issue and not a thread-safety issue.

If you are on a system that doesn’t have file locking enabled (e.g., many Lustre installations), you will probably have file locking problems with HDF5 1.10.0 (which is also really, really old). The first step would be to get the source for HDF5 1.10.7 and try that to see if your problems go away. In 1.10.7 you can even build with file locking entirely disabled (via HDF5_USE_FILE_LOCKING in CMake).

File locking is mainly needed to help get multi-process single-writer/multiple-readers (SWMR) access set up and isn’t integral to HDF5 library functionality.
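
For readers who cannot rebuild the library right away, here is a hedged sketch of a run-time alternative. It assumes the linked HDF5 release consults the HDF5_USE_FILE_LOCKING environment variable (as far as I know, releases newer than 1.10.0 do; 1.10.0-patch1 from the error log would not). The helper names are hypothetical, and the variable has to be set before the first file is created or opened.

```cpp
#include <cassert>
#include <stdlib.h>  // setenv (POSIX), getenv
#include <string>

// Hypothetical helper: ask HDF5 to skip file locking for this process.
// Assumption: the linked HDF5 honors the HDF5_USE_FILE_LOCKING environment
// variable. Call this before any HDF5 file is created or opened.
bool disable_hdf5_file_locking() {
    // setenv returns 0 on success; the final 1 overwrites an existing value.
    return setenv("HDF5_USE_FILE_LOCKING", "FALSE", 1) == 0;
}

// Returns the value the library would see, or an empty string if unset.
std::string hdf5_file_locking_setting() {
    const char* value = getenv("HDF5_USE_FILE_LOCKING");
    return value ? std::string(value) : std::string();
}
```

Calling disable_hdf5_file_locking() at the top of main(), before the H5::H5File is constructed, would be the intended use. On a local ext4 file system this should not be needed.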


#5

Spot-on.

I’m not trying to blame you (Sebastian) for our goof-ups, but the configuration clearly says

set (ADD_BUILD_OPTIONS "${ADD_BUILD_OPTIONS} -DHDF5_BUILD_CPP_LIB:BOOL=OFF")

and it’s unclear where you’re picking up the C++ stuff.

As Dana says, the locking issue will go away by a combination of using a current version (1.10.7) and disabling file-locking. (The latter shouldn’t be necessary for a local file system unless you hacked your VFS layer/kernel in weird ways.)

Long story short: If you like C++ (I do too!) and want to use a thread-safe build of the HDF5 library, there are several fine choices available from the community, but stay away from the HDF5_BUILD_CPP_LIB option.

G.


#6

Hi @gheber, could you elaborate on this? I always assumed that building a C++ binding on top of a thread-safe C library would keep the C++ binding thread-safe as well.

Also, thread-safe does not mean concurrent. I would be surprised if the OP gets any performance boost from multithreaded dataset writes to the same file.


#7

I think what Gerd is referring to is the non-thread-safety of the official HDF5 C++ wrappers. There is no API wrapper lock in any of the HDF5 language wrappers or the high-level library. If you implement your own C++ wrapper and pay attention to thread safety in that, I would assume that should be fine.
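
A minimal sketch of what such a wrapper-level lock could look like, assuming one process-wide mutex that every wrapper method takes before it does anything else. The names hdf5_api_lock and LockedDataset are hypothetical, and a std::vector stands in for the real dataset so the sketch runs without HDF5.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// One process-wide lock shared by every wrapper object (hypothetical name).
inline std::mutex& hdf5_api_lock() {
    static std::mutex lock;
    return lock;
}

// Hypothetical wrapper around one extendible dataset. The vector is a
// stand-in for the dataset; real code would hold an H5::DataSet instead.
class LockedDataset {
public:
    // Extend by one element and write it as a single locked operation.
    void append(int value) {
        std::lock_guard<std::mutex> guard(hdf5_api_lock());
        // Real code would call H5::DataSet::extend and H5::DataSet::write here.
        const std::size_t new_size = rows_.size() + 1;
        rows_.resize(new_size);        // stands in for extend
        rows_[new_size - 1] = value;   // stands in for write
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(hdf5_api_lock());
        return rows_.size();
    }

private:
    std::vector<int> rows_;
};

// Lets `threads` threads append `per_thread` values each to one dataset
// and returns the final number of rows.
std::size_t append_from_threads(LockedDataset& dset, int threads, int per_thread) {
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&dset, per_thread] {
            for (int i = 0; i < per_thread; ++i) dset.append(i);
        });
    }
    for (auto& th : pool) th.join();
    return dset.size();
}
```

The point of the design is that the lock is taken in the wrapper, before any wrapper state is touched, so a thread cannot be interrupted halfway through wrapper code by another thread using the same wrapper.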


#8

Dana is again spot-on (that’s why we have him :smiley:)

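On whether a C++ binding built on a thread-safe C library is itself thread-safe: no. Your threads can be preempted while inside a non-thread-safe wrapper, before they enter the HDF5 library. The bad things that will eventually happen are on you.

On thread-safe not meaning concurrent: correct, for now. It’s a restriction of the current implementation.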

“HDF5 should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.” (Abelson & Sussman, 1987)

G.


#9

Thanks @gheber and @derobins for getting me on the right track. I finally got that part of my application to work. The solution to my problem was a simple mutex. I am sorry for any inconvenience.
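
For anyone finding this thread later, here is a rough sketch of the mutex approach, not my actual code. It assumes one queue and one dataset per worker, as in my first post, and a single mutex around every extend/write pair. ItemQueue, FakeFile, g_hdf5_mutex and run_workers are made-up names, and the vectors stand in for the real HDF5 file and datasets so that it runs without the library.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Minimal blocking queue feeding one worker (hypothetical helper type).
struct ItemQueue {
    std::mutex m;
    std::condition_variable cv;
    std::queue<int> items;
    bool closed = false;

    void push(int v) {
        { std::lock_guard<std::mutex> g(m); items.push(v); }
        cv.notify_one();
    }
    void close() {
        { std::lock_guard<std::mutex> g(m); closed = true; }
        cv.notify_one();
    }
    // Blocks until an item arrives; returns false once closed and drained.
    bool pop(int& out) {
        std::unique_lock<std::mutex> g(m);
        cv.wait(g, [this] { return closed || !items.empty(); });
        if (items.empty()) return false;
        out = items.front();
        items.pop();
        return true;
    }
};

// Stand-in for the shared file: plain state with no locking of its own.
struct FakeFile {
    std::vector<std::vector<int>> datasets;
    std::size_t writes = 0;
};

std::mutex g_hdf5_mutex;  // serializes everything that touches the file

void worker(ItemQueue& q, FakeFile& file, std::size_t dset_index) {
    int item;
    while (q.pop(item)) {
        std::lock_guard<std::mutex> guard(g_hdf5_mutex);
        std::vector<int>& dset = file.datasets[dset_index];
        dset.resize(dset.size() + 1);  // stands in for dset.extend(dims)
        dset.back() = item;            // stands in for dset.write(...)
        ++file.writes;
    }
}

// Starts one worker per queue, feeds each queue, and returns total writes.
std::size_t run_workers(std::size_t workers, int items_per_worker) {
    FakeFile file;
    file.datasets.resize(workers);  // "datasets" exist before threads start
    std::vector<ItemQueue> queues(workers);
    std::vector<std::thread> pool;
    for (std::size_t i = 0; i < workers; ++i)
        pool.emplace_back(worker, std::ref(queues[i]), std::ref(file), i);
    for (std::size_t i = 0; i < workers; ++i) {
        for (int v = 0; v < items_per_worker; ++v) queues[i].push(v);
        queues[i].close();
    }
    for (auto& th : pool) th.join();
    return file.writes;
}
```

The lock is held for the whole extend-then-write pair, so the writes run one after another rather than in parallel, which matches what was said above about thread-safe not meaning concurrent.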

Best regards,

S.


#10

We are glad to hear that. No inconvenience at all. That’s why we are here. G.


#11

No problem. Glad it’s working for you.