External link and file locking disabled issue

Hi,

We ran into unexpected behavior when opening an HDF5 file with file locking disabled and then accessing an external link: the file that the external link points to is opened with file locking enabled.

We expected that, by default, files reached through external links would be opened with the same file locking setting as the main file, i.e., that file locking would be “inherited”, as is the case for the read/write mode.

This script reproduces the issue: accessing data in the main file works, but accessing data in the externally linked file fails:
link_locking_issue.py (757 Bytes)

The main file is opened by passing locking=False to h5py.File (which calls H5Pset_file_locking to disable file locking in the file access property list passed to open the file).
h5py then opens the external link with a default file access property list, and therefore with file locking enabled:
See https://github.com/h5py/h5py/blob/6b512e5edf80f6660e0f07c704ab59faa733008b/h5py/_hl/group.py#L357 and https://github.com/h5py/h5py/blob/6b512e5edf80f6660e0f07c704ab59faa733008b/h5py/_hl/base.py#L129-L135

I’m raising this issue here rather than in the h5py repository, because even if this might be fixable in h5py, I was expecting that libhdf5 would at least by default inherit the file access property list (or some of it) from the main file when accessing an external link (related to External links & VFDs).

To try and propose a fix for this issue to h5py, I looked at retrieving the file locking value from the opened file id’s access property list in order to use it for opening external links.
However, it seems the file locking value is not set in the access property list retrieved from the opened file. This script reproduces the issue:

fapl_locking_issue.py (513 Bytes)

Any idea how to tackle this issue? What is the proper way to handle file access property lists?

Best,
Thomas

Post copied to External link and file-locking disabled issue · Issue #4011 · HDFGroup/hdf5 · GitHub

Hi @thomas.vincent,

I believe this is mostly an oversight and the library should be fixed so that files opened through external links inherit the file locking settings from the parent file when a default FAPL is used. I have a branch now that does this, however a few things I noted:

  • When I run your first example, I actually get a “file locking flag values don’t match” failure when opening the main file with locking=False during the read() method. This is because the __main__ method already has the main file open in append mode with the default file locking setting of on (at least on my machine and with my default build of HDF5). Did you otherwise have file locking in HDF5 disabled through the environment variable or the library configure/build time option?

  • While in my branch the library will now cause the file locking setting to be inherited when a default FAPL is used during opening of files through external links, from your second link h5py appears to be creating a FAPL so that it can call set_fclose_degree on it before setting it on the LAPL with set_elink_fapl. In this case, the library will not cause the external file to inherit the parent file’s locking setting since the FAPL used isn’t a default FAPL. That said, with my changes you should be able to properly retrieve the setting from the parent file’s access property list (your second example works correctly with my branch) and set that on the FAPL as well before it gets set on the LAPL with set_elink_fapl.

Hi @jhenderson ,

Thanks for your answer!

When I run your first example, I actually get a “file locking flag values don’t match”

I tested it on macOS. On Linux, I get the same issue as you.
This is due to the way Python multiprocessing starts processes by default: “spawn” on macOS but “fork” on Linux… Please check with this version of the script, which uses spawn on all platforms: link_locking_issue.py (800 Bytes)
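
For reference, forcing the “spawn” start method independently of the platform default can be done with a multiprocessing context. The sketch below is not the attached script; the helper name make_spawn_process is made up for illustration.

```python
import multiprocessing


def make_spawn_process(target, *args):
    """Create (but do not start) a process that uses the "spawn" start method.

    Hypothetical helper: a spawned child starts from a fresh interpreter
    rather than from a forked copy of the parent process.
    """
    ctx = multiprocessing.get_context("spawn")
    return ctx.Process(target=target, args=args)
```

In a script, the process should be created and started under an `if __name__ == "__main__":` guard, since “spawn” re-imports the main module in the child.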

the file locking setting to be inherited when a default FAPL is used during opening of files through external links

Sounds good, thanks!

you should be able to properly retrieve the setting from the parent file

Great, if it’s not possible for h5py to use a default FAPL, then this will be really useful.

BTW, let me know if/when your branch with the fixes is publicly available, so I can check what can be done on the h5py side with it.

Best,
Thomas

Hi @thomas.vincent ,

The fix that at least lets you retrieve the file locking settings from a file’s FAPL was merged into the develop branch in two pull requests to HDFGroup/hdf5, both by jhendersonHDF: “Fix H5F_get_access_plist to copy file locking settings” (Pull Request #4030) and “Fix issue with FAPL file locking setting inheriting test” (Pull Request #4053). The fix should be in the 1.14.4 release, currently targeted for the end of this month, but you can also test with a source build of HDF5 from the develop branch.