Hi Greg,
It looks like flock(2) is available as an API call, but is not implemented for that file system, so it returns a failure code. The HDF5 library only inspects the flock() return value and not errno, so we just note the failure and our API call fails in turn.
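To make the return-value-versus-errno distinction concrete, here is a minimal C sketch. It is not HDF5 code and not the planned fix. The names try_lock_result, classify_flock, and probe_flock are hypothetical, made up for this example. It assumes a POSIX system where flock() sets errno to ENOSYS ("Function not implemented") when the file system does not support it.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; these are not HDF5 names. */
enum try_lock_result {
    TRY_LOCK_OK = 0,          /* lock acquired */
    TRY_LOCK_UNSUPPORTED = 1, /* flock() is not implemented on this file system */
    TRY_LOCK_FAILED = -1      /* any other failure, e.g. file already locked */
};

/* Classify a flock() outcome using both its return value and errno. */
enum try_lock_result classify_flock(int ret, int err)
{
    if (ret == 0)
        return TRY_LOCK_OK;
    if (err == ENOSYS)
        return TRY_LOCK_UNSUPPORTED;
    return TRY_LOCK_FAILED;
}

/* Open (or create) the file at path, try an exclusive non-blocking lock,
 * and report what happened. The file itself is left in place. */
enum try_lock_result probe_flock(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return TRY_LOCK_FAILED;

    int ret = flock(fd, LOCK_EX | LOCK_NB);
    int err = (ret == 0) ? 0 : errno; /* save errno before close() can change it */

    close(fd); /* closing the descriptor also releases the lock */
    return classify_flock(ret, err);
}
```

Calling probe_flock() on a path in the affected directory would be one way to confirm, without HDF5 in the picture, whether flock() is simply unimplemented there (TRY_LOCK_UNSUPPORTED) or failing for some other reason.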
Just out of curiosity, is this a Lustre file system? I've heard that the overhead for locking is high, so admins often disable it.
Unfortunately, there is no work-around for the file-locking calls in either HDF5 1.10.0 or 1.10.0-patch1 aside from modifying the source. You are also not the only person tripping over file locking in situations where it is unnecessary or unwanted.
For the very short term, I'm considering putting a source patch on our website that will disable the file locking. You'll have to apply the patch and build the library yourself, but this would fix your problem. Let me check into how to best accomplish this and I'll shoot for getting this out next week sometime.
Our current plan to really fix the issue is to start by generating an RFC describing the issue and our proposed solutions. After a brief period for comments, we'll implement the changes for HDF5 1.10.1, which should be released in the very near future (mid-summer, I believe). Before the release date you'll be able to use a snapshot to get the functionality. Since this is a problem that affects several users, I'm going to be keen on getting this into a snapshot ASAP so hopefully you won't have to wait long for official functionality that addresses your problem.
Dana Robinson
Software Engineer
The HDF Group
-----Original Message-----
From: Hdf-forum [mailto:hdf-forum-bounces@lists.hdfgroup.org] On Behalf Of Greg Werner
Sent: Thursday, June 2, 2016 11:29 AM
To: hdf-forum@lists.hdfgroup.org
Subject: [Hdf-forum] h5fcreate 1.10 unable to lock
The metadata-related changes in hdf5 1.10 have made it possible for my (massively parallel) simulation code to restart from a checkpoint.
However, h5fcreate fails to open a file in serial (on BG/Q). In a minimal test program, a simple h5fcreate call (in a serial program running on 1 processor) results in an error (in contrast, with hdf5 1.8.10, the file is successfully created):
CALL h5fcreate_f(fileName, H5F_ACC_EXCL_F, fileId, h5err)
fails to create the file (it has zero size), and yields the errors:
HDF5-DIAG: Error detected in HDF5 (1.10.0) thread 0:
  #000: H5F.c line 491 in H5Fcreate(): unable to create file
    major: File accessibilty
    minor: Unable to open file
  #001: H5Fint.c line 1168 in H5F_open(): unable to lock the file or initialize file structure
    major: File accessibilty
    minor: Unable to open file
  #002: H5FD.c line 1821 in H5FD_lock(): driver lock request failed
    major: Virtual File Layer
    minor: Can't update object
  #003: H5FDsec2.c line 939 in H5FD_sec2_lock(): unable to flock file, errno = 38, error message = 'Function not implemented'
    major: File accessibilty
    minor: Bad file ID accessed
Tried to create file, err = -1
HDF5-DIAG: Error detected in HDF5 (1.10.0) thread 0:
  #000: H5F.c line 749 in H5Fclose(): not a file ID
    major: Invalid arguments to routine
    minor: Inappropriate type
Creating files in parallel does not seem to be a problem.
Is there a work-around? Even in the full simulation code, this is a straightforward serial open/read/write by a single process; there is no danger of multiple readers, etc.
Might this issue be fixed by the recent patch to 1.10?
Thanks,
Greg.