Hi Ger,

On Thursday 11 September 2008, Ger van Diepen wrote:
> Hi Francesc, Dimitris, Quincey,
>
> The type of transactions determines the type of locking you want to
> do. Databases typically have small transactions, so they need
> fine-grained locking; hold the lock for a short period of time and
> hold the lock for only that part of the database that needs to be
> changed.
>
> I think HDF5 does not fall into this category. Usually a lot of data
> is read or written, so a lock on the entire file is fine. Furthermore
> a lock is held for a longer time period, so the overhead of having to
> close and reopen the file can be acceptable.

Yeah, this is my impression too.
> It is, however, somewhat cumbersome that you also have to close and
> reopen all the groups, datasets, etc, so it would be nice if you
> could use lock/unlock instead of having to open and close the file.
> But I fear there is not much you can do about that. You just cannot
> be sure that another process did not change the data structures in
> the file unless HDF5 uses some clever (but probably very hard to
> implement) schemes.

Cumbersome? In what sense? For example, PyTables keeps track of all its
opened nodes (they live in its own internal metadata LRU cache), and
when the user asks to close the file, all the opened nodes (groups,
datasets) are closed automatically (at both the PyTables and HDF5
levels). I don't know about HDF5, but if it doesn't do the same, that
would be a handy thing to implement (bar side effects that I don't see
right now).
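For what it's worth, HDF5's inquiry API already lets an application
approximate this by hand. A rough sketch, error checking omitted; the
function name is my own invention, and this is just the idea, not
PyTables' actual code:

    #include <stdlib.h>
    #include <hdf5.h>

    /* Close every group, dataset, named datatype and attribute still
     * open in a file, so that the subsequent H5Fclose() really
     * releases everything (roughly what PyTables' node cache does). */
    static void close_all_objects(hid_t file_id)
    {
        unsigned types = H5F_OBJ_GROUP | H5F_OBJ_DATASET |
                         H5F_OBJ_DATATYPE | H5F_OBJ_ATTR;
        ssize_t count = H5Fget_obj_count(file_id, types);
        if (count <= 0)
            return;

        hid_t *objs = malloc((size_t)count * sizeof(hid_t));
        H5Fget_obj_ids(file_id, types, (size_t)count, objs);
        for (ssize_t i = 0; i < count; i++) {
            switch (H5Iget_type(objs[i])) {
                case H5I_GROUP:    H5Gclose(objs[i]); break;
                case H5I_DATASET:  H5Dclose(objs[i]); break;
                case H5I_DATATYPE: H5Tclose(objs[i]); break;
                case H5I_ATTR:     H5Aclose(objs[i]); break;
                default:           break;
            }
        }
        free(objs);
    }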
> Maybe Francesc and Dimitris can explain what kind of lock granularity
> they would like to have and what scenarios they are thinking of. I
> can imagine that Francesc would like some finer grained locking for
> the PyTables.
> One must also consider the overhead in doing unnecessary unlocking.
> I.e. if a process only does lock/unlock because there might be
> another process accessing the file, you may do a lot of unnecessary
> flushing.

I'm mainly looking into the locking functionality because a user
requested it: http://www.pytables.org/trac/ticket/185

And well, locking at file level would be enough for the time being, yes.
More fine-grained locking would require direct HDF5 support for this,
and I am afraid that that would imply too many changes to it.
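For the record, the pattern that does work is to take the lock on a
*separate* lock file before H5Fopen() and to release it only after
H5Fclose(), so that each process always starts from fresh metadata. A
minimal sketch using POSIX fcntl() locks (the function names and file
names are illustrative only, and error handling is omitted):

    #include <fcntl.h>
    #include <unistd.h>
    #include <hdf5.h>

    /* Take an exclusive whole-file lock on a sidecar lock file
     * *before* opening the HDF5 file, so that no stale metadata can
     * ever end up in this process's cache. */
    hid_t locked_open(const char *h5name, const char *lockname,
                      int *lockfd)
    {
        struct flock fl = {0};
        fl.l_type = F_WRLCK;       /* exclusive lock */
        fl.l_whence = SEEK_SET;    /* l_start = l_len = 0: whole file */

        *lockfd = open(lockname, O_CREAT | O_RDWR, 0644);
        fcntl(*lockfd, F_SETLKW, &fl);  /* blocks until granted */

        return H5Fopen(h5name, H5F_ACC_RDWR, H5P_DEFAULT);
    }

    /* Close the HDF5 file (flushing all metadata), then drop the
     * lock; closing the descriptor releases the fcntl() lock. */
    void locked_close(hid_t file_id, int lockfd)
    {
        H5Fclose(file_id);
        close(lockfd);
    }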
> Note that file locking is supported over NFS, but AFAIK NFS does not
> fully guarantee that the remote cache is updated when a file gets
> changed.

Yeah, I don't know lately, but fighting with locking and NFS has always
been a difficult subject, to say the least.

Cheers,

Francesc

> Also note that Unix/Linux does not remove a file until all file
> handles accessing it are closed. So if one process deletes the file,
> the other one can still access it. I don't know about Windows.
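Indeed; that POSIX behaviour is easy to demonstrate with a few lines of
C (the file name here is made up):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("data.h5", O_RDONLY); /* we already hold a handle */
        unlink("data.h5");                  /* "delete" the file */

        /* The directory entry is gone, but the inode lives on until
         * the last open descriptor is closed, so reads still work. */
        char buf[8];
        ssize_t n = read(fd, buf, sizeof buf);
        printf("read %zd bytes after unlink\n", n);
        close(fd);
        return 0;
    }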
>
> Cheers,
> Ger
>
> >>> "Dimitris Servis" <servisster@gmail.com> 09/10/08 8:16 PM >>>
>
> Hi Quincey,
>
>
> > Hi Dimitris,
> >
> >> Quincey,
> >>
> >> On Wednesday 10 September 2008, you wrote:
> >> > Hi Francesc,
> >> >
> >> > > On Tuesday 09 September 2008, Francesc Alted wrote:
> >> > >> On Tuesday 09 September 2008, Quincey Koziol wrote:
> >> > >> [clip]
> >> > >>
> >> > >>>>> You are fighting the metadata cache in HDF5. Unfortunately
> >> > >>>>> there's currently no way to evict all the entries from the
> >> > >>>>> cache, even if you call H5Fflush(), so it's very likely that
> >> > >>>>> one or more of the processes will be dealing with stale
> >> > >>>>> metadata. I've added a new feature request to our bugzilla
> >> > >>>>> database and maybe we'll be able to act on it at some point.
>
> >> > >>>> I see. At any rate, I find it curious that locking using a
> >> > >>>> regular file works flawlessly in the same scenario.
>
> >> > >>> Locking using a regular file works because you are closing &
> >> > >>> re-opening the HDF5 file for each process (which flushes all
> >> > >>> the metadata changes to the file on closing and re-reads them
> >> > >>> on re-opening the file).
>
> >> > >> So, when using the HDF5 file itself for locking, as the lock
> >> > >> process happens after the library has already opened the file,
> >> > >> it has already read bits from the stale metadata cache. Now I
> >> > >> definitely see it.
>
> >> > > Hmm, not quite. After thinking a bit more on this issue, I
> >> > > think now that the problem is not in the metadata cache, but it
> >> > > is a more fundamental one: I'm effectively opening a file (and
> >> > > hence, reading metadata, either from cache or from disk)
> >> > > *before* locking it, and that will always lead to wrong
> >> > > results, regardless of an existing cache or not.
> >> > >
> >> > > I can devise a couple of solutions for this. The first one is
> >> > > to add a new parameter to H5Fopen() to inform it that we want
> >> > > to lock the file as soon as the file descriptor is allocated
> >> > > and before reading any meta-information (either from disk or
> >> > > cache), but that implies an API change.
> >> > >
> >> > > The other solution is to increase the laziness of the process
> >> > > of reading the metadata until it is absolutely needed by other
> >> > > functions. So, in essence, H5Fopen() should basically only have
> >> > > to open the underlying file descriptor and that's all; then
> >> > > this descriptor can be manually locked and the file metadata
> >> > > should be read later on, when it is really needed.
> >> > >
> >> > > All in all, both approaches seem to need too many changes in
> >> > > HDF5. Perhaps a better avenue is to find alternatives to do the
> >> > > locking on the application side instead of including the
> >> > > functionality in HDF5 itself.
>
> >> > Those are both interesting ideas that I hadn't thought of. What
> >> > I was thinking was to evict all the metadata from the cache and
> >> > then re-read it from the file. This could be done at any point
> >> > after the file was opened, although it would require that all
> >> > objects in the file be closed when the cache entries were
> >> > evicted.
>
> >> Well, I suppose that my ignorance of the internals of HDF5 is
> >> preventing me from understanding your solution. Let's suppose that
> >> we have 2 processes on a multi-processor machine. Let's call them
> >> process 'a' and process 'b'. Both processes do the same thing: from
> >> time to time they open a HDF5 file, lock it, write something on it,
> >> and close it (unlocking it).
> >>
> >> If process 'a' gets the lock first, then process 'b' will *open*
> >> the file and will block until the file becomes unlocked. While
> >> process 'b' is waiting, process 'a' writes a bunch of data in the
> >> file. When 'a' finishes the writing and unlocks the file, then
> >> process 'b' unblocks and gets the lock. But, by then (and this is
> >> the main point), process 'b' has already got internal information
> >> about the opened file that is outdated.
> >>
> >> The only way that I see to avoid the problem is that the
> >> information about the opened file in process 'b' would exclusively
> >> reside in the metadata cache; so by refreshing it (or evicting it)
> >> the new processes can get the correct information. However, that
> >> solution does imply that the HDF5 metadata cache is to be *shared*
> >> between both processes, and I don't think this would be the case.
> >>
> >> Hi Francesc,
> >>
> >> I have similar issues and I think you're right when you say that
> >> this should be solved at the application layer. It is pretty
> >> difficult, when the library cannot manage its own space, to have
> >> efficient locking. What if, for example, the user deletes the file?
> >> Or another process wants to move the file? For the moment I think
> >> it is difficult to deal with this effectively, so I will try to
> >> solve it with the old hack of the flag: when a process enters the
> >> root (in my case also other major nodes in the tree) I set an
> >> attribute, and the last thing the process does is to unset the
> >> attribute. This way I also know if there was an issue and writing
> >> failed.
> >
> > Setting a flag in the file is not sufficient. It's easy to imagine
> > race conditions where two processes simultaneously check for the
> > presence of the flag, determine it doesn't exist and set it, then
> > proceed to modify the file. Some other mechanism which guarantees
> > exclusive access must be used. (And even then, you'll have to use
> > the cache management strategies I mentioned in an earlier mail.)
> >
> > Note that we've given this a fair bit of thought at the HDF Group
> > and have some good solutions, but we would need to get
> > funding/patches for this to get into the HDF5 library.
> >
> > Quincey
>
> I know it is not sufficient, but locking will work only at the
> application level, and AFAIK a portable solution will try to use
> whole-file locks or separate lock files, but that is done at a
> different level than the HDF lib. For a process to decide what to do,
> it has to check the locks and the special attributes. Note also that
> linking HDF5 statically means each process' cache is different anyway.
>
> I am sure you've given it a good deal of thought, but for the benefit
> of having single files and no services/daemons, efficient file locking
> is sacrificed and becomes cumbersome.
>
> Best Regards,
>
> -- dimitris

Hi Ger, Francesc,

I also agree that the transaction scheme is what Ger describes and
definitely agree that this is actually the way to go. I cannot think of
a way that HDF5 could deal with the complexity of locking a file
without FS locks, but maybe it's just me. In my case I allow only one
process to write to the file. This process has to acquire the lock
first and set an attribute on the particular object. Other processes
can read, but at least they know that some objects are being changed at
the time, and the application can decide to read the current state or
wait. This also gives me an indication of whether a transaction was
completed successfully, as I only support a finite number of write
transactions through my top-level API.

Other ways could include lock files (but I like depending on a single
file) or diff files (but I have large datasets). And in general I (as I
guess most other users of HDF5) have few and large transactions; even
if I could write them in parallel, they would after all be serialized
when writing to disk, so overall performance would be more or less the
same and not worth the effort of implementing something more complex.
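To make that a bit more concrete, here is a rough sketch of how the
single-writer part of such a scheme could look. The O_CREAT|O_EXCL
lock file provides the atomic check-and-set that, as Quincey notes, a
plain attribute flag lacks (though O_EXCL is historically unreliable
over older NFS). All names here are illustrative, not a real API:

    #include <fcntl.h>
    #include <unistd.h>
    #include <hdf5.h>

    /* Atomically become the single writer: O_CREAT|O_EXCL either
     * creates the lock file for exactly one process or fails, so
     * there is no check-then-set race on a flag inside the file. */
    int acquire_writer_lock(const char *lockname)
    {
        int fd = open(lockname, O_CREAT | O_EXCL | O_WRONLY, 0644);
        if (fd < 0)
            return -1;            /* another process is writing */
        close(fd);
        return 0;
    }

    void release_writer_lock(const char *lockname)
    {
        unlink(lockname);
    }

    /* Flag an object as "being modified" so readers can choose to
     * wait; a flag still present after a crash also reveals that a
     * write transaction never completed. */
    void set_in_progress_flag(hid_t obj_id)
    {
        hid_t space = H5Screate(H5S_SCALAR);
        hid_t attr = H5Acreate2(obj_id, "write_in_progress",
                                H5T_NATIVE_INT, space,
                                H5P_DEFAULT, H5P_DEFAULT);
        int flag = 1;
        H5Awrite(attr, H5T_NATIVE_INT, &flag);
        H5Aclose(attr);
        H5Sclose(space);
    }

    void clear_in_progress_flag(hid_t obj_id)
    {
        H5Adelete(obj_id, "write_in_progress");
    }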
Thanks a lot!
-- dimitris