File locking revisited

Hi all,

after implementing a nice file locking mechanism that locks on separate
lock files instead of the HDF5 file itself, and that correctly coordinates
access between computers on shared drives, I realized that this mechanism
is actually not much good... the reason being that I cannot rule out the
use case where a user wants to read a file but has no write permission on
the directory. Obviously, without write permission on the directory one
cannot create a lock file just in order to read, which is questionable in
its own right. I therefore plan to move away from the separate lock file.
I am reluctant to lock the HDF5 file itself because, on Linux especially,
a fcntl lock placed on a handle to a file that the HDF5 library has open
will be released as soon as the library closes, or releases a lock on, one
of its own handles to that file. I do not know how likely that is in
practice. Another question is whether it would be fine to lock the file
from a new file handle, opened with system functions for exactly this
purpose, while the file is already open in HDF5. Any thoughts on that
would be highly appreciated.
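
To make the question concrete, this is roughly what I mean by locking from a
separate handle (POSIX only and just a sketch; the pitfall above still
applies, because the lock belongs to the process rather than to this
particular descriptor):

#include <fcntl.h>
#include <unistd.h>

/* Sketch: take a shared advisory lock on the whole file from a second
 * descriptor, opened independently of the HDF5 library. On Linux the lock
 * belongs to the process, so it is lost if *any* descriptor to this file
 * is closed, including one closed by the HDF5 library itself. */
static int lock_for_reading(const char *path)   /* path is illustrative */
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct flock fl = {0};
    fl.l_type   = F_RDLCK;    /* shared read lock       */
    fl.l_whence = SEEK_SET;
    fl.l_start  = 0;
    fl.l_len    = 0;          /* 0 means the whole file */

    if (fcntl(fd, F_SETLKW, &fl) < 0) {   /* block until granted */
        close(fd);
        return -1;
    }
    return fd;   /* keep it open; closing it releases the lock */
}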

Another idea that came to me while going over my scratchpad again is this:
what I want is for multiple processes to be able to read from the file
reliably, just not to read what another process is currently writing. For
example, some processes want to read the datasets /foo/bar/one and
/foo/bar/two while another process is writing /foo/bar/three. In my case,
datasets one and two are complete and there is no chance of any process
wanting to write there again, so I suppose they can safely be read while a
process is writing something else. The only danger is the case where
reading processes see from the metadata that all three datasets exist, but
"three" is still being written. I could place an attribute on "three"
indicating that it is still being written. The drawback is that the
attribute has to be removed afterwards and every read has to start by
checking it. I also cannot exclude the case where a process is reading
dataset "three", which did not carry the special attribute when the read
started, while another process decides to continue writing to it. Maybe
that is safe?
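
The marker I have in mind would look roughly like this (only a sketch; the
attribute name is arbitrary, and it does nothing about the case where
writing resumes after the check):

#include "hdf5.h"

/* Sketch: mark a dataset as "still being written" with a scalar attribute.
 * Error handling is omitted for brevity. */
static void mark_incomplete(hid_t dset)
{
    hid_t space = H5Screate(H5S_SCALAR);
    hid_t attr  = H5Acreate2(dset, "incomplete", H5T_NATIVE_INT,
                             space, H5P_DEFAULT, H5P_DEFAULT);
    int one = 1;
    H5Awrite(attr, H5T_NATIVE_INT, &one);
    H5Aclose(attr);
    H5Sclose(space);
}

static void mark_complete(hid_t dset)
{
    H5Adelete(dset, "incomplete");               /* remove the marker when done */
}

static int is_safe_to_read(hid_t dset)
{
    return H5Aexists(dset, "incomplete") == 0;   /* marker absent => complete */
}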

A final approach is this: start writing the new collection of data somewhere
else, for example under /tmp/hostname/process/thread/timestamp/three. The
writing process can work there for as long as it wants while other processes
open and close the file, or even keep it open and read in other areas. When
the writer finishes, it links /foo/bar/three to
/tmp/hostname/process/thread/timestamp/three in one (hopefully atomic)
operation. Only when processes open or reopen the file will they see that
there is a new dataset, and they can read it safely while the writer moves
on to its next task. If this works, it seems to me the safest and most
portable idea. The risk is that if the writing process goes down, you are
left with some garbage in /tmp...
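
In HDF5 terms the last step could look roughly like the sketch below,
assuming the /tmp/... path above is a group hierarchy inside the same HDF5
file; whether this is really atomic from a reader's point of view is exactly
my open question:

#include "hdf5.h"

/* Sketch: publish a finished dataset by linking it into its final location
 * and dropping the scratch link. Paths are the ones from the example above;
 * error handling omitted. */
static void publish(hid_t file)
{
    H5Lcreate_hard(file, "/tmp/hostname/process/thread/timestamp/three",
                   file, "/foo/bar/three",
                   H5P_DEFAULT, H5P_DEFAULT);

    /* the object stays reachable through /foo/bar/three */
    H5Ldelete(file, "/tmp/hostname/process/thread/timestamp/three",
              H5P_DEFAULT);
}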

Any other ideas/suggestions are mostly welcome.

Thanks a lot!

-- dimitris

Hi Dimitris,

  just a quick thought. One major issue with one writer and many concurrent
readers will certainly be the internal caching of the writer process. The
writer might need to flush the HDF5 caches at each atomic operation, which
makes writing quite inefficient (if it is sufficient to achieve concurrency
at all). Maybe it would make sense to explore virtual file drivers other
than the default one, such as the split file driver, which places metadata
in one file and raw data in another while both still form one logical HDF5
file. Each of these physical files could then be locked independently with
OS functions. I have never tried that driver myself, however.
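
Setting that up would be something along these lines (an untested sketch on
my side; the file name and extensions are made up):

#include "hdf5.h"

/* Untested sketch: open a file through the split driver so that metadata
 * and raw data live in two physical files ("results.meta" / "results.raw"
 * here), which the OS could then lock separately. */
hid_t open_split(const char *base)   /* e.g. "results" */
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_split(fapl, ".meta", H5P_DEFAULT,    /* metadata file */
                            ".raw",  H5P_DEFAULT);   /* raw data file */
    hid_t file = H5Fopen(base, H5F_ACC_RDONLY, fapl);
    H5Pclose(fapl);
    return file;
}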

  Werner

--
___________________________________________________________________________
Dr. Werner Benger Visualization Research
Laboratory for Creative Arts and Technology (LCAT)
Center for Computation & Technology at Louisiana State University (CCT/LSU)
211 Johnston Hall, Baton Rouge, Louisiana 70803
Tel.: +1 225 578 4809 Fax.: +1 225 578-5362

Hi Werner,

thanks a lot for your thoughts! Actually, AFAIK the writer not only has to
flush but actually has to close the file to make sure everything is written
out properly. At least that was the case as far as I remember.

Using the split file driver could be a solution, but one of my major
non-functional requirements is to have a single file that the user can move
around.

The problem with concurrent reading and writing only arises when the reader
can see what the writer is currently writing. If the reader is only
interested in complete datasets, which are guaranteed either to exist in
full or not at all, that is fine for me.

Thanks again!

-- dimitris

Hi Dimitris,

  I don't know about your last question (about locking the file from a new file handle), but I wanted to comment that HDF5 doesn't do any fcntl() operations on the file handle itself (currently), so adding your own locks should be OK.
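
  For the default (sec2) driver you could even get at the descriptor HDF5 has open, via H5Fget_vfd_handle(), and lock that directly; an untested sketch, with the usual fcntl() caveat that the lock belongs to the process and is dropped if any descriptor to the file is closed:

#include "hdf5.h"
#include <fcntl.h>
#include <unistd.h>

/* Untested sketch: place an advisory write lock on the descriptor that
 * the default (sec2) driver has open for this file. */
int lock_hdf5_file(hid_t file_id)
{
    void *handle = NULL;
    if (H5Fget_vfd_handle(file_id, H5P_DEFAULT, &handle) < 0)
        return -1;
    int fd = *(int *)handle;          /* sec2 driver: handle is an int* */

    struct flock fl = {0};
    fl.l_type   = F_WRLCK;            /* exclusive lock for the writer  */
    fl.l_whence = SEEK_SET;           /* whole file: start 0, length 0  */
    return fcntl(fd, F_SETLK, &fl);   /* fails instead of blocking      */
}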

The only danger is the case where reading processes see from the metadata that all three datasets exist, but "three" is still being written. I could place an attribute on "three" indicating that it is still being written. [...] I also cannot exclude the case where a process is reading dataset "three", which did not carry the special attribute when the read started, while another process decides to continue writing to it. Maybe that is safe?

  This is not safe currently, but we are working on adding "single-writer/multiple-reader" (SWMR) access to HDF5 files that doesn't require locks, as long as the file system provides POSIX consistency semantics (ordered, atomic file read & write operations). Hopefully this will be available for all operations on an HDF5 file in the 1.10.0 release, but it might be limited to a subset of operations, we'll see... As you mention, this may still require some actions by the reader in order to manage its view of the metadata for an object, but it should greatly ease the situation.

  Quincey

Hi Dimitris,

On Thursday 12 November 2009 08:29:57, Dimitris Servis wrote:
[clip]

A final approach is this: start writing the new collection of data somewhere
else, for example under /tmp/hostname/process/thread/timestamp/three. [...]
When the writer finishes, it links /foo/bar/three to
/tmp/hostname/process/thread/timestamp/three in one (hopefully atomic)
operation. [...] The risk is that if the writing process goes down, you are
left with some garbage in /tmp...

Yeah. I think this latter option would be the safest. You don't even need to
link; just move (H5Gmove2) the dataset to the correct place. And well, if
the writing process goes down, I think some garbage in /tmp would be the
least of your problems :wink:
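
Something like this, I mean (a sketch; H5Lmove() is the 1.8 replacement for
H5Gmove2(), and the paths are the ones from your example):

#include "hdf5.h"

/* Sketch: rename the finished dataset from the scratch area to its final
 * location inside the same HDF5 file. Error handling omitted. */
void publish_by_move(hid_t file)
{
    H5Lmove(file, "/tmp/hostname/process/thread/timestamp/three",
            file, "/foo/bar/three",
            H5P_DEFAULT, H5P_DEFAULT);
}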

You probably have considered it already, but another possibility is to use
OPeNDAP (http://opendap.org) and let it be in charge of all the locking
problems.

Hope that helps,

--
Francesc Alted

On Thursday 12 November 2009 08:47:50, Werner Benger wrote:

  just a quick thought. One major issue with one writer and many concurrent
readers will certainly be the internal caching of the writer process.

Oops, you are right.

The writer might need to flush the HDF5 caches at each atomic operation,
which makes writing quite inefficient (if it is sufficient to achieve
concurrency at all).

Another possibility is to force the *reading* processes to close and re-open
the file each time they have to access data in the file. That way they can
avoid the cache pitfalls.
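
That is, readers would do something like the following for every access
(just a sketch; the file and dataset names are placeholders):

#include "hdf5.h"

/* Sketch: a reader that re-opens the file before every access, so it never
 * works from a stale metadata cache. The caller provides a large enough
 * buffer for the (placeholder) dataset. */
int read_latest(const char *filename, const char *dset_name, double *buf)
{
    hid_t file = H5Fopen(filename, H5F_ACC_RDONLY, H5P_DEFAULT);
    if (file < 0)
        return -1;                               /* try again later */

    hid_t dset = H5Dopen2(file, dset_name, H5P_DEFAULT);
    if (dset < 0) {
        H5Fclose(file);
        return -1;
    }
    herr_t status = H5Dread(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL,
                            H5P_DEFAULT, buf);
    H5Dclose(dset);
    H5Fclose(file);                              /* close again right away */
    return status < 0 ? -1 : 0;
}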

If re-opening is not possible, then I suppose the best solution would be to
go to a client/server approach (via either OPeNDAP or a home-grown server).

--
Francesc Alted

Hi Dimitris,

  just wondering whether, in this context, the Streaming VFD could be of
help. The writer would send its data not to a file but to a server process,
which then also serves the clients over a TCP stream instead of file access.
Physically one file, served over the network to many readers.

  Werner

--
___________________________________________________________________________
Dr. Werner Benger Visualization Research
Laboratory for Creative Arts and Technology (LCAT)
Center for Computation & Technology at Louisiana State University (CCT/LSU)
211 Johnston Hall, Baton Rouge, Louisiana 70803
Tel.: +1 225 578 4809 Fax.: +1 225 578-5362

Hi Quincey,

thanks for the response.

I don't know about your last question (about locking the file from a new file handle), but I wanted to comment that HDF5 doesn't do any fcntl() operations on the file handle itself (currently), so adding your own locks should be OK.

I'll try that and let you know.

This is not safe currently, but we are working on adding "single-writer/multiple-reader" (SWMR) access to HDF5 files that doesn't require locks, as long as the file system provides POSIX consistency semantics (ordered, atomic file read & write operations). [...] As you mention, this may still require some actions by the reader in order to manage its view of the metadata for an object, but it should greatly ease the situation.

Yes, I recall a relevant discussion. Would it make sense to provide the
capability (maybe combined with the upcoming journaling capability?) to let
the user control the granularity of an atomic transaction? What I mean is:
if any number of readers have a view of the data, and that view is still
valid after the writer has written, the writer could write into a temporary
area and commit the changes in one atomic transaction. I am not sure, but I
would expect the use case of many readers continuously reading what a writer
writes to a chunked dataset to be less common than letting readers read only
once the transaction has completed... Anyway, thanks for the tips and the
update!

Best

-- dimitris

Hi Dimitris,

On Nov 12, 2009, at 3:29 PM, Dimitris Servis wrote:

Would it make sense to provide the capability (maybe combined with the upcoming journaling capability?) to let the user control the granularity of an atomic transaction? [...] I would expect the use case of many readers continuously reading what a writer writes to a chunked dataset to be less common than letting readers read only once the transaction has completed...

  Currently, we're planning on three granularities for updates to the file, when using SWMR access:
  - Incremental - "normal" metadata flushes & evictions, with no explicit action by an application. This is equivalent to the current metadata cache flush & evictions, just with the proper ordering of the metadata written to the file.

  - Per object - the application will have a way to specify that the HDF5 library flush all the cached metadata for a given HDF5 object

  - Whole file - calls to H5Fflush() will result in all the cached metadata being written to the file, as they do now, just in the proper order so that readers won't get confused.

  How's that sound?

  Quincey

Hi Quincey,

thanks for the response. It sounds good. My idea could be implemented using
per-object flushing I guess, but it could also be implemented on the library
side. I would like a pair of functions like

H5O_begin_transaction()
H5O_end_transaction()

where all the objects modified in between would be tracked (in an array of
object identifiers or the like) and flushed in one atomic transaction at the
end. I would prefer to have this on the library side, where it should be
significantly easier to satisfy the following:

1. The view that readers (in this or other processes) have of the file does
not change between the calls to these two functions.
2. H5O_end_transaction() is really (really) atomic: either all the updates
become visible or none do.

I'm not sure how that sounds to you, though, or whether it makes sense for
others too...
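
Just to illustrate the calling pattern I have in mind (neither function
exists in HDF5; the stand-in bodies below only sketch the intended semantics
with what is available today, and are of course not atomic at all, which is
exactly why I would prefer library support):

#include "hdf5.h"

/* Proposed API, sketched here as plain application code. A real library
 * implementation would defer and order the metadata updates; this stand-in
 * only batches one whole-file flush at the end and is NOT atomic. */
static hid_t H5O_begin_transaction(hid_t file_id)
{
    return file_id;                        /* nothing to do in the sketch */
}

static herr_t H5O_end_transaction(hid_t file_id)
{
    return H5Fflush(file_id, H5F_SCOPE_GLOBAL);
}

/* Intended usage: nothing written between begin and end becomes visible to
 * readers until end returns (in the real thing, that is). */
static void write_three(hid_t file, hid_t space, const double *data)
{
    H5O_begin_transaction(file);
    hid_t dset = H5Dcreate2(file, "/foo/bar/three", H5T_NATIVE_DOUBLE,
                            space, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);
    H5Dclose(dset);
    H5O_end_transaction(file);
}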

Thanks a lot!

-- dimitris

Quincey,

another question:

- Whole file - calls to H5Fflush() will result in all the cached metadata
being written to the file, as they do now, just in the proper order so that
readers won't get confused.

I recall your writing that "Note that it's not [currently] enough to just
flush the data from an open file when multiple writers are involved - the
file must actually be closed by one writer before being opened by another
writer." Is this still the case?

Thanks a lot!

Dimitris Servis

Hi Dimitris,

On Nov 16, 2009, at 2:24 PM, Dimitris Servis wrote:

It sounds good. My idea could be implemented using per-object flushing I guess, but it could also be implemented on the library side. I would like a pair of functions like H5O_begin_transaction() / H5O_end_transaction(), where all the objects modified in between would be tracked and flushed in one atomic transaction at the end. [...]

  Interesting idea, sort of a middle ground between the per-object and whole-file approaches. It doesn't fit any [funded] use cases, but I'll try to keep it in mind as we work on other related things.

  Quincey

Hi Dimitris,

On Nov 17, 2009, at 11:05 AM, Dimitris Servis wrote:

I recall your writing that "Note that it's not [currently] enough to just flush the data from an open file when multiple writers are involved - the file must actually be closed by one writer before being opened by another writer." Is this still the case?

  Writing to the file will still be limited to a single writer, yes.

    Quincey

Hi Quincey,

thanks for the response. It is clear that there will only be one writer at a
time. But what I concluded from your earlier text is that you also have to
close the file between writes, not just flush it. The reason I am asking is
that I have an odd case where writing variable-length arrays and reading
them back right afterwards works only if I do not close the file in between.
That is with 1.8.2, though; it might have been fixed in the meantime.

Thanks a lot

-- dimitris

Hi Dimitris,

But what I concluded from your earlier text is that you also have to close the file between writes, not just flush it. The reason I am asking is that I have an odd case where writing variable-length arrays and reading them back right afterwards works only if I do not close the file in between. That is with 1.8.2, though; it might have been fixed in the meantime.

  No, flushing (on an object or file basis) is all that's necessary; the file doesn't need to be closed. I believe we've fixed the problem with the variable-length sequences in the latest release.
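
  In other words, something like the sketch below on the writer's side should be enough for a reader that (re)opens the file afterwards; truly concurrent access is what the separate SWMR work is about:

#include "hdf5.h"

/* Sketch: the writer makes its updates visible by flushing, without
 * closing the file. Readers still need to (re)open the file after this
 * point; concurrent SWMR-style access is a separate feature. */
void finish_dataset(hid_t file, hid_t dset)
{
    H5Dclose(dset);                       /* done with this dataset   */
    H5Fflush(file, H5F_SCOPE_GLOBAL);     /* push metadata + raw data */
    /* the file itself stays open for the writer's next task */
}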

  Quincey
