HDF5 resilience

Konrad · July 18, 2011, 12:39pm

I wonder about the existing techniques aimed at avoiding data file
corruption, be it from concurrent writes or process/machine death. I am
currently employing serialized copy-on-write semantics, but this will only
work for infrequent writes on small files.

How do people get around these problems? Perhaps there is a mechanism that can
sit on top of HDF5 to take care of this for me?

Many thanks
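
One way to picture the copy-on-write approach described above: every update is made on a private copy of the file, and the copy replaces the original in a single rename only once the update has finished. The sketch below is an illustration under assumptions, not Konrad's actual code. `update_copy_on_write` and `mutate` are hypothetical names, the file is treated as opaque bytes (a real `mutate` would open the copy with an HDF5 binding), and callers are assumed to serialize their calls themselves.

```python
import os
import shutil
import tempfile


def update_copy_on_write(path, mutate):
    """Apply mutate(tmp_path) to a private copy of path, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    # Keep the scratch copy in the same directory so the final rename
    # stays on one filesystem; os.replace is only atomic in that case.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    os.close(fd)
    try:
        if os.path.exists(path):
            shutil.copy2(path, tmp_path)
        # Hypothetical callback: e.g. open tmp_path as HDF5 and write to it.
        mutate(tmp_path)
        # Push the copy to disk before it becomes visible under the real name.
        with open(tmp_path, "r+b") as f:
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Any failure leaves the original file untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Because the whole file is copied for each update, the cost grows with file size, which matches the concern that this only suits infrequent writes on small files.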

--
View this message in context: http://hdf-forum.184993.n3.nabble.com/hdf5-resilience-tp3179027p3179027.html
Sent from the hdf-forum mailing list archive at Nabble.com.

Quincey_Koziol · July 18, 2011, 1:30pm

Hi Konrad,

> I wonder about the existing techniques aimed at avoiding data file
> corruption, be it from concurrent writes or process/machine death. I am
> currently employing serialized copy-on-write semantics, but this will only
> work for infrequent writes on small files.

We are working on adding a couple of mechanisms to the HDF5 library to help with the process/machine death issue.

> How do people get around these problems? Perhaps there is a mechanism that can
> sit on top of HDF5 to take care of this for me?

I'm not certain, but the iRODS project might do what you want.

Quincey

Konrad · July 18, 2011, 2:59pm

Thanks Quincey,

I presume you're talking about the concurrent writer/multiple concurrent
readers feature. May I ask when it's due? Will the API change, will this
be enabled by default, or configurable?
Cheers

dsdale24 · July 18, 2011, 2:34pm

Is there any information concerning data corruption in hdf5 from
concurrent writes? I just visited with a group that uses a distributed
data acquisition system to create and update hdf5 files, in which case
different processes could attempt to update different portions of an
hdf5 file concurrently. Is there any discussion in the hdf5
documentation that would address this issue?

Thanks,
Darren

Quincey_Koziol · August 16, 2011, 7:46pm

Hi Konrad,

> Thanks Quincey,
>
> I presume you're talking about the concurrent writer/multiple concurrent
> readers feature.

Yes.

> May I ask when it's due? Will the API change, will this be enabled by default, or configurable?

It'll be available as part of the 1.10.0 release, for which we're hoping to have a beta around November. The API may change slightly, yes, but I'm still not certain exactly how; we're looking for the smallest possible change to the API. It'll definitely be configurable (although we may decide to enable it by default if the positive impacts outweigh any negatives).

Quincey

Quincey_Koziol · August 16, 2011, 7:44pm

Hi Darren,

On Jul 18, 2011, at 9:34 AM, Darren Dale wrote:

> Is there any information concerning data corruption in hdf5 from
> concurrent writes? I just visited with a group that uses a distributed
> data acquisition system to create and update hdf5 files, in which case
> different processes could attempt to update different portions of an
> hdf5 file concurrently. Is there any discussion in the hdf5
> documentation that would address this issue?

Sorry for the tardy reply - concurrent writes to HDF5 files are not supported (except under the MPI programming model). Here's our FAQ entry on how to get something like this working though: http://www.hdfgroup.org/hdf5-quest.html#gconc

Quincey
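
Since concurrent writes are not supported outside the MPI model, one common workaround is to make sure that only one process has the file open for writing at any time. The sketch below does this with an advisory lock on a sidecar file. It is an illustration under assumptions and not necessarily what the FAQ entry recommends: `exclusive_writer` and the `.lock` suffix are hypothetical, `fcntl.flock` is POSIX-only, and an advisory lock only protects against processes that also take the lock.

```python
import contextlib
import fcntl
import os


@contextlib.contextmanager
def exclusive_writer(path):
    """Hold an exclusive advisory lock on a sidecar file while the body runs."""
    lock_path = path + ".lock"  # hypothetical naming convention
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # Blocks until no other process holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield path
    finally:
        # Closing the descriptor releases the lock.
        os.close(fd)
```

A writer would open the HDF5 file, write, and close it entirely inside the `with` block, so that the next writer sees a fully flushed file. The kernel drops the lock if the holder dies, although a crash in the middle of a write can still leave the file itself damaged.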