Concurrent access

Are there plans to add support for concurrent access to HDF5 files (using flock or so) instead of having to open/close the files?

Another thing I'm wondering about is the behaviour when a file is opened again in the same process. Will HDF5 notice, or will there be two separate file objects, each with their own buffers, resulting in overwriting each other's updates?
A situation where that might occur is when opening a file, opening a dataset, closing the file hid, and reopening the file again (e.g. for the concurrent access). I understood from the documentation that you can do this, and I assume that closing the file hid does NOT result in closing the file, because the dataset still has a reference to the file hid. So you may end up having the same file opened twice.
Can this indeed happen or will HDF5 notice?

Cheers,
Ger

···

----------------------------------------------------------------------
This mailing list is for HDF software users discussion.
To subscribe to this list, send a message to hdf-forum-subscribe@hdfgroup.org.
To unsubscribe, send a message to hdf-forum-unsubscribe@hdfgroup.org.

Hi Ger,

Are there plans to add support for concurrent access to HDF5 files (using flock or so) instead of having to open/close the files?

  In a limited way - we are currently working on limited support for single-writer/multiple-reader access to the raw data of chunked datasets with one unlimited dimension. This will allow a dataset to be extended and written to from a single process while other processes have a consistent view for reading. If that turns out well, we may be able to proceed with getting more parts/operations of the file to support single-writer/multiple-reader access.

  We don't currently have funding for implementing multiple-writer access (with locking), but we've scoped it out a number of times (and in a number of ways :-) and would certainly be interested in working to get this sort of access funded.

Another thing I'm wondering about is the behaviour when a file is opened again in the same process. Will HDF5 notice, or will there be two separate file objects, each with their own buffers, resulting in overwriting each other's updates?

  This is handled internally by the HDF5 library and should work fine. There are separate "top-level" file objects, but a common "bottom-level" file object, so the buffers are shared by all accesses to a particular file.

A situation where that might occur is when opening a file, opening a dataset, closing the file hid, and reopening the file again (e.g. for the concurrent access). I understood from the documentation that you can do this, and I assume that closing the file hid does NOT result in closing the file, because the dataset still has a reference to the file hid. So you may end up having the same file opened twice.
Can this indeed happen or will HDF5 notice?

  The HDF5 library will notice and not open the underlying file twice.

    Quincey

···

On Mar 18, 2008, at 3:18 AM, Ger van Diepen wrote:


Hi... I want to create a dataset in my HDF5 file which is a compressed
string. (This is an XML file that I want to use rarely, but it's
important to store it with the data.) I can't seem to figure out how
to get the compression working. Does anyone have some C code they
could show me to do something like this? I must be missing something stupid.

Right now I am using this:

    /* datatype */
    htype = H5Tcopy(H5T_C_S1);
    H5Tset_size(htype, my_string_length);

    /* dataspace */
    hspace = H5Screate(H5S_SCALAR);

    /* creation properties */
    create_params = H5Pcreate(H5P_DATASET_CREATE);
    hsize_t chunksize = 1;
    H5Pset_chunk(create_params, 1, &chunksize);

    hstring = H5Dcreate2(hgrp, my_dataset_name, htype, hspace,
                         H5P_DEFAULT, create_params, H5P_DEFAULT);

This call to H5Dcreate2 fails (I'm not sure yet how to get a more
specific error message out of it), but if I replace "create_params"
with H5P_DEFAULT it works fine (with no compression, of course).

any thoughts to help?

···


Hi Quincey,

Thanks for the answer.

A situation where that might occur is when opening a file, opening a
dataset, closing the file hid, and reopening the file again (e.g.
for the concurrent access). I understood from the documentation that
you can do this, and I assume that closing the file hid does NOT
result in closing the file, because the dataset still has a
reference to the file hid. So you may end up having the same
file opened twice.
Can this indeed happen or will HDF5 notice?

  The HDF5 library will notice and not open the underlying file twice.

So that means that, for concurrent access, all datasets etc. in a file have to be closed as well; otherwise the file is not really closed and reopened.

Cheers,
Ger

···


Yes, that's true by default. If you want closing the last file ID to also close all the objects still open in the file (groups, datasets & named datatypes), you can use the H5Pset_fclose_degree() API routine.

  Quincey

···

On Mar 20, 2008, at 2:48 AM, Ger van Diepen wrote:

