concurrent reads?

I've googled a bit, but haven't had much luck finding a definitive answer.

From these discussions/FAQ topics...
http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/2010-June/003211.html
http://www.hdfgroup.org/hdf5-quest.html#grdwt
http://www.hdfgroup.org/hdf5-quest.html#tsafe
http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/2010-March/002804.html

It seems like the current status is that I can't do concurrent reads within the same file. Is that the case?

But, concurrent reads from separate files by the same process are OK, right?

My use case:
* GIS data that has been clipped up to geospatial boundaries (say a 1x1 arcminute grid), and written to HDF5 such that each grid element has its own dataset.
* We'll likely split N grid elements into their own files.
* I'll have many (8+) readers but no writers. Each thread would be responsible for checking for and pulling data from a particular dataset. i.e. there wouldn't be multiple readers pulling separate chunks of data from the same dataset.

Thanks,
Steve

We've encountered errors attempting reads from the same file from multiple threads in the same process, even to different datasets, running under Windows 64. We've never tried reading from multiple files using the same process.


On Sep 23, 2010, at 1:44 PM, stnchris@xmission.com wrote:

I've googled a bit, but haven't had much luck finding a definitive answer.

From these discussions/FAQ topics...
http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/2010-June/003211.html
http://www.hdfgroup.org/hdf5-quest.html#grdwt
http://www.hdfgroup.org/hdf5-quest.html#tsafe
http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/2010-March/002804.html

It seems like the current status is that I can't do concurrent reads within the same file. Is that the case?

But, concurrent reads from separate files by the same process are OK, right?

My use case:
* GIS data that has been clipped up to geospatial boundaries (say a 1x1 arcminute grid), and written to HDF5 such that each grid element has its own dataset.
* We'll likely split N grid elements into their own files.
* I'll have many (8+) readers but no writers. Each thread would be responsible for checking for and pulling data from a particular dataset. i.e. there wouldn't be multiple readers pulling separate chunks of data from the same dataset.

Thanks,
Steve

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

Concurrent reads from separate files are okay from the same process with a threadsafe build of the HDF5 library (which effectively serializes the calls, as noted in one of your references).

Concurrent reads from separate files are *not* okay from the same process with a non-threadsafe build of the HDF5 library.

Here is an explanation cut from an earlier email outside the forum. I will request an FAQ entry, since this behavior is somewhat unexpected without the explanation:


There are places where the HDF5 library modifies global data structures that are independent of a particular HDF5 file, and we rely on the semaphore around the library API calls to protect the data structure from being corrupted by simultaneous manipulation from different threads. An example of this would be the HDF5 library's freespace manager; another is the open file list.

On Sep 23, 2010, at 5:35 PM, Sebastian Good wrote:

We've encountered errors attempting reads from the same file from multiple threads in the same process, even to different datasets, running under Windows 64. We've never tried reading from multiple files using the same process.

