- C++HDF5, concurrency and file corruption -

Hi HDF group -

We have been trying to resolve a problem with HDF5 in a highly concurrent
application. We use a custom C++ wrapper on top of HDF5-C++ and a custom
locking mechanism (using LOCK files). The tests below were run on our test
systems. Actual production runs may not be so highly concurrent.

Test data
----------------------
1. 500-1000 jobs running (each job runs 1 instance of the app), all of
them accessing the same HDF5 files.

About the Tests
------------------------
These tests are routed through a job management system and run on
3 compute nodes, each with 8 cores.

Results of Tests
------------------------
Behaviour 1: Nesting - subgroups and their datasets that belong to one
parent group node also appear under another parent group node.
    Opening the HDF file in HDFviewer, we see that the object IDs of the
    nested groups/datasets are the same as those of the original
    groups/datasets.
    NOTE: It appears that HDF5 is somehow creating a link in these cases
    (our custom C++ wrapper does not use the link functionality of HDF5).
    Screenshot attached.

Behaviour 2: Corruption - In certain runs these HDF5 files end up
corrupted. We assume that a concurrent write/create may be the cause.
    Corrupted files cannot be opened. Any insights into this?

Questions:
------------------------
1. In what scenario do the HDF interfaces automatically detect and create
a soft link?
2. Have you come across any such weird occurrences before?

Thanks
Anish

Hi Anish,

Questions:
------------------------
1. In what scenario do the HDF interfaces automatically detect and create a soft link?

  None.

2. Have you come across any such weird occurrences before?

  It does sound like you have a situation where the same file is being manipulated by multiple writers at the same time. Note that it's not [currently] enough to just flush the data from an open file when multiple writers are involved - the file must actually be closed by one writer before being opened by another writer.

  Quincey
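
For illustration, here is a minimal sketch of the "close before the next writer opens" rule described above. It is not the poster's wrapper and not part of HDF5: `LockFile` and `with_exclusive_file` are hypothetical names, and the sketch assumes a POSIX system where creating a file with `O_CREAT | O_EXCL` is atomic on the filesystem in use (historically that has not held on some older NFS setups, so it is worth checking on a multi-node cluster).

```cpp
#include <cerrno>
#include <chrono>
#include <functional>
#include <stdexcept>
#include <string>
#include <thread>
#include <utility>

#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not an HDF5 API): holds an exclusive LOCK file for
// the lifetime of the object and removes it on destruction.
class LockFile {
public:
    explicit LockFile(std::string path) : path_(std::move(path)) {
        for (;;) {
            // With O_CREAT | O_EXCL, open() fails with EEXIST if the lock
            // file already exists, so only one process can create it.
            int fd = ::open(path_.c_str(), O_CREAT | O_EXCL | O_WRONLY, 0644);
            if (fd >= 0) {
                ::close(fd);
                return;
            }
            if (errno != EEXIST) {
                throw std::runtime_error("cannot create lock file " + path_);
            }
            // Another process holds the lock: wait and retry.
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
        }
    }

    ~LockFile() { ::unlink(path_.c_str()); }

    LockFile(const LockFile&) = delete;
    LockFile& operator=(const LockFile&) = delete;

private:
    std::string path_;
};

// Runs `work` while holding the lock. The contract assumed here is that
// `work` opens the HDF5 file, writes, and CLOSES it before returning, so
// the lock is released only after the file is closed (a flush alone is
// not enough).
void with_exclusive_file(const std::string& lock_path,
                         const std::function<void()>& work) {
    LockFile lock(lock_path);  // acquired before the file is opened
    work();                    // open -> write -> close happens in here
}                              // lock released after work() has returned
```

Inside `work`, a writer would typically construct an `H5::H5File` with `H5F_ACC_RDWR`, do its writes, and call `close()` on it before returning. It may also be worth checking that no group or dataset handles outlive that close, since with the default file close degree HDF5 can keep the file open until the objects in it are closed. A crashed job would leave a stale lock file behind, which this sketch does not handle.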

On Nov 26, 2008, at 5:08 PM, Anish Anto wrote:

Thanks
Anish

<r22p3_corruptSPCH.png>

----------------------------------------------------------------------
This mailing list is for HDF software users discussion.
To subscribe to this list, send a message to hdf-forum-subscribe@hdfgroup.org.
To unsubscribe, send a message to hdf-forum-unsubscribe@hdfgroup.org.
