We have been trying to resolve a problem that occurs when using HDF5 in a
highly concurrent application. We have a custom C++ wrapper on top of
HDF5-C++ and a custom locking mechanism (using LOCK files). The runs
described here are on our test systems; actual production runs may not be
so highly concurrent.
Test data
----------------------
1. 500-1000 jobs running (1 job running 1 instance of the app), all of
them using/accessing the same HDF5 files.
About the Tests
------------------------
These tests are routed through a job management system; they run on
3 compute nodes, each with 8 cores.
Results of Tests
------------------------
Behaviour 1: Nesting - SubGroups and their DataSets that belong to one
parent Group Node appear under another parent Group Node.
Opening the HDF file using HDFviewer, we see that the object IDs of the
nested groups/datasets are the same as those of the original
groups/datasets.
NOTE: It appears that HDF5 is somehow creating a link in these cases (our
custom C++ wrapper does not make use of the link functionality of HDF5).
Screenshot attached.
Behaviour 2: Corruption - In certain runs the HDF5 files end up
corrupted; we assume that a concurrent write/create might be the problem.
The corrupted files cannot be opened. Any insights into this?
Questions:
------------------------
1. In what scenario do the HDF interfaces automatically detect and create
a soft link?
2. Have you come across any such weird occurrences before?
1. In what scenario do the HDF interfaces automatically detect and create a soft link?
None.
2. Have you come across any such weird occurrences before?
It does sound like you have a situation where the same file is being manipulated by multiple writers at the same time. Note that it's not [currently] enough to just flush the data from an open file when multiple writers are involved - the file must actually be closed by one writer before being opened by another writer.
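A minimal sketch of that close-before-handoff discipline, assuming a lock file created with O_CREAT | O_EXCL. The names LockFile and write_exclusively are hypothetical and are not the wrapper discussed in this thread, and a plain std::ofstream stands in for the HDF5 file so the sketch compiles without the library. In a real application the inner scope would open and close the HDF5 file (for example an H5::H5File) instead. The point is only the ordering: the file is fully closed before the lock is released.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <fcntl.h>   // open, O_CREAT, O_EXCL, O_WRONLY
#include <unistd.h>  // close, unlink

// Hypothetical RAII guard around a LOCK file (illustration only).
class LockFile {
public:
    explicit LockFile(const std::string& path) : path_(path) {
        assert(!path_.empty());
        // O_CREAT | O_EXCL fails if the lock file already exists,
        // i.e. if another writer currently holds the lock.
        int fd = ::open(path_.c_str(), O_CREAT | O_EXCL | O_WRONLY, 0644);
        if (fd >= 0) {
            ::close(fd);
            held_ = true;
        }
    }
    ~LockFile() {
        if (held_) ::unlink(path_.c_str());  // release the lock
    }
    LockFile(const LockFile&) = delete;
    LockFile& operator=(const LockFile&) = delete;
    bool held() const { return held_; }

private:
    std::string path_;
    bool held_ = false;
};

// Returns true if the write happened, false if the lock was busy or the
// data file could not be opened. The caller is expected to retry later.
bool write_exclusively(const std::string& data_path, const std::string& line) {
    LockFile lock(data_path + ".LOCK");
    if (!lock.held()) return false;
    {
        // Stand-in for opening the HDF5 file for writing.
        std::ofstream out(data_path, std::ios::app);
        if (!out) return false;
        out << line << '\n';
        out.close();  // close the file completely; a flush alone is not enough
    }
    return true;  // ~LockFile runs here, after the file has been closed
}
```

Because the lock object is declared before the file handle, the file is always closed first, even on an early return. One caveat under the same assumptions: O_EXCL is not reliably atomic on some older NFS setups, so a lock-file scheme spanning several compute nodes depends on the shared file system honouring it.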
Quincey
On Nov 26, 2008, at 5:08 PM, Anish Anto wrote:
Thanks
Anish
<r22p3_corruptSPCH.png>
----------------------------------------------------------------------
This mailing list is for HDF software users discussion.
To subscribe to this list, send a message to hdf-forum-subscribe@hdfgroup.org.
To unsubscribe, send a message to hdf-forum-unsubscribe@hdfgroup.org.