Hello Gerd,
Thank you very much for the code sample.
A code sample in C is the perfect way to express your ideas.
I integrated your idea into my code sample (sent to HDF5 Tech Support on 6/16/2021; would you contact Binh-Minh for details?) and ran the tests.
I see a 100-fold degradation in write performance for 5K network packets collected in the HDF5 database.
Let me answer your questions:
"OK, I don’t fully understand what you are saying, but my guess is that you are acquiring some kind of packet stream. The choices you’ve made are:
1. The stream is represented as an HDF5 group.
2. Stream properties/packet invariants are represented as HDF5 attributes.
3. Individual packets are represented as HDF5 ???
What about 3.? How do you represent packets?"
=LV= I wish you had access to my code sample to address your questions.
- A network stream is stored under an HDF5 group. Each network packet has a length of 1024 KB.
- Each individual packet is stored as a name-value pair in an HDF5 attribute.
- Each new HDF5 attribute has the following parameters:
“You haven’t told us how you are reading/accessing your stream. How do you maintain/represent packet order?”
=LV= I don’t see any HDF5 performance degradation for search and read operations on HDF5 attributes.
These are the HDF5 calls for the read operation, where sAttributeName is the network packet number (see above):
h5_ADattr = H5Aopen_name(H5Group, sAttributeName);
nReturn = H5Aread(h5_ADattr, h5_ADtype, &stringRead);
nReturn = H5Aclose(h5_ADattr);
“I believe you told us that the packet type is a fixed-size string, right?”
=LV= Correct
“I don’t understand your comment on large numbers of records in HDF5 group tables. Groups contain links, not records. If you mean ‘attributes’ by records, then what you are seeing in HDFView is just a replay of the performance problem you are reporting.”
=LV= I created a code sample in which H5Group contains an H5PT table and stored 1M entries in it, each a ~1024 KB string. Opening the HDF5 database with the HDFView v2.14.0 utility and selecting the H5PT table from H5Group produces a memory error dialog.
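Taking the figures above at face value (1M read as 1,000,000 entries, 1024 KB read as 1024 × 1024 bytes), the raw size of such a table can be estimated with a few lines of C. The function names are illustrative only, and the calculation ignores HDF5 metadata and any compression.

```c
#include <stdint.h>

/* Raw payload size of a table: number of entries times bytes per entry. */
uint64_t total_bytes(uint64_t entries, uint64_t bytes_per_entry)
{
    return entries * bytes_per_entry;
}

/* Convert a byte count to GiB (2^30 bytes) for easier reading. */
double bytes_to_gib(uint64_t bytes)
{
    return (double)bytes / (1024.0 * 1024.0 * 1024.0);
}
```

With these inputs, total_bytes(1000000, 1024 * 1024) is 1,048,576,000,000 bytes, or about 977 GiB. A viewer that tried to materialize a table of that size in memory would be expected to fail on an ordinary workstation.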
Thank you very much for your help in addressing the HDF5 performance degradation problem during H5Awrite operations,
Leon

gheber
Gerd Heber
The HDF Group Staff
June 23
leon_vernikov:
- The HDF5 attribute base-element is the simplest and most basic feature of an HDF5 database to implement.
- The HDF5 attribute base-element/API can store both network packets and information related to network traffic (the type of network traffic, traffic properties common to all network packets).
- The HDF5 attribute base-element/API doesn’t require a separate table to be inserted under the HDF5 group.
- Most important: HDFView v2.14.0 runs into a memory problem when opening an HDF5 group’s table with a large number of records. Basically, HDFView dies on opening.
OK, I don’t fully understand what you are saying, but my guess is that you are acquiring some kind of packet stream. The choices you’ve made are:
1. The stream is represented as an HDF5 group.
2. Stream properties/packet invariants are represented as HDF5 attributes.
3. Individual packets are represented as HDF5 ???
What about 3.? How do you represent packets?
I don’t understand your comment on large numbers of records in HDF5 group tables. Groups contain links, not records. If you mean ‘attributes’ by records, then what you are seeing in HDFView is just a replay of the performance problem you are reporting. If you mean ‘links’ by records, then, again, there are low-level tricks to massage the underlying B-tree and heap, but it won’t do miracles for huge (100K, 1M,…) numbers of links.
You haven’t told us how you are reading/accessing your stream. How do you maintain/represent packet order?
I believe you told us that the packet type is a fixed-size string, right?
leon_vernikov:
=LV= Are you referring to the H5PT base-element?
Not as an implementation, because that one’s done poorly, but as a general idea. For a proper implementation, take a look at @steven’s packet table implementation in H5CPP (example).
G.