I am new to HDF5 and am building an HDF5 parser in C# for one of our projects. I am using the “HDF5 File Format Specification Version 3.0” document as the blueprint for the parser.
I have run into a problem that hopefully someone can shed some light on. When parsing V2 B-tree internal nodes, the size of the “Number of Records in Child Node” field is described by something like an equation involving the fixed-size overhead of the child node and “one pointer triplet”. The problem is that the unknown field size is itself one of the pieces of that triplet.
Excuse me if I am misreading the explanation, but it seems like a Catch-22: to find the size of the field, you need to know the size of the field in order to get the size of the triplet overhead.
I agree that the description is confusing, and also not entirely accurate. From looking at the library code, the size of the “Number of Records in Child Node” field is constant across all nodes in the tree, and is set according to the maximum number of records that could fit in a leaf node, which contains no “triplets”, only records. I will file a documentation bug for this.
You can see the usage of this value at lines 695 and 817 in H5B2cache.c (all line numbers from develop), and you can see how it’s calculated at lines 144 (macro at 45), 145, 167 (inline function at H5VMprivate.h:500), and 168 in H5B2hdr.c.
The calculation for “Total Number of Records in Child Node” is more complex, but the description in the file format spec is mostly correct, though it is arguably vague about exactly how the iteration proceeds. To see this in the code you can look at the loop starting at line 173 in H5B2hdr.c and the macros at lines 112 and 48 in H5B2private.h.
Thank you very much for checking into this and for the clarification. Definitely appreciate the call out to the file name / line numbers in the code to see how it should be parsed. Thanks again.