HDF5 file format: Is attribute info message required?

I am trying to implement a simple HDF5 file writer. Right now I am able to create groups and attributes: h5dump dumps the file contents correctly and the C library can also read them. My problem is with HDFView, which says that there are no attributes stored on the group. I suspect that HDFView needs not only the Attribute Message but also the Attribute Info Message to be present for compactly stored attributes, although the spec says “This message stores information about the attributes on an object, such as the maximum creation index for the attributes created and the location of the attribute storage when the attributes are stored “densely”.” My understanding is that compactly stored attributes do not need an additional Attribute Info Message.

However, when I dig into the source code for HDF5 1.10.7, the function H5O_attr_count_real is being called to get the number of attributes on the object (this function does not seem to exist on HDF5 1.14), and eventually this function calls H5A__get_ainfo:

htri_t
H5A__get_ainfo(H5F_t *f, H5O_t *oh, H5O_ainfo_t *ainfo)
{
...

    /* Check if the "attribute info" message exists */
    if((ret_value = H5O_msg_exists_oh(oh, H5O_AINFO_ID)) < 0)
	    HGOTO_ERROR(H5E_ATTR, H5E_NOTFOUND, FAIL, "unable to check object header")
    if(ret_value > 0) {
        /* Retrieve the "attribute info" structure */
        if(NULL == H5O_msg_read_oh(f, oh, H5O_AINFO_ID, ainfo))
	    HGOTO_ERROR(H5E_ATTR, H5E_CANTGET, FAIL, "can't read AINFO message")

        /* Check if we don't know how many attributes there are */
        if(ainfo->nattrs == HSIZET_MAX) {
            /* Check if we are using "dense" attribute storage */
            if(H5F_addr_defined(ainfo->fheap_addr)) {
                ...
            } /* end if */
            else
                /* Retrieve # of attributes from object header */
                ainfo->nattrs = oh->attr_msgs_seen;
        } /* end if */
    } /* end if */
...
} /* end H5A__get_ainfo() */

This function first checks whether the attribute info message (H5O_AINFO_ID) exists. If it does and its attribute count is still unknown (HSIZET_MAX), the function checks whether dense attribute storage is used. If not, the count is taken from the object header (ainfo->nattrs = oh->attr_msgs_seen).

Since h5dump and using the C lib to access a specific attribute are working fine, I am not sure what is now the correct implementation: Do we need an attribute info message when the attributes are stored compactly?

The presence of an Attribute Info Message with all addresses (Fractal Heap Address, BTree2 Name Index Address and BTree2 Creation Order Index Address) set to the undefined address helps HDFView to find the attributes.
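
For reference, a minimal sketch of how a writer might encode such a message body. It assumes the field layout from the version 3 format spec as I read it (version byte, flags byte, then the addresses) and that the undefined address is encoded as all bits set. The function name `encode_compact_ainfo` and its parameters are hypothetical. Note that with flags set to 0 the Maximum Creation Index and the creation order index address appear to be optional and are omitted here, so only two addresses are written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of any library): writes the body of an
 * Attribute Info Message for compactly stored attributes into buf and
 * returns the number of bytes written.
 * offset_size is the superblock's "size of offsets" (commonly 8).
 */
size_t encode_compact_ainfo(uint8_t *buf, size_t buf_len, size_t offset_size)
{
    size_t needed = 2 + 2 * offset_size;
    size_t pos = 0;

    assert(buf != NULL);
    assert(offset_size > 0 && buf_len >= needed);

    buf[pos++] = 0; /* version 0 */
    buf[pos++] = 0; /* flags: creation order neither tracked nor indexed */

    /* Assumption: with flags == 0 the Maximum Creation Index is omitted. */

    memset(buf + pos, 0xFF, offset_size); /* fractal heap address: undefined */
    pos += offset_size;
    memset(buf + pos, 0xFF, offset_size); /* name index B-tree address: undefined */
    pos += offset_size;

    /* Assumption: with flags == 0 the creation order index address is omitted. */
    return pos;
}
```

With 8-byte offsets this yields an 18-byte message body; the object header message prefix (type, size, flags) still has to be written around it.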

It is still unclear to me what the correct approach is, since the spec does not say that these addresses are allowed to be set to the undefined address. For example, the description of the Attribute Info Message’s Fractal Heap Address is:

This is the address of the fractal heap to store dense attributes. Each attribute stored in the fractal heap is described by the Attribute Message.

Compared to that of the Link Info Message:

This is the address of the fractal heap to store dense links. Each link stored in the fractal heap is stored as a Link Message. If there are no links in the group, or the group’s links are stored “compactly” (as object header messages), this value will be the undefined address.
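
On the reader side, the library snippet above decides between dense and compact storage by testing whether the fractal heap address is defined. A rough sketch of the same test on raw message bytes follows. It assumes the same layout as the spec describes (version, flags, optional 2-byte maximum creation index when flag bit 0 is set, then the fractal heap address) and that the undefined address is all bits set. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if the offset_size bytes at p are all 0xFF (undefined address). */
bool addr_is_undefined(const uint8_t *p, size_t offset_size)
{
    assert(p != NULL);
    for (size_t i = 0; i < offset_size; i++)
        if (p[i] != 0xFF)
            return false;
    return true;
}

/*
 * Hypothetical reader-side check on an Attribute Info Message body.
 * Returns 1 for dense storage, 0 for compact storage, -1 if the
 * message is too short or has an unexpected version.
 */
int ainfo_uses_dense_storage(const uint8_t *msg, size_t msg_len, size_t offset_size)
{
    size_t pos = 2; /* skip version and flags */

    assert(msg != NULL);
    if (msg_len < 2 || msg[0] != 0)
        return -1;
    if (msg[1] & 0x01u) /* creation order tracked: skip max creation index */
        pos += 2;
    if (msg_len < pos + offset_size)
        return -1;

    /* Undefined fractal heap address is treated as "stored compactly". */
    return addr_is_undefined(msg + pos, offset_size) ? 0 : 1;
}
```

Whether the spec intends the undefined address to be legal here is exactly the open question; the sketch only mirrors what the library code appears to do.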

I hope you can clarify this a little bit. Thanks in advance :slight_smile:

Fantastic work, @apollo3zehn-h5. I’ll let the experts chime in, but here are a few initial thoughts.

That’s mighty odd, because HDFView uses the C-library to “figure things out.” Which version of HDFView are you using? Did you try H5Web? If you click the Inspect button, it’ll show attributes.

Can you share the files with us?

What are the superblock, object header, and attribute versions that you’re creating?

I believe an Attribute Info Message is relevant only for attribute version 3, which was introduced with HDF5 1.8.x to overcome the ~64K size limitation.

Yes, attribute versions 1 and 2 don’t have an Attribute Info Message.

H5O__attr_count_real is still there. Two underscores now.

G.

Hi Gerd,

please find my answers below:

HDFView version: 3.3.0

It works fine with H5Web; the attributes are listed. I think HDFView would also be able to list the attributes, but it only does so when it detects that any are present. That is not the case here, because internally it ends up in the C library function H5A__get_ainfo, which ignores all attributes when it does not find an attribute info message.

  • Superblock: Version 3
  • Object Header: Version 2
  • Attribute Message: Version 3

I have attached both files, one with the attribute info message and one without it.

without-attr-info-message.h5 (678 Bytes)
with-attr-info-message.h5 (722 Bytes)

One thing that should be verified is whether there is a mismatch between the public API calls and the internal functions, because HDFView can only use the public APIs; it has no access to internal functions.

A quick investigation into HDFView indicates there is a problem in the java call stack.
h5dump should be the standard - if it has an issue then it is more likely a library issue.
HDFView does some things differently for visual display reasons. We will work on identifying the actual problem.

Please let me know if I can do anything to support the investigation. I did not check how h5dump accesses the attributes but for sure there must be some difference to how HDFView accesses them.

h5dump has always worked fine for me; only HDFView behaves unexpectedly in that regard.

HDFView may be using a different process because of the need to store info about objects for later use. h5dump just dumps and forgets (except for checking for name cycles).
It does look like HDFView sometimes uses the H5Gget_info call and gets the number of attributes from the H5G_info_t, and sometimes uses H5Oget_info and the H5O_info_t.

Logging while debugging, I see that the H5A__get_ainfo call is executed (and will fail if there is no attribute info message) whenever oh->version > H5O_VERSION_1. The comment on one instance of this check reads:
/* Attributes are only stored in fractal heap & indexed w/v2 B-tree in later versions */

Trying to chase down the possible stack;
H5O_attr_iterate_real and H5O__attr_exists calls:
oh = H5O_protect(loc, H5AC__READ_ONLY_FLAG, FALSE)
H5O__attr_count_real(H5F_t *f, H5O_t *oh, hsize_t *nattrs) passes it in.

H5O__attr_exists is called within H5VL__native_attr_specific function

H5O_attr_iterate_real is called from H5O__attr_iterate, which is called within H5A__iterate_common, which is called by H5A__iterate, and the public API is H5Aiterate.

One thing I did find is that the tools, i.e. h5dump, use a function in the tools library that calls
H5Aiterate_by_name to get the objects in a file: the trav_print_visit_obj function in tools/lib/h5trav.c. I can’t find it now, but I think that may take a path different from the one that checks for the attribute info message.

I am not sure yet which function HDFView actually calls to get the attributes, but I suspect just H5Aiterate.

I believe HDFView uses H5Literate2 on objects.

This is an unexpected statement for me; maybe it only means that the C library always uses the fractal heap, not that this is a must. In the HDF5 spec 3.0 I do not see anything that prohibits compact attribute storage for v2 object headers.

The function containing that comment is itself commented with “for 1.8 attributes”, so maybe this method/comment is not relevant for our problem.

I think it first checks whether there are any attributes before it iterates over them, and to find out whether any attributes are present it calls H5Oget_info() (https://github.com/HDFGroup/hdfview/blob/e9429b3ef1a5fd7f1f19804e17f1e02eaefcc34c/src/org.hdfgroup.object/hdf/object/h5/H5Group.java#L201), which returns 0 attributes. But that is just a guess.

Definitely, HDFView starts by calling H5Oget_info, which I think ends up in the H5O_get_info function, which calls oh = H5O_protect(loc, H5AC__READ_ONLY_FLAG, FALSE) and then:

/* Retrieve # of attributes */
if (fields & H5O_INFO_NUM_ATTRS)
    if (H5O__attr_count_real(loc->file, oh, &oinfo->num_attrs) < 0)
        HGOTO_ERROR(H5E_OHDR, H5E_CANTGET, FAIL, "can't retrieve attribute count")