Retrieving dataset handle from filter callbacks

Greetings!

I’m working on an HDF5 filter that needs access to arbitrary metadata from the underlying file. The filter API won’t let me do that, however; the callbacks only provide access to the dataset’s creation property list, datatype, and dataspace:

htri_t (*H5Z_can_apply_func_t)(hid_t dcpl_id, hid_t type_id, hid_t space_id);
herr_t (*H5Z_set_local_func_t)(hid_t dcpl_id, hid_t type_id, hid_t space_id);

I have also been working on another filter that, for performance reasons, needs to allocate memory once and reuse it across successive invocations of the filter function. While I can allocate that memory in set_local(), the filter never knows when it is safe to deallocate it, as the current filter design does not include a teardown callback.
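
To make this second use case concrete, the pattern I have in mind looks roughly like this (the names and sizes are purely illustrative, not taken from my actual patch):

#include <stdlib.h>
#include <hdf5.h>

/* Illustrative sketch: a scratch buffer allocated once and reused for every
 * chunk the filter processes. Nothing below is from my actual patch. */
static void *scratch = NULL;

static herr_t reuse_set_local(hid_t dcpl_id, hid_t type_id, hid_t space_id)
{
    if (scratch == NULL)
        scratch = malloc(1 << 20);   /* arbitrary size for the example */
    return scratch ? 0 : -1;
}

static size_t reuse_filter(unsigned flags, size_t cd_nelmts, const unsigned cd_values[],
                           size_t nbytes, size_t *buf_size, void **buf)
{
    /* ... use scratch as working memory for compression/decompression ... */
    return nbytes;   /* there is nowhere to free(scratch): no teardown hook exists */
}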

I have a working patchset that introduces two new optional callbacks: init() and teardown(). The former receives the file ID embedded in the pipeline object (H5O_shared_t), along with the three other well-known hid_t objects. The latter has the same signature as set_local()/can_apply() and is invoked on H5D_close().
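
In rough form (I’m glossing over the exact typedef names used in the patch), the two new callbacks look like this:

/* sketch only – the typedef names are illustrative */
herr_t (*H5Z_init_func_t)(hid_t file_id, hid_t dcpl_id, hid_t type_id, hid_t space_id);
herr_t (*H5Z_teardown_func_t)(hid_t dcpl_id, hid_t type_id, hid_t space_id);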

My first question: is this the right place to discuss API changes? Second: would you be willing to accept such modifications? Last, my understanding is that we’d need a new H5Z_class3_t structure to avoid breaking existing applications. If that’s the case, then we could probably introduce a new version of set_local() as well, one that simply takes an extra file handle argument, rather than adding one more callback.

Thank you in advance for your attention and guidance.
Lucas

Hi Lucas,
This is a good place to talk about it. :) And, yes, I believe that you will need additional callbacks, in a v3 of the I/O filter class. What do you need the file ID for? This, and your other suggestion about init/teardown callbacks, is similar to requests I’ve heard from other people, and I can work with you to refine an update to the class struct, if you’d like.

Quincey

Hi Quincey – thanks for the quick response!

The file ID is useful so the filter can retrieve data from other datasets in the same file and combine them in different ways through user-provided scripts. Note that this is different from HDF5 virtual datasets, which essentially let one build a mosaic from slices of a collection of datasets.
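
As a purely hypothetical example of what a filter could do once it has the file ID (the dataset path and buffer below are made up):

#include <hdf5.h>

/* Hypothetical use of the file ID inside the proposed init() callback:
 * open a sibling dataset in the same file and read auxiliary data from it. */
static herr_t example_init(hid_t file_id, hid_t dcpl_id, hid_t type_id, hid_t space_id)
{
    hid_t  aux;
    herr_t status;
    double table[256];   /* placeholder for whatever the user script needs */

    if ((aux = H5Dopen2(file_id, "/metadata/lookup_table", H5P_DEFAULT)) < 0)
        return -1;
    status = H5Dread(aux, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, table);
    H5Dclose(aux);
    return status;
}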

I already have a working patchset, but it modifies the existing v2 structure. I’ll update it to use a v3 structure and then point you to the GitHub patch so you can take a look. Thanks for volunteering! :)

OK, great, looking forward to it. If you would like to contact me directly, my email is: koziol -at- lbl.gov.

Quincey

Hello there!

Here’s a first draft of the patchset:

Please note that this version, which adds H5Z_class3_t, has not been properly tested yet – I’ve simply pushed it to the repository so you can give some early advice on the overall structural changes.

In particular, I’d like to hear your feedback on the following:

  • The new original_version member of H5Z_class3_t: since H5Zregister() needs to deal with deprecated symbols and the new structure contains an enhanced API for set_local, we need a way to tell which version of that function a given filter implements. What I dislike about this is that filter initialization becomes somewhat redundant for the casual reader (see my changes to c++/test/dsets.cpp for an example).

  • The file ID is extracted from the pipeline’s sh_loc.file object (please refer to the changes made to H5Z_prelude_callback()). Is that the right way to do it?

I have considered replacing the new original_version member with an init callback, but I decided against it because init and set_local would be semantically similar – and we would need to keep both in H5Z_class3_t for backwards-compatibility purposes.
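
For reference, the overall shape of the new struct in the current draft is roughly the following; I’m paraphrasing, so the member ordering and comments are approximate rather than verbatim from the patch:

/* Rough, paraphrased shape of the proposed H5Z_class3_t; ordering and the
 * set_local-related details are approximate, and teardown is omitted here */
typedef struct H5Z_class3_t {
    int                  version;           /* class struct version */
    H5Z_filter_t         id;                /* filter id number */
    unsigned             encoder_present;   /* does this filter have an encoder? */
    unsigned             decoder_present;   /* does this filter have a decoder? */
    const char          *name;              /* filter name, for debugging */
    int                  original_version;  /* which set_local flavor the filter implements */
    H5Z_can_apply_func_t can_apply;         /* can-apply callback */
    H5Z_set_local_func_t set_local;         /* set-local callback */
    H5Z_func_t           filter;            /* the actual filter function */
} H5Z_class3_t;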

Thoughts?
Lucas

Greetings, Quincey et al

The questions I raised in my post above are no longer relevant, as I have either worked around the problems or learned more about object management in HDF5.

Here is the most recent version of the code. It has now been reasonably well tested.

Looking forward to hearing your feedback.

Thanks!
Lucas