Filter plugins and the library: to link or not to link?

Hello!

Since one of the 1.8.* releases, the HDF5 library has provided memory management functions:
https://support.hdfgroup.org/HDF5/doc/RM/RM_H5.html#Library-AllocateMemory
https://support.hdfgroup.org/HDF5/doc/RM/RM_H5.html#Library-FreeMemory

Originally, the latter was designed to let clients release string
buffers returned by some of the get-functions:

H5Tget_member_name provides an example of memory allocation on behalf
of the caller: The function returns a buffer containing the name of a
compound datatype member. It is the caller’s responsibility to
eventually free that buffer with H5free_memory.

Please note that on *nix systems a plain free() call usually does the
right thing; it is on Windows, with its multiple incompatible C
runtimes, that H5allocate_memory/H5free_memory become life-savers. The
HDF5 memory management functions remove the requirement that your
client program be built against the same C runtime as the HDF5 library.

This freedom from the “same runtime build” requirement is actually
important not only for client programs, but equally for 3rd-party
compression filter plugins. Again, it is most critical in proprietary
and heterogeneous Windows environments, where simply building
everything from source is very hard or impossible.

By design, HDF5 3rd-party filter plugins must be able to allocate and
free buffers compatibly with the library, i.e. using
H5allocate_memory/H5free_memory. In reality, however, 3rd-party plugins
hesitate to use the HDF5 memory management routines:

– most understandably, in order to avoid linking to the HDF5 library at
run time. Oddly, though, some filters use plain malloc/free even while
linking heavily to other HDF5 routines!

Why is the run-time linking aspect important? On Windows there is
typically no centralized library location similar to /usr/lib; it is
not uncommon to place dependent libraries alongside the executable under
custom file names like “hdf5-1.10.dll” and to LoadLibrary them by full
path. But how could a 3rd-party filter (isolated from the client
executable!) guess all that information at run time? It is natural that
filter plugin authors avoid run-time linking to the HDF5 library in
order to improve interoperability.

So, the two requirements really do contradict each other:

  1. The filter plugin should link to the HDF5 library at run time in
    order to use H5allocate_memory/H5free_memory properly and ensure
    filter<->library compatibility.
  2. The filter plugin should avoid linking to the HDF5 library at run
    time, because it is generally impossible (without some assistance
    from the client program) to find the library to link to.

One solution could be for the HDF5 library to pass
H5allocate_memory/H5free_memory to the H5Z_func_t callback directly, but
that solves the problem only partly:
H5Pget_filter_by_id2, H5Tget_size, and other functions used by some
advanced filter plugins (for example kiyo-masui/bitshuffle, a filter for
improving compression of typed binary data) would still be unavailable
without run-time linking to HDF5.

What is the way to go? Maybe I’m missing something?
How would you approach building such a filter plugin so that it works
reliably on Windows?

Best wishes,
Andrey Paramonov

Users on all platforms, including Linux, must be careful with memory allocation and the HDF5 library! And plugins need an extra layer of caution.

Plugins must be careful if they need to access the active HDF5 instance - this is why the HDF5 library must be dynamically available for use by both the application and the plugin.

IF the plugin does not access state in the HDF5 library and does not implement set_local or can_apply, the plugin may get away with static linking.

NOTE that the HDF5 instance state is the key.

Memory allocation is very complicated w.r.t. buffers created by HDF5 and those created by applications. Plugin usage depends on the plugin author’s knowledge of intent.

Allen

Hello,

And thank you for your timely reply!

18.04.2018 17:05, byrn wrote:

Users on all platforms, including Linux, must be careful with memory
allocation and the HDF5 library! And plugins need an extra layer of caution.

I read this as “3rd-party filter plugins should use HDF5 memory
allocation functions” – is my understanding correct?

Plugins must be careful if they need to access the active HDF5 instance
- this is why the HDF5 library must be dynamically available for use by
both the application and the plugin.

I agree to be very careful :slight_smile: but please explain what the
possible/suggested steps are to get at that “active HDF5 instance”.

IF the plugin does not access state in the HDF5 library and does not
implement set_local or can_apply, the plugin may get away with static
linking.

Yes, but at the cost of requiring “the same C runtime build”. On Windows
that’s very hard or impossible to achieve; the requirement is quite
limiting.

NOTE that the HDF5 instance state is the key.

Memory allocation is very complicated w.r.t. buffers created by HDF5 and
those created by applications. Plugin usage depends on the plugin
author’s knowledge of intent.

While on *nix it’s easiest to get the binary by building from source,
on Windows it’s unreasonable to expect that all installed HDF5 library
versions are built against the same C runtime, have the same file name,
etc.

I’m not sure about the measures of knowledge, but my intent for my zstd
filter plugin is to have a binary (64-bit) artifact that could be
dropped into the user’s HDF5_PLUGIN_PATH and thus transparently enable
reading of zstd-compressed HDF5 datasets via HDFView, h5py, etc. With
this intent, what is the proposed solution, i.e. what is the suggested
approach for such a plugin?

Thank you for your support,
Andrey Paramonov

Hello!

Another memory management problem in a 3rd-party filter:

Best wishes,
Andrey Paramonov

18.04.2018 17:55, Andrey Paramonov wrote: