Computational storage with HDF5-UDF



I’m happy to announce the availability of a new experimental backend for HDF5-UDF that lets one populate dataset values using CUDA kernels. Moreover, if the user-defined function takes input from other datasets in the same HDF5 file, those dependencies are DMA-transferred from disk to GPU memory using NVIDIA’s GPUDirect Storage.

Here’s a screenshot that gives you an idea of how to use this backend. Note how simple it is to invoke the kernel: the data retrieved with lib.getData() is allocated in GPU memory, so it’s readily available to the CUDA kernel. HDF5-UDF takes care of copying the results from device memory to the host, too, so no explicit calls to NVIDIA APIs are needed to get started.

A current limitation of this implementation is that DMA transfers are only possible if dependencies have a contiguous layout on disk. It would be nice if we had an API such as H5Dget_chunk_offsets(hid_t dset_id) that provided the extents where the dataset chunks are stored. With that, we could both DMA-transfer chunked datasets and decompress them on the GPU itself.

Please visit the project’s GPUDirect Storage branch if you’re interested in testing this feature.

Have fun!


Hi Lucas,

Thank you for the nice new feature! Are you looking for the H5D_GET_CHUNK_INFO function? It provides the chunk address in the file.



H5D_GET_CHUNK_INFO retrieves the offset coordinates (offset), filter mask (filter_mask), size (size), and address (addr) for the dataset specified by the identifier dset_id and the chunk specified by the index index. The chunk belongs to a set of chunks in the selection specified by fspace_id. If the queried chunk does not exist in the file, the size will be set to 0 and the address to HADDR_UNDEF.
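
To illustrate how the per-chunk address and size could feed a DMA transfer plan, here is a minimal sketch in plain Python. It does not call HDF5 or GPUDirect Storage: ChunkRecord and plan_reads are hypothetical names made up for this example, the sample addresses are invented, and None stands in for HADDR_UNDEF. The idea is simply to skip chunks that were never written and to merge chunks that sit back to back in the file, so that fewer reads are issued.

```python
from typing import List, NamedTuple, Optional, Tuple


class ChunkRecord(NamedTuple):
    """Hypothetical mirror of the values H5D_GET_CHUNK_INFO reports for one chunk."""
    chunk_offset: Tuple[int, ...]  # logical coordinates of the chunk in the dataset
    filter_mask: int               # filter mask reported for the chunk
    byte_offset: Optional[int]     # address in the file; None stands in for HADDR_UNDEF
    size: int                      # stored size in bytes; 0 if the chunk does not exist


def plan_reads(records: List[ChunkRecord]) -> List[Tuple[int, int]]:
    """Return (file_offset, nbytes) extents sorted by address, merging adjacent ones."""
    # Drop chunks that were never allocated in the file.
    allocated = [r for r in records if r.size > 0 and r.byte_offset is not None]
    allocated.sort(key=lambda r: r.byte_offset)

    extents: List[Tuple[int, int]] = []
    for rec in allocated:
        if extents and extents[-1][0] + extents[-1][1] == rec.byte_offset:
            # This chunk starts exactly where the previous extent ends: grow it.
            start, nbytes = extents[-1]
            extents[-1] = (start, nbytes + rec.size)
        else:
            extents.append((rec.byte_offset, rec.size))
    return extents


# Invented sample data, for illustration only.
records = [
    ChunkRecord((0, 0), 0, 4096, 1024),
    ChunkRecord((0, 10), 0, 5120, 1024),  # starts where the previous chunk ends
    ChunkRecord((10, 0), 0, None, 0),     # never written
    ChunkRecord((10, 10), 0, 8192, 512),
]
print(plan_reads(records))
```

Merging extents like this only makes sense when the chunks are copied as raw bytes. For compressed chunks the individual boundaries would still be needed so that each chunk can be decompressed separately on the GPU. I believe h5py exposes the same fields through its low-level DatasetID.get_chunk_info call, but I have not verified that against this sketch.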


Oh, this is brand new information for me – I was unaware of this new API! Yes, H5D_GET_CHUNK_INFO should definitely do it. I will look into incorporating support for chunked datasets soon.

Thanks for the pointer!