Memory management in H5FD_class_t

It feels like the I/O filter interface should also provide a capabilities interface to let the core HDF5 routines know whether it’s OK to pass a buffer allocated by the VFD as input to an I/O filter, or whether a data copy via H5FD_CTL__MEM_COPY is needed.

I think heading in this direction makes a lot of sense and is a good next step for opening up the HDF5 ecosystem for use with this type of hardware. The I/O filter interface probably does need better provisions for a filter to tell HDF5 which types of buffers it can handle, and which types of buffers it might hand back. And as you noted in another thread, the I/O filter interface doesn’t currently give you the file handle, so a filter can’t make an HDF5 call to generically request memory management. That would be nice to include if the I/O filter interface is revised.
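To make the capabilities idea a bit more concrete, here is a minimal sketch of what such a check could look like. None of these names exist in HDF5: `mem_kind_t`, `filter_caps_t`, and `filter_needs_copy()` are hypothetical, and the real design would presumably hang off whatever replaces the filter class struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: kinds of memory a buffer may live in. Not part of HDF5. */
typedef enum {
    MEM_KIND_HOST   = 1, /* ordinary host memory (malloc semantics) */
    MEM_KIND_DEVICE = 2  /* memory allocated by a GPU-aware VFD */
} mem_kind_t;

/* Hypothetical capability record a filter would register with the library. */
typedef struct {
    unsigned accepts_input;   /* bitmask of mem_kind_t the filter can read from */
    unsigned produces_output; /* bitmask of mem_kind_t the filter may hand back */
} filter_caps_t;

/* Returns true when the library would have to stage the buffer through a
 * copy (for example via the VFD's memory-copy ctl operation) before it
 * could call the filter on it. */
static bool filter_needs_copy(const filter_caps_t *caps, mem_kind_t buf_kind)
{
    assert(caps != NULL);
    return (caps->accepts_input & (unsigned)buf_kind) == 0;
}
```

With something like this, a host-only filter handed a VFD-allocated device buffer would report that a copy is required, while a GPU-aware filter could take the buffer directly.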

  1. Introduce a variable such as io_info->using_gpu_vfd
  2. Initialize the H5D_layout_ops par_read member so that it points to H5D__chunk_gpu_read()
  3. Let H5D__chunk_gpu_read() read the compressed data via the VFD
  4. Let H5D__chunk_gpu_read() prepare an iovec-like array with the offset and length of each chunk
  5. Let H5D__chunk_gpu_read() call into the I/O filter, providing the iovec as input (this would need a change to the H5Z_class2_t APIs – possibly bumping to H5Z_class3_t)
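Step 4 in the list above could be sketched roughly as follows. This is only an illustration under my own assumptions: `chunk_iov_t` and `build_chunk_iov()` are hypothetical names, and I am assuming the compressed chunks were read back to back into a single staging buffer, so each offset is relative to the start of that buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor: where one compressed chunk sits inside the
 * staging buffer that was filled through the VFD. Not an HDF5 type. */
typedef struct {
    size_t offset; /* byte offset from the start of the staging buffer */
    size_t length; /* compressed size of the chunk in bytes */
} chunk_iov_t;

/* Fill iov[0..n-1] for n chunks packed contiguously in the staging buffer.
 * Returns the total number of bytes covered by all chunks. */
static size_t build_chunk_iov(const size_t *chunk_sizes, size_t n, chunk_iov_t *iov)
{
    size_t pos = 0;

    assert(n == 0 || (chunk_sizes != NULL && iov != NULL));
    for (size_t i = 0; i < n; i++) {
        iov[i].offset = pos;
        iov[i].length = chunk_sizes[i];
        pos += chunk_sizes[i];
    }
    return pos;
}
```

A revised filter callback could then take the staging buffer plus this array and decompress all chunks in one call, instead of being invoked once per chunk.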

I feel like the approach to something like this might become more obvious once the basic infrastructure to support GPU I/O filters is in place, but I wonder whether this re-invents more of the wheel than necessary. Since the reads for chunk data go down to the file driver layer, you should be able to reuse HDF5’s existing chunking support for this, without needing to write your own read/write routines. There will most likely need to be some refactoring to deal with places where the library assumes buffers are allocated with malloc, to support the changes described for H5Z_class_t, and so on, but at least in theory it should be relatively straightforward to leverage HDF5’s existing chunking code.

One more thing on this topic: I remembered that H5FDctl is not really an application-level routine in HDF5 at the moment; it was added more as library-internal support for the GDS VFD. I think it would make sense for us to repurpose H5allocate_memory, H5resize_memory, and H5free_memory for VFD-level memory management within HDF5 applications, rather than having those routines simply use malloc semantics. There may be some application compatibility concerns there, though.
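As a rough illustration of what "VFD-level memory management" behind the allocation routines might mean, here is a small sketch. Everything in it is hypothetical: `vfd_mem_ops_t`, `lib_allocate_memory()`, and `lib_free_memory()` are stand-ins I made up, not existing HDF5 APIs, and the `demo_*` functions just wrap malloc/free where a real GPU-aware driver would call its own allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical table of memory callbacks a VFD could expose. */
typedef struct {
    void *(*alloc)(size_t size);
    void  (*release)(void *ptr);
} vfd_mem_ops_t;

/* Route the allocation to the VFD when it provides a callback;
 * otherwise keep today's malloc semantics. */
static void *lib_allocate_memory(const vfd_mem_ops_t *ops, size_t size)
{
    if (ops != NULL && ops->alloc != NULL)
        return ops->alloc(size);
    return malloc(size);
}

/* Release memory through the same path that allocated it. */
static void lib_free_memory(const vfd_mem_ops_t *ops, void *ptr)
{
    if (ops != NULL && ops->release != NULL)
        ops->release(ptr);
    else
        free(ptr);
}

/* Demo callbacks standing in for a driver-specific allocator; the counters
 * only exist to show which path was taken. */
static int demo_alloc_calls   = 0;
static int demo_release_calls = 0;

static void *demo_alloc(size_t size)
{
    demo_alloc_calls++;
    return malloc(size);
}

static void demo_release(void *ptr)
{
    demo_release_calls++;
    free(ptr);
}
```

The compatibility concern shows up right here: an application that allocates through a routine like this and then releases the buffer with plain free() would break as soon as the VFD supplies its own allocator.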

Certainly a lot to think about here!
