H5Dget_chunk_info performance for many chunks?

wright · October 23, 2021, 4:57pm

Hello,

I tried to read chunk offsets and sizes for a file with a lot of chunks and it seemed to be slower than expected. The problem is illustrated in this gist. Is it traversing a list from the beginning for each read?

Is there some way to get all of the chunk file offsets + sizes?

Thanks for your help!

Jon

wright · October 25, 2021, 6:37am

For context: I am trying to get multithreaded read-only access to HDF5 files, and I followed the suggestion here to grab chunks for manual decoding:
https://dbkt.hdfgroup.org/original/2X/4/4257c1156390fdf328f0ce7940f13fef6c598aa1.pdf

Perhaps this is just a performance bug. Should I open a GitHub issue for it?

Thanks,

Jon

epourmal · October 25, 2021, 1:11pm

Could you please try the H5Dchunk_iter function? It is in the HDF5 develop branch.

wright · October 25, 2021, 6:56pm

Thanks! That looks like it is going to solve the problem.
