Read/write specific coordinates in multi-dimensional dataset?


Does anyone have a use case where it would be useful to read or write data at a list of individual coordinates in a multi-dimensional dataset?

h5py has the low-level machinery to support this, but currently the only documented way to use it is to select data with a boolean mask array. If your dataset is large, making a boolean array of the same shape in memory may be a problem. I made a PR some months ago to add a nicer API, but I don’t have a need for it myself, so I haven’t followed it up.
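For anyone who hasn't used the mask approach, here is a minimal sketch of what it looks like for reading. The file name, dataset name and values are made up for illustration, and the file is kept in memory with the `core` driver so nothing is written to disk. The point to notice is that the mask has one element per dataset element, which is where the memory cost comes from.

```python
import h5py
import numpy as np

# In-memory file (core driver, no backing store); names here are illustrative only.
with h5py.File("mask_demo.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("data", data=np.arange(12).reshape(3, 4))

    # The mask must have the same shape as the dataset,
    # even though only two points are wanted.
    mask = np.zeros(dset.shape, dtype=bool)
    mask[0, 1] = True
    mask[2, 3] = True

    # Reading with a mask returns a 1D array, in C (row-major) order.
    picked = dset[mask]
    mask_bytes = mask.nbytes  # one byte per dataset element
```

For a 3x4 dataset the mask is trivial, but it grows with the dataset's shape rather than with the number of points you actually want.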

So if someone would like this, please speak up, and please comment on whether the high-level API I proposed for it would work for your use case. If we don’t hear from anyone, it will likely be closed in a few more months as not useful.

Other things that are already possible from h5py:

  • Reading & writing points in a 1D dataset - this works like NumPy ‘fancy indexing’: index a dataset with an array/list of numbers.
  • Reading & writing points along a line in a multidimensional dataset - you can use ‘fancy indexing’ with one dimension, and regular indexing with the others.
  • Reading & writing arbitrary points one at a time - write a loop over your coordinates. Of course, this will be comparatively slow.
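
A rough sketch of the three options above, in case it helps to see them side by side. The dataset names, shapes and coordinates are invented for the example, and the file again lives only in memory. Note that h5py expects the index list to be in increasing order.

```python
import h5py
import numpy as np

# In-memory file (core driver, no backing store); names here are illustrative only.
with h5py.File("points_demo.h5", "w", driver="core", backing_store=False) as f:
    # 1D dataset: index with a list of positions (increasing order).
    d1 = f.create_dataset("line", data=np.arange(10))
    d1[[1, 3, 5]] = [10, 30, 50]
    read_1d = d1[[1, 3, 5]]

    # Multidimensional dataset: one list index, regular indices elsewhere.
    d2 = f.create_dataset("grid", data=np.arange(12).reshape(3, 4))
    read_row = d2[1, [0, 2]]  # points (1, 0) and (1, 2)

    # Arbitrary points: loop over the coordinates, one element per access.
    coords = [(0, 1), (2, 3), (1, 0)]
    for i, j in coords:
        d2[i, j] = -1
    read_loop = [int(d2[i, j]) for i, j in coords]
    grid_after = d2[...]
```

The loop is the only one of the three that handles points scattered across several dimensions, and it makes a separate HDF5 read or write call for each point, which is why it gets slow for long coordinate lists.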