HSDS + xarray + h5netcdf

Hey!

I am currently trying to upgrade dependencies in our project as numpy 1.* nears EOL.

These are the versions we are currently on and the ones we want to migrate to:

  • xarray==2022.12.0 → 2025.7.0
  • h5netcdf==1.3.0 → 1.6.4
  • h5py==3.10.0 →
  • h5pyd==0.14.1 → 0.23.0

What worked in the earlier versions was calling xarray's open_dataset directly (as long as the right environment variables with the token/credentials for HSDS access were set), like this:

import xarray as xr
ds = xr.open_dataset(f"{hsds_server_url}/{hsds_domain}", engine="h5netcdf")

But after upgrading, this throws a FileNotFoundError(url) raised from fsspec.implementations.http. I have debugged a little, and it seems fsspec tries to read info about the dataset by requesting the URL passed to xr.open_dataset directly, which gives a 404 on HSDS (as the correct call would be /domains?domain={hsds_domain}). Previously there must have been a parsing layer, either in the h5netcdf store or elsewhere, that understood this and has since been removed. I suspect it was h5pyd doing the parsing through its File interface, but I have not had the time to go through the release histories to find exactly how or where it changed.

Anyway, I realize this has nothing to do with h5pyd or HSDS directly, but if anyone else is using HSDS + xarray + h5netcdf and knows of an easy fix to make the code example above work with the latest versions, please let me know :slight_smile:

Does the traceback have any mention of h5netcdf or h5pyd? Based on your explanation, it seems likely the problem is in xarray.

@simhav Have you tried using hdf5:// instead of http://? This should work if the HSDS endpoint is set, either through the environment variable or in the config file. Perhaps xarray will let the hdf5:// form pass all the way through to h5netcdf/h5pyd.
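
To make that suggestion concrete, here is a rough sketch of splitting the combined URL from the original post into an endpoint and a domain, then building the hdf5:// form. The helper names (split_hsds_url, to_hdf5_uri) and the example host are made up for illustration. It assumes h5pyd reads the endpoint from the HS_ENDPOINT environment variable and that hdf5://home/... maps to the domain /home/...; whether xarray leaves an hdf5:// path untouched is not confirmed in this thread.

```python
import os
from urllib.parse import urlsplit


def split_hsds_url(url):
    """Split 'http://host:port/home/user/file.h5' into (endpoint, domain)."""
    parts = urlsplit(url)
    endpoint = f"{parts.scheme}://{parts.netloc}"
    # HSDS domains look like absolute paths, so keep exactly one leading slash.
    domain = "/" + parts.path.lstrip("/")
    return endpoint, domain


def to_hdf5_uri(domain):
    """Build the hdf5:// form of a domain path (assumed convention)."""
    return "hdf5://" + domain.lstrip("/")


# Hypothetical server and domain, for illustration only.
endpoint, domain = split_hsds_url(
    "http://hsds.example.org:5101/home/admin/my_datasets.h5"
)

# Only set the endpoint if it is not already configured elsewhere.
os.environ.setdefault("HS_ENDPOINT", endpoint)

print(endpoint)
print(domain)
print(to_hdf5_uri(domain))
```

The domain (or its hdf5:// form) would then be the path handed to xr.open_dataset instead of the full http:// URL.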

Also took me a while to figure it out. I needed to specify the driver:

import xarray as xr
hsds_path = "/home/admin/my_datasets.h5"  # HSDS domain path, not a local file
ds = xr.open_dataset(hsds_path, engine="h5netcdf", driver="h5pyd")

This requires your credentials to be configured for h5pyd (via the hsconfigure command).
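
If you open HSDS domains in several places, the same call can be wrapped in a small helper so the driver is never forgotten. This is only a sketch: hsds_open_kwargs and open_hsds_dataset are hypothetical names, and it assumes the endpoint and credentials are already configured as described above. xarray is imported inside the function so the kwargs helper can be used and tested without it installed.

```python
def hsds_open_kwargs(**extra):
    """Keyword arguments that route xarray's h5netcdf backend through h5pyd."""
    kwargs = {"engine": "h5netcdf", "driver": "h5pyd"}
    # Caller-supplied options (e.g. chunks) are merged on top of the defaults.
    kwargs.update(extra)
    return kwargs


def open_hsds_dataset(domain, **extra):
    """Open an HSDS domain such as '/home/admin/my_datasets.h5' with xarray."""
    import xarray as xr  # imported lazily; needs xarray, h5netcdf and h5pyd

    return xr.open_dataset(domain, **hsds_open_kwargs(**extra))


# Example call (requires a reachable HSDS server and valid credentials):
# ds = open_hsds_dataset("/home/admin/my_datasets.h5")
```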


Aha! This did it! Thank you so much :slight_smile:

Yes, the problem was the way xarray handled opening by default in the newer version! I did not know about the “driver” kwarg which is what makes it pass all the way down to use h5netcdf/h5pyd.