h5serv and h5pyd via SSH access

I’m very new to h5serv and HSDS, and have a question about some setup issues I’m having.

To set the scene, I have some large data files on a cluster to which I have SSH access. I want to query them remotely, so I can analyze parts of the data without downloading the whole file.

On the server I have h5serv running and listening on port 5000, and I have configured SSH on my local machine with a local forward for port 5000. On my local machine I then run a simple h5pyd script that does the following:

import h5pyd as h5py

if __name__ == '__main__':
    DOMAIN_PATH="/some/dir/data.h5"
    f = h5py.File(DOMAIN_PATH, "r")

To run this I do the following and get this error:

$ export HS_USERNAME="my_user_name"
$ export HS_ENDPOINT="http://localhost:5000"
$ python server_test.py
Traceback (most recent call last):
  File "/venv/server_test.py", line 10, in <module>
    f = h5py.File(DOMAIN_PATH, "r")
  File "/venv/lib/python3.10/site-packages/h5pyd/_hl/files.py", line 296, in __init__
    raise IOError(rsp.status_code, rsp.reason)
OSError: [Errno 403] Forbidden

The error log from the server side is:

$ h5serv
Using logfile:  h5serv.log
password_uri config: util/admin/passwd.h5
Setting log level to: INFO
INFO:app.py:3312::log test
favicon_path: favicon.ico
Static content in the path:static will be displayed via the url: /views/(.*)
isdebug: True
domain: hdfgroup.org
ssl_port: 6050
Setting watchdog on:  data
INFO:passwordUtil.py:56::getAuthClient
INFO:passwordUtil.py:58::password_uri:util/admin/passwd.h5
INFO:authFile.py:33::AuthFile class init(util/admin/passwd.h5)
INFO:app.py:3364::INITIALIZING...
INFO:app.py:3365::Starting event loop on port: 5000
Starting event loop on port: 5000
INFO:app.py:291::getFilePath: /some/dir/data.h5 checkExists: True
INFO:app.py:293::tocFilePath: data/.toc.h5
host: /some/dir/data.h5 topdomain: hdfgroup.org
top-level domain is not valid
403 GET /?getdnids=1&getobjs=T&include_attrs=T&domain=%2Fsome%2Fdir%2Fdata.h5 (127.0.0.1) 3.06ms

From the error message, the top-level domain isn’t valid, but I don’t understand what to set it to or how to configure DNS to solve this problem. Any help would be great; I’m a little out of my depth here.

Hi, h5serv has been deprecated (I just pushed a README update for this) and I’d recommend you use HSDS instead. Besides supporting many new features compared with h5serv, you should find the documentation a bit more helpful as well.

HSDS would be well suited for providing access to an existing repository of HDF5 files. I can give you some pointers for setting up access to your file collection, but first give the HSDS install a go and let me know if you have any issues getting it up and running.
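
If you need to stay on h5serv in the meantime, the 403 in your log appears to come from the check of the requested domain against the configured top-level domain (hdfgroup.org in your output). As far as I recall, h5serv expects a DNS-style domain name rather than a file system path: the path relative to the server's data directory, with the extension dropped and the components reversed, followed by the top-level domain. Treat that convention as an assumption to verify against the h5serv docs. The helper name path_to_domain below is hypothetical and only illustrates the mapping:

```python
from pathlib import PurePosixPath


def path_to_domain(rel_path: str, top_domain: str = "hdfgroup.org") -> str:
    """Build a DNS-style domain from a path relative to the data directory.

    Assumed convention (not confirmed in this thread): drop the file
    extension, reverse the path components, append the top-level domain.
    """
    path = PurePosixPath(rel_path)
    if path.is_absolute():
        # The server log shows an absolute path being rejected, so this
        # sketch only accepts paths relative to the data directory.
        raise ValueError("path must be relative to the server's data directory")
    # "dir/data.h5" -> ["dir", "data"] -> ["data", "dir"]
    parts = list(path.with_suffix("").parts)
    parts.reverse()
    return ".".join(parts + [top_domain])


if __name__ == "__main__":
    print(path_to_domain("data.h5"))
    print(path_to_domain("dir/data.h5"))
```

Under that assumption, the script would open path_to_domain("data.h5") instead of "/some/dir/data.h5", and the file would need to sit under the directory the server is watching (data in your log). File or directory names that contain dots would be ambiguous in this scheme.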