Can we use the HDF5 server API (h5serv) to obtain a dataset by its name?


#1

Hi all,
I am trying to use the h5serv API to access datasets from a server. Assume my h5 file, named hdfdata.h5, is stored in the hdf folder at example.com. As per the h5serv documentation, the access domain for this file will be hdfdata.hdf.example.com.
To access a dataset in this file, I first have to send a GET request for the datasets, something like get("http://127.0.0.1:5000/datasets?host=hdfdata.hdf.example.com"). This gives me the UUIDs of all the datasets in the file.
The output is something like:
'{"hrefs": [{"href": "http://127.0.0.1:5000/datasets?host=h5data.icmplus.neurosurg.cam.ac.uk", "rel": "self"}, {"href": "http://127.0.0.1:5000/groups/7c57b654-232d-11e8-bc5e-4cbb5813b2f6?host=h5data.icmplus.neurosurg.cam.ac.uk", "rel": "root"}, {"href": "http://127.0.0.1:5000/?host=h5data.icmplus.neurosurg.cam.ac.uk", "rel": "home"}], "datasets": ["7c57b655-232d-11e8-bc5e-4cbb5813b2f6", "7c57b656-232d-11e8-bc5e-4cbb5813b2f6"]}'
Then, to access the values in a dataset, I have to use that UUID, like get("http://127.0.0.1:5000/datasets/7c57b655-232d-11e8-bc5e-4cbb5813b2f6/value?host=hdfdata.hdf.example.com").
What I want to know is whether there is a way to access this data directly using the group and dataset names that we give when creating the HDF file. For example, if there is a dataset ds in group gp in our HDF file, can we access it directly using something like
get("http://127.0.0.1:5000/gp/ds/value?host=hdfdata.hdf.example.com")?

I ask because the UUIDs listed by the GET datasets request do not say which UUID corresponds to which dataset name, so we always need a mapping between UUID and dataset name before we can use them easily.
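
For reference, the two-request flow described in this post can be sketched in Python as below. This is only an illustration: the helper name value_urls is made up for this example, and it assumes the listing response has the shape shown in the output above (a "datasets" key holding a list of UUIDs).

```python
import json
from urllib.parse import urlencode


def value_urls(listing_json, endpoint, domain):
    """Build one /value URL per dataset UUID found in a /datasets listing.

    listing_json is the raw JSON text returned by GET /datasets?host=<domain>.
    The helper name and signature are hypothetical, not part of h5serv.
    """
    listing = json.loads(listing_json)
    query = urlencode({"host": domain})
    return [
        f"{endpoint}/datasets/{uuid}/value?{query}"
        for uuid in listing["datasets"]
    ]
```

Note that the result is still a list of UUID-based URLs; nothing in the listing says which name each UUID belongs to, which is the gap the question is about.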


#2

Hi,
Thanks for your interest in h5serv!

If you have a specific path in mind, you can start with the Root group, get the id for the link in the first element of the path. The link response will give you either the uuid for the dataset (if that’s the end of the path) or a subgroup (in which case you continue with the next element in the path). As an example, see testGetHard in the test case: https://github.com/HDFGroup/h5serv/blob/develop/test/integ/linktest.py.
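
A rough Python sketch of that link-by-link walk is below. It is not taken from h5serv or its tests. The function names are made up for illustration, and the response shapes are assumptions: that GET /?host=... returns a "root" UUID, and that GET /groups/<uuid>/links/<name>?host=... returns a "link" object with "class", "id" and "collection" fields. Check these against the h5serv REST documentation before relying on them.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def resolve_path(endpoint, domain, h5path, fetch=fetch_json):
    """Walk h5path one link at a time; return (collection, uuid).

    Hypothetical helper. Assumes the domain response carries a "root" UUID
    and each link response carries {"link": {"class", "id", "collection"}}.
    """
    query = "?" + urlencode({"host": domain})
    obj_id = fetch(endpoint + "/" + query)["root"]
    collection = "groups"
    for name in (part for part in h5path.split("/") if part):
        if collection != "groups":
            # The previous element was not a group, so the path cannot go on.
            raise KeyError(f"cannot descend into {collection} at {name!r}")
        url = f"{endpoint}/groups/{obj_id}/links/{quote(name, safe='')}{query}"
        link = fetch(url)["link"]
        if link.get("class") != "H5L_TYPE_HARD":
            raise KeyError(f"{name!r} is not a hard link")
        obj_id = link["id"]
        collection = link["collection"]
    return collection, obj_id
```

For a path like /gp/ds this makes three requests (domain, gp, ds). If the returned collection is "datasets", the UUID can then be used in the /datasets/<uuid>/value request.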

This can take a few round trips to the server for long paths, so for HSDS (our new S3-based data service), we’ve extended the API so you can also do requests like this: GET /datasets?host=hdfdata.hdf.example.com&h5path=/g1/g1.1/dset1.1.1. I haven’t back-ported this to h5serv yet though.

And finally, if you are using Python, try out h5pyd (https://github.com/HDFGroup/h5pyd). This package will let you fetch the dataset just as you would with h5py, e.g. f["/g1/g1.1/dset1.1.1"].

Let me know if this helps!
John


#3

Hi,
Thanks for these useful comments.

I have set up my HDF server on my local workstation, and I can access my .h5 files from REST API clients using the method above. However, I am having difficulty accessing .h5 files from Python code using h5pyd. I can open the file and print its shape, datatypes, etc., but when I try to get the values of a dataset in the .h5 file, the contents are all zeros. I don’t know what I am doing wrong.

If I access the same .h5 file locally from Python code using h5py, everything works fine. But when I use h5pyd to access the .h5 file from the server, all the contents are zero.

Please help.


#4

Here’s something to try to determine whether the root cause is your system / environment / setup: just create a free trial account on HDF Kita Lab (https://www.hdfgroup.org/solutions/hdf-kita/), upload your HDF5 data, and then try the same h5pyd commands on Kita to compare against your local results.

– dave


#5

Hi, thank you for your suggestions. I tried them and found the solution. I was using the .get command to access the values of the dataset instead of the ellipsis operator ([...]).
I corrected my mistake and I am now getting the required values.
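
For anyone who lands here with the same problem, a minimal sketch of the read that worked is below. The helper name read_dataset and its open_file parameter are made up for illustration. It assumes h5pyd.File accepts a domain, a mode and an endpoint keyword, and that indexing a dataset with [...] reads all of its values, as in h5py. The thread does not say exactly how .get was called, so this shows only the working form.

```python
def read_dataset(domain, h5path, endpoint, open_file=None):
    """Open a server-side HDF5 domain and return all values of one dataset.

    Hypothetical helper; open_file exists only so the sketch can be tried
    without a server. By default it uses h5pyd.File.
    """
    if open_file is None:
        import h5pyd  # third-party package, imported lazily

        open_file = h5pyd.File
    with open_file(domain, "r", endpoint=endpoint) as f:
        dset = f[h5path]  # a dataset handle: gives shape and dtype, not the values
        return dset[...]  # ellipsis indexing reads every element
```

With the names used earlier in this thread, the call would be read_dataset("hdfdata.hdf.example.com", "/gp/ds", "http://127.0.0.1:5000").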

Thanks again