Hello Gerd,
Thanks for your answer. I'm going to explain my view of an HDF5 RESTful
API and why we are looking into it at the moment.
I know about the OPeNDAP project, but it is focused only on
publishing data. The idea of a RESTful API would be to extend these
capabilities and create a mapping from HDF5 CRUD operations
(create, retrieve, update, delete) to URLs. This is more or less
like treating HDF5 files as databases, which is quite a challenge in
terms of keeping the files consistent in a concurrent-access
environment. (I think the next HDF5 version will implement
single-writer/multiple-reader functionality, which will make this task
much easier and more efficient.)
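To make the CRUD-to-URL mapping above a bit more concrete, here is a
minimal sketch in Python. It is only an illustration: a nested dict
stands in for the HDF5 group/dataset hierarchy, and the `handle`
function, the status codes chosen and the /weather/temperature path are
all hypothetical, not taken from any existing HDF5 REST specification.
A real service would sit behind a web framework and call an HDF5
library instead of touching a dict.

```python
# Hypothetical mapping from HTTP verbs to CRUD operations.
CRUD_BY_METHOD = {
    "POST": "create",
    "GET": "retrieve",
    "PUT": "update",
    "DELETE": "delete",
}


def handle(store, method, url_path, payload=None):
    """Apply one CRUD operation to `store`; return (status_code, body).

    `store` is a nested dict standing in for an HDF5 file: inner dicts
    play the role of groups, any other value plays the role of a dataset.
    """
    if method not in CRUD_BY_METHOD:
        return 405, None  # method not allowed

    parts = [p for p in url_path.split("/") if p]
    if not parts:
        return 400, None  # no object named in the URL

    # Walk the "groups" down to the parent of the target object.
    *groups, name = parts
    node = store
    for group in groups:
        node = node.get(group)
        if not isinstance(node, dict):
            return 404, None

    operation = CRUD_BY_METHOD[method]
    if operation == "create":
        if name in node:
            return 409, None  # already exists
        node[name] = payload
        return 201, payload

    if name not in node:
        return 404, None
    if operation == "retrieve":
        return 200, node[name]
    if operation == "update":
        node[name] = payload
        return 200, payload

    # Only "delete" is left at this point.
    del node[name]
    return 204, None
```

With `store = {"weather": {}}`, a POST to /weather/temperature creates
the dataset, GET reads it back, PUT replaces it and DELETE removes it.
Nothing here addresses the concurrency problem mentioned above; the
sketch assumes a single client.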
One of the first questions I would ask myself is: why would we want to
wrap HDF5, which is one of the neatest and most efficient systems for
data I/O, with a RESTful API, which is maybe the slowest and least
efficient way of transmitting data? (For RESTful APIs it's good
practice to encode all client/server data interchange in JSON.)
Answer 1: Because it gives great visibility/usability to our data.
Making use of RESTful technologies we can easily create web
applications or high level APIs to play with the data. We can also
create some basic functionality to append new data or manage
datasets.
Answer 2: This is completely silly. If you really want to do that,
storing your data in flat text files wouldn't make any difference in
performance. HDF5 files normally store raw data and nobody wants to
consume gigabytes of raw data that need to be transmitted and
processed on the client side.
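As a rough way to see the overhead behind this objection, the sketch
below encodes the same 1000 double-precision values once as JSON text
and once as packed binary. The sample data is made up for the
comparison, and `struct` is used only as a stand-in for a binary
format; it says nothing about how HDF5 itself lays data out on disk.

```python
import json
import random
import struct

# Made-up sample data: 1000 double-precision values.
random.seed(0)
values = [random.random() for _ in range(1000)]

# Text encoding, as a JSON-based REST response would carry it.
as_json = json.dumps(values).encode("utf-8")

# Packed little-endian float64 values, 8 bytes each.
as_binary = struct.pack(f"<{len(values)}d", *values)

print("JSON bytes:  ", len(as_json))
print("binary bytes:", len(as_binary))
```

The binary form is exactly 8 bytes per value, while the JSON form needs
one character per digit plus separators, and it still has to be parsed
back into numbers on the client.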
For me, both answers are right. That's why my approach would be to
create some kind of high-level RESTful API which adds more
intelligence on the server side, such as subsetting, aggregation or
statistical processing. As this is highly dependent on the type of
data stored in the HDF5 files, it would be a challenge to do something
generic that covers most cases. At the moment we are trying to come
up with a solution to implement all this in our project. So far,
Python seems to be the way to go for us, and we are starting to
implement something mainly based on h5py, pandas and Flask.
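To illustrate what I mean by server-side subsetting and aggregation,
here is a minimal sketch using only the Python standard library. A
plain list stands in for a one-dimensional HDF5 dataset, and the query
parameters (`start`, `stop`, `agg`) and the `query_dataset` function
are hypothetical names chosen for the example, not an existing API.

```python
import json
import statistics
from urllib.parse import parse_qs

# Aggregations the hypothetical endpoint is willing to compute.
AGGREGATIONS = {
    "min": min,
    "max": max,
    "mean": statistics.fmean,
}


def query_dataset(data, query_string):
    """Return a JSON response for a subset of `data`.

    `query_string` may contain `start` and `stop` (slice bounds) and
    `agg` (one of AGGREGATIONS). Without `agg` the raw subset is sent.
    """
    params = parse_qs(query_string)
    start = int(params.get("start", ["0"])[0])
    stop = int(params.get("stop", [str(len(data))])[0])
    subset = data[start:stop]

    agg = params.get("agg", [None])[0]
    if agg is None:
        return json.dumps({"start": start, "stop": stop, "values": subset})

    if agg not in AGGREGATIONS:
        raise ValueError(f"unsupported aggregation: {agg}")
    if not subset:
        raise ValueError("empty selection")

    # Only the reduced value travels to the client, not the raw data.
    result = AGGREGATIONS[agg](subset)
    return json.dumps({"start": start, "stop": stop, agg: result})
```

A request such as `?start=1&stop=4&agg=mean` would then return a single
number instead of the selected values, which is the point of moving the
processing to the server.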
I hope I've been clear in my explanation. Sorry about the length of
the post; any feedback on this topic will be highly appreciated.
Cheers,
Pablo
On 27 March 2014 23:27, Gerd Heber <gheber@hdfgroup.org> wrote:
Pablo, how are you? There's nothing we could share with you at the moment.
Have you read the specification? Do you have any comments/concerns?
(We are currently revising the spec...)
A related effort is OPeNDAP (http://www.opendap.org/).
Would you mind describing your use case?
Best, G.
-----Original Message-----
From: Hdf-forum [mailto:hdf-forum-bounces@lists.hdfgroup.org] On Behalf Of Pablo Rozas Larraondo
Sent: Thursday, March 20, 2014 6:20 PM
To: hdf-forum
Subject: [Hdf-forum] HDF5 RESTful API
I found out about the project to create a RESTful API for interacting with HDF5 from a question on Stack Overflow:
python - Pandas HDF5 as a Database - Stack Overflow
which points to this entry on this mailing list from early 2013:
http://hdf-forum.184993.n3.nabble.com/RESTful-HDF5-td4025808.html
I would like to know the current state of the implementation, or whether there are other projects already implementing this idea.
Many thanks,
Pablo
_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org