Why the __db__ group?

Hi,

As the docs say, if I want an HDF5 file served by h5serv to be read-only,
I set the file permissions accordingly. If I set the permissions to read-only
and put a new HDF5 file in the data/ dir, it is detected and seems
to work fine. But if I don't do that, h5serv writes some metadata
into the root group of my HDF5 files: specifically ctime, mtime, and
a bunch of mappings from object ref ids to UUIDs. If h5serv works
without this stuff, I'm wondering how it works, and what the point
of having it is at all. I would prefer it wasn't written there by
default. Is there a setting?

Regards,

Sam.

PGP: 0xA53455C1

Hi Sam,

  As you surmised, the __db__ group is used to keep track of UUIDs, creation and modification times, and other features (like ACLs, i.e. Access Control Lists) that aren't currently modeled by the HDF5 library. For read-only files, this information is kept in a separate file whose name is just the original filename with a '.' prepended. The first time h5serv accesses a file, it iterates through all the objects in the file and either creates the __db__ group or (for read-only files) creates the dot file.
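
  A minimal sketch of the naming rule described above, not h5serv's actual code. The function names are hypothetical, and using os.access to decide between the two locations is an assumption about how "read-only" is detected.

```python
import os


def metadata_sidecar_path(hdf5_path):
    """Return the hidden "dot file" path for a read-only HDF5 file.

    Assumption: the dot file sits next to the data file and is named by
    prepending '.' to the file name.
    """
    directory, name = os.path.split(hdf5_path)
    return os.path.join(directory, "." + name)


def metadata_location(hdf5_path):
    """Describe where the bookkeeping would go, based on write permission."""
    if os.access(hdf5_path, os.W_OK):
        return "inside the file, in the __db__ group"
    return "in the dot file " + metadata_sidecar_path(hdf5_path)
```

  For a file at data/tall.h5 (a made-up name), the dot file would be data/.tall.h5.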

  We could have a setting that says "always use the external file", but there's a slight problem with using the external file if h5serv will be modifying the content. When a dataset or group is created through the REST API it is initially anonymous, i.e. there is no link to the object. In the REST API you typically do a POST request to create the new object, and then do a PUT to name the object using a link name. The server implementation uses a hidden link from the __db__ group to keep the anonymous object alive until a link is created. This would be somewhat more difficult to do if the hidden link existed in an external file.
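
  The POST-then-PUT sequence can be sketched as below. This only builds the two requests and sends nothing. The server address, the domain name, and the exact endpoint shapes (POST /groups, PUT /groups/<id>/links/<name> with the child id in a JSON body, domain passed in the Host header) are assumptions based on a reading of the HDF REST API, so check them against the h5serv documentation before relying on them.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Hypothetical values, not taken from this thread.
ENDPOINT = "http://127.0.0.1:5000"   # assumed h5serv address
DOMAIN = "myfile.example.org"        # assumed domain name for the file


def create_group_request():
    """Step 1: POST creates a new group that is still anonymous (unlinked)."""
    return Request(ENDPOINT + "/groups", data=b"", method="POST",
                   headers={"Host": DOMAIN})


def link_request(parent_id, link_name, child_id):
    """Step 2: PUT names the object by linking it from a parent group."""
    url = "{}/groups/{}/links/{}".format(
        ENDPOINT, parent_id, quote(link_name, safe=""))
    body = json.dumps({"id": child_id}).encode("utf-8")
    return Request(url, data=body, method="PUT",
                   headers={"Host": DOMAIN,
                            "Content-Type": "application/json"})
```

  Between the two requests the new object has no user-visible link, which is the window in which the hidden link from the __db__ group keeps it alive. If these were sent with urllib.request.urlopen, the child id for step 2 would presumably come from the JSON body of the step 1 response.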

  Exporting the file and removing the __db__ group is pretty trivial. In Python (using h5py) it would just be a statement like: del f['__db__']. This will delete all the h5serv-specific info and just leave a normal HDF5 file. The only side effect would be that any anonymous dataset or group objects would be deleted as well.
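
  A slightly fuller sketch of that export step, assuming h5py is installed and that the group is named __db__ as in the subject line. The function names and the copy-first approach are illustrative choices, not part of h5serv.

```python
import shutil


def remove_db_group(root, name="__db__"):
    """Delete the h5serv bookkeeping group from an open file, if present.

    `root` is anything with dict-style access, such as an h5py.File opened
    in "r+" mode. Returns True when a group was removed.
    """
    if name in root:
        del root[name]
        return True
    return False


def export_clean_copy(src_path, dst_path):
    """Copy an HDF5 file and strip the bookkeeping group from the copy."""
    # Third-party dependency, imported here so remove_db_group stays
    # usable on its own.
    import h5py

    shutil.copyfile(src_path, dst_path)
    with h5py.File(dst_path, "r+") as f:
        return remove_db_group(f)
```

  Working on a copy leaves the served file untouched. Note that deleting a link in HDF5 does not shrink the file on disk; running h5repack on the copy reclaims the space.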

  John


From: Hdf-forum <hdf-forum-bounces@lists.hdfgroup.org> on behalf of Sam Pinkus <sgpinkus@gmail.com>
Reply-To: HDF Users Discussion List <hdf-forum@lists.hdfgroup.org>
Date: Friday, May 27, 2016 at 5:52 PM
To: "hdf-forum@lists.hdfgroup.org" <hdf-forum@lists.hdfgroup.org>
Subject: [Hdf-forum] Why the __db__ group?
