Indexing for frequent updates?

Hi all,

I have a question about indexing methods for HDF5.

(1) From what I have read in the archives, there was an effort several
years ago to build production-level indexing. Can anyone tell me what
its status is now?

(2) Both FastQuery and PyTables seem to be geared toward append-only and
read-only use. If an application demands a lot of updates/deletes, is
there any good indexing method I can use? Or is this scenario too rare
to be considered in the real world?

Thanks very much.

Best.

--

Best,

Jun Yuan-Murray

-------------------------------------------------------------
PhD Candidate, CS Dept, SPLAT
Stony Brook University

Hi Jun,

We have been working for some months now on integrating both the FastBit
and ALACRITY indexing libraries into HDF5. We have also defined a new
indexing API for HDF5 that will allow you to build indexes on specific
datasets. More details will be posted to this mailing list soon.

For now, if an application frequently updates or modifies values, the
index attached to the dataset must be rebuilt entirely. This is
currently due to limitations in the FastBit/ALACRITY index packages,
which do not support incremental updates, although the indexing API that
we defined does. So this scenario is not well supported yet, but it will
be in the future.
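
To illustrate the difference between a full rebuild and an incremental
update, here is a minimal Python sketch. It is not the FastBit, ALACRITY,
or HDF5 indexing API: `ToyIndex` and its methods are hypothetical names
made up for this example, and plain sets of row positions stand in for
whatever structure a real index package uses.

```python
from collections import defaultdict


class ToyIndex:
    """Toy index (hypothetical): maps each distinct value to the row positions holding it."""

    def __init__(self, data):
        self.rebuild(data)

    def rebuild(self, data):
        # Full rebuild: rescans every element, even if only one value changed.
        self.rows = defaultdict(set)
        for pos, value in enumerate(data):
            self.rows[value].add(pos)

    def update(self, pos, old, new):
        # Incremental update: touches only the entries for the old and new value.
        self.rows[old].discard(pos)
        if not self.rows[old]:
            del self.rows[old]
        self.rows[new].add(pos)

    def lookup(self, value):
        # Return the sorted row positions that currently hold `value`.
        return sorted(self.rows.get(value, ()))


data = [3, 1, 3, 2]
index = ToyIndex(data)            # will be kept current by full rebuilds
incremental = ToyIndex(data)      # will be kept current by incremental updates

# Modify one element of the dataset.
old, data[2] = data[2], 1

index.rebuild(data)                     # cost grows with the dataset size
incremental.update(2, old, data[2])     # cost independent of the dataset size

# Both strategies end up answering queries identically.
assert index.lookup(1) == incremental.lookup(1)
assert index.lookup(3) == incremental.lookup(3)
```

With many small updates, the rebuild path repeats the full scan each
time, which is why an update-heavy workload is expensive when the
underlying index package offers only the rebuild option.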

Thanks

Jerome

On Wed, 2014-12-03 at 12:40 -0500, Jun Yuan-Murray wrote:

[...]
_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5

Hi Jerome,

This is interesting to hear. I tried to use FastBit some time ago but
had some portability issues, as I could not get it to compile under
Windows / MinGW. It is a very interesting library, though.

When would you expect a release of this integration, and would an early
version be available for testing?

Werner

On 08.12.2014 17:58, Jerome Soumagne wrote:

[...]

--
___________________________________________________________________________
Dr. Werner Benger Visualization Research
Center for Computation & Technology at Louisiana State University (CCT/LSU)
2019 Digital Media Center, Baton Rouge, Louisiana 70803
Tel.: +1 225 578 4809 Fax.: +1 225 578-5362

Hi Werner,

We are still deciding how we want to release it, so I can't really say
anything yet, but we'll try to release it incrementally so that you can
benefit from new features as early as possible.

I can point you to my branch on GitHub (based on HDF5 trunk):

You can have a look at test/index.c or test/query.c.

To enable FastBit support, turn on HDF5_ENABLE_FASTBIT_SUPPORT for CMake
or pass the --with-fastbit option if you use configure.

We'll also soon distribute the corresponding RFC.

Jerome

On Mon, 2014-12-08 at 19:33 +0100, Werner Benger wrote:

[...]