Access to user block on open file?

Hi,

Does anyone know if it's possible to read/modify the user block
associated with an open HDF5 file? The documentation for
H5Fget_vfd_handle says not to use the returned handle to modify the
file... is it safe to do things like seeking and reading?

Thanks,
Andrew Collette

Hmm. I know at least some of the VFDs in HDF5 maintain 'internal'
knowledge of the file's current offset apart from the underlying FILE*
stream or int fd descriptor to which they refer. So, if you did get in
there and do a seek or read operation, you'd need to make sure you
returned the FILE* stream or int fd to the state it was in BEFORE you
did anything. Otherwise, HDF5 and the underlying file object wouldn't
agree about where they are pointing.

Out of curiosity, what does get_vfd_handle return for the core vfd? Does
it return the actual buffer the core vfd is writing to? Is it even
implemented for core?

Mark

···

On Fri, 2010-03-26 at 13:26, Andrew Collette wrote:

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

--
Mark C. Miller, Lawrence Livermore National Laboratory
================!!LLNL BUSINESS ONLY!!================
miller86@llnl.gov urgent: miller86@pager.llnl.gov
T:8-6 (925)-423-5901 M/W/Th:7-12,2-7 (530)-753-851

Hello,

I am new to HDF5.

I am trying to create a very large number of groups; however, the code below hangs:

  chr1_id = H5Gcreate(file_id, "/chr1", H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
  for (i = 1; i <= 247249719; i++){
    k = sprintf(pos, "%d", i);
    group_id = H5Gcreate(chr1_id, pos, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    status = H5Gclose(group_id);
  }

Whenever I attempt to create, say, 1000 groups, it works just fine.
Is there a maximum number of groups that an HDF5 file supports?

Am I doing something wrong?

Thanks,
-Paul

Hi Andrew,

Hi,

Does anyone know if it's possible to read/modify the user block
associated with an open HDF5 file?

  Yes, this section of the file is not modified by the HDF5 library. We've thought about adding some API routines for reading & writing to it, but they haven't come to the top of the queue yet.

The documentation for
H5Fget_vfd_handle says not to use the returned handle to modify the
file... is it safe to do things like seeking and reading?

  Yes, it should be.

    Quincey

···

On Mar 26, 2010, at 3:26 PM, Andrew Collette wrote:

Hi Mark,

Hmm. I know at least some of the VFDs in HDF5 maintain 'internal'
knowledge of the file's current offset apart from the underlying FILE*
stream or int fd descriptor to which they refer. So, if you did get in
there and do a seek or read operation, you'd need to make sure you
returned the FILE* stream or int fd to the state it was in BEFORE you
did anything. Otherwise, HDF5 and the underlying file object wouldn't
agree about where they are pointing.

  Yes, that's true and falls under the "don't modify the file/file handle" caveat, although we should probably make that more explicit.

Out of curiosity, what does get_vfd_handle return for the core vfd? Does
it return the actual buffer the core vfd is writing to? Is it even
implemented for core?

  Yes, it's implemented for the core VFD and returns the pointer to the buffer.

  Quincey
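
A minimal sketch of the read-only access discussed above, under some loud assumptions: the file is opened with the sec2 (POSIX) driver, so that H5Fget_vfd_handle hands back a pointer to an int file descriptor, and the user block size has already been obtained from the file creation property list (for example via H5Pget_userblock). The helper name read_at_preserving_offset is hypothetical and not part of HDF5. It relies on POSIX pread, which reads at an explicit offset and does not move the descriptor's current position, so the offset the library remembers and the descriptor's real offset stay in agreement.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose pread() under strict compiler modes */
#endif

#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not an HDF5 API): read up to `len` bytes starting at
 * byte `offset` of `fd` into `buf` without moving the descriptor's current
 * file position. Returns the number of bytes read, or -1 on error.
 */
static ssize_t read_at_preserving_offset(int fd, void *buf, size_t len, off_t offset)
{
    size_t total = 0;

    assert(buf != NULL);

    while (total < len) {
        /* pread takes an explicit offset and leaves the file position alone. */
        ssize_t n = pread(fd, (char *)buf + total, len - total,
                          offset + (off_t)total);
        if (n < 0) {
            perror("pread");
            return -1;
        }
        if (n == 0)
            break;              /* end of file before `len` bytes were read */
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

With a descriptor obtained from the sec2 driver, the user block would be read with offset 0 and len set to the user block size. For drivers that hold a FILE* or a memory buffer instead (such as the core VFD mentioned above), this helper does not apply.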

···

On Mar 26, 2010, at 5:21 PM, Mark Miller wrote:

Hi Paul,

···

On Mar 26, 2010, at 5:24 PM, Paul Zumbo wrote:

  Looks fine to me. What version of the HDF5 library are you using?

  Quincey

Hi Paul and Quincey,

This is a known problem. We have some benchmark results that show similar behavior (Neil ran benchmarks for 1.6.7 and 1.8.1; see http://www.hdfeos.net/workshops/ws12/agenda.php, "Migrating from HDF5 1.6 to 1.8").
Beyond about 700,000 groups, HDF5 becomes very slow unless the latest file format is used (see below for how to enable it).

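/* Create a file access property list and request the latest file format. */
fapl_id = H5Pcreate(H5P_FILE_ACCESS);
H5Pset_libver_bounds(fapl_id, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);

/* Pass the property list as the last argument when creating the file. */
fid = H5Fcreate(...,...,...,fapl_id);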

Elena

···

On Mar 27, 2010, at 10:28 AM, Quincey Koziol wrote:

I am using version 1.8.3.
I can install the latest version 1.8.4 & see if that changes the result....

···

this is my signature.

On Mar 27, 2010, at 11:28 AM, Quincey Koziol <koziol@hdfgroup.org> wrote:

Hi Paul,

I am using version 1.8.3.
I can install the latest version 1.8.4 & see if that changes the result....

  That may help. BTW, why are you creating 247 million groups?

  Quincey

···

On Mar 27, 2010, at 11:22 AM, Paul Zumbo wrote:

Good question.

I had a vision of a structure in which each chromosome in the human genome is a group, and each chromosome group is divided into bases, which are also groups; those base groups would be filled with datasets from multiple experiments at single-base resolution...

in fact, if possible, I would like to have ~3.5 billion groups!

perhaps a structure like this isn't the best way to approach what I want, but...

-Paul

···

this is my signature.

On Mar 27, 2010, at 12:33 PM, Quincey Koziol <koziol@hdfgroup.org> wrote:

On Saturday 27 March 2010 17:39:06, Paul Zumbo wrote:

Good question.

I had a vision of a structure in which each chromosome in the human
genome is a group, and each chromosome group is divided into bases,
which are also groups; those base groups would be filled with datasets
from multiple experiments at single-base resolution...

in fact, if possible, I would like to have ~3.5 billion groups!

perhaps a structure like this isn't the best way to approach what I
want, but...

Having 3.5 billion groups definitely cannot be considered the best approach,
at least with HDF5. Even if you use the latest file format (as Elena
suggests), you still need around 1 KB per group, so 3.5 billion groups will
take about 3.5 TB (and perhaps far more once B-tree overhead is counted), and
that is just for keeping the *structure*.
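
The arithmetic behind that estimate can be sketched as follows. The 1000 bytes per group is an assumption: it is the "around 1 KB" figure above, rounded so that the result matches the quoted 3.5 TB, and it is not a measured value. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed per-group overhead in bytes (the "around 1 KB" estimate, not a
 * measurement). */
#define BYTES_PER_GROUP 1000ULL

/* Bytes of structural overhead for `n_groups` groups under that assumption. */
static uint64_t group_metadata_bytes(uint64_t n_groups)
{
    return n_groups * BYTES_PER_GROUP;
}

/* Convert bytes to decimal terabytes (1 TB = 10^12 bytes). */
static double bytes_to_terabytes(uint64_t bytes)
{
    return (double)bytes / 1e12;
}

/* Print the estimate for a labelled group count. */
static void print_estimate(const char *label, uint64_t n_groups)
{
    assert(label != NULL);
    printf("%s: %llu groups -> %.3f TB of structure\n", label,
           (unsigned long long)n_groups,
           bytes_to_terabytes(group_metadata_bytes(n_groups)));
}
```

Calling print_estimate("whole genome", 3500000000ULL) reports 3.500 TB, and the 247,249,719 groups of the chr1 loop alone would already come to roughly 0.247 TB under the same assumption.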

I'd suggest putting more data in each dataset so that you can reduce the
number of groups to a minimum. You will probably still have the B-tree
overhead, but with fine-tuned chunk sizes for your datasets it can be
reduced to a bare minimum. For an example of the kind of improvement you
can achieve, see:

http://www.pytables.org/docs/manual/ch05.html#chunksizeFineTune
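
As a rough illustration of that suggestion, here is a sketch of how base positions could map onto one chunked, one-dimensional dataset per chromosome instead of one group per base. Everything specific in it is an assumption made for illustration: the layout (one element per base), the chunk length of 65536 elements, and the names chunk_coord, locate_base and chunks_needed. It contains no HDF5 calls and only shows the index arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical chunk length, in elements; a real value would be tuned. */
#define CHUNK_LEN 65536ULL

/* Position of one base inside a chunked 1-D dataset. */
typedef struct {
    uint64_t chunk;   /* zero-based chunk index */
    uint64_t offset;  /* zero-based element offset inside that chunk */
} chunk_coord;

/* Map a 1-based base position (as in the loop counter of the original
 * question) to its chunk and offset. */
static chunk_coord locate_base(uint64_t pos)
{
    assert(pos >= 1);
    uint64_t idx = pos - 1;                 /* zero-based element index */
    chunk_coord c = { idx / CHUNK_LEN, idx % CHUNK_LEN };
    return c;
}

/* Number of chunks needed to hold `n_bases` elements (rounded up). */
static uint64_t chunks_needed(uint64_t n_bases)
{
    return (n_bases + CHUNK_LEN - 1) / CHUNK_LEN;
}
```

Under these assumptions the 247,249,719 positions of chr1 fit in 3773 chunks of a single dataset, rather than 247 million separate groups.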

Hope this helps,

···

--
Francesc Alted