Performance of reading references

Hi all,

I need to read an array of object references. I find the function
H5Rget_name() quite slow. A loop like

    for (i = 0, refPR = refP; i < arrayLen; ++i, ++refPR) {
        size = H5Rget_name((hid_t)loc_id, H5R_OBJECT, refPR, rName,
                           rname_buf_size);
    } /* for (i = 0...) */

takes about 19 seconds on my laptop for an array of length 3456, where
loc_id is the object id of the object reference array. On the other hand,

    time h5dump Cells.AreaShape_Eccentricity.mat > /dev/null

finishes in 1.6 seconds. h5dump not only lists the name of each
reference in the object reference array (which is what the loop above is
supposed to do), but also dumps each referenced dataset, each of which is a
float array with a couple of hundred elements. Still, it is more than a
factor of 10 faster. Unfortunately, the source code of h5dump is not
straightforward to read, so I hope someone can tell me the trick
h5dump uses to get such good performance.

Best regards,

Bernd

Hi Bernd,

On Nov 15, 2011, at 4:51 PM, Bernd Rinn wrote:

Hi all,

I need to read an array of object references. I find the function
H5Rget_name() quite slow. A loop like

    for (i = 0, refPR = refP; i < arrayLen; ++i, ++refPR) {
        size = H5Rget_name((hid_t)loc_id, H5R_OBJECT, refPR, rName,
                           rname_buf_size);
    } /* for (i = 0...) */

takes about 19 seconds on my laptop for an array of length 3456, where
loc_id is the object id of the object reference array. On the other hand,

    time h5dump Cells.AreaShape_Eccentricity.mat > /dev/null

finishes in 1.6 seconds. h5dump not only lists the name of each
reference in the object reference array (which is what the loop above is
supposed to do), but also dumps each referenced dataset, each of which is a
float array with a couple of hundred elements. Still, it is more than a
factor of 10 faster. Unfortunately, the source code of h5dump is not
straightforward to read, so I hope someone can tell me the trick
h5dump uses to get such good performance.

  h5dump goes faster because it traverses the entire file, building a list of all the objects in the file. Then each name is pulled from that table when displayed later. Calling H5Rget_name individually will trigger a partial traversal of the file to generate the path name...

  Quincey
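
For anyone wanting to try the same approach in their own code, here is a rough sketch of the table idea described above: collect (address, name) pairs once, sort them, and then resolve each reference with a binary search instead of a fresh traversal. It deliberately does not call HDF5. The names name_entry, name_table, table_add, table_finish and table_lookup are made up for this illustration, and addresses are plain 64-bit integers. In a real program the table would presumably be filled from a single traversal (for example from a callback passed to H5Ovisit), and the assumption that an object reference can be matched against the address reported during that traversal should be checked against your HDF5 version.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_LEN 256

/* One table entry: an object's address and its full path name.
 * Both are stand-ins here; real code would take them from the traversal. */
typedef struct {
    uint64_t addr;
    char     name[NAME_MAX_LEN];
} name_entry;

/* Growable array of entries; initialise with {0}. */
typedef struct {
    name_entry *entries;
    size_t      len;
    size_t      cap;
} name_table;

/* Append one (addr, name) pair. Meant to be called once per object
 * while the file is traversed a single time. Returns 0 on success. */
int table_add(name_table *t, uint64_t addr, const char *name)
{
    size_t n = strlen(name);
    if (n >= NAME_MAX_LEN)
        return -1;                       /* name does not fit */
    if (t->len == t->cap) {
        size_t ncap = t->cap ? 2 * t->cap : 64;
        name_entry *p = realloc(t->entries, ncap * sizeof *p);
        if (p == NULL)
            return -1;                   /* out of memory */
        t->entries = p;
        t->cap = ncap;
    }
    t->entries[t->len].addr = addr;
    memcpy(t->entries[t->len].name, name, n + 1);
    t->len++;
    return 0;
}

/* Order entries by address. */
static int cmp_addr(const void *a, const void *b)
{
    uint64_t x = ((const name_entry *)a)->addr;
    uint64_t y = ((const name_entry *)b)->addr;
    return (x > y) - (x < y);
}

/* Sort once after the traversal so that every lookup is a binary search. */
void table_finish(name_table *t)
{
    if (t->len > 1)
        qsort(t->entries, t->len, sizeof *t->entries, cmp_addr);
}

/* Return the stored name for addr, or NULL if the address is unknown.
 * Requires table_finish() to have been called after the last table_add(). */
const char *table_lookup(const name_table *t, uint64_t addr)
{
    name_entry key;
    const name_entry *hit;

    assert(t != NULL);
    if (t->len == 0)
        return NULL;
    key.addr = addr;                     /* only addr is compared */
    hit = bsearch(&key, t->entries, t->len, sizeof *t->entries, cmp_addr);
    return hit ? hit->name : NULL;
}
```

With a table like this, the per-reference cost is one binary search over the objects in the file rather than one partial traversal per call, which is the trade-off the reply above describes. The caller frees t.entries when done.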