Maximum number of datasets in a group with SWMR?

Dear community,

I am developing an application using HDF5 version 1.10.1 on Win64.
I encountered a problem when the number of datasets inside a group exceeds a certain limit.
Let’s say that I want to create 10 groups, each of them containing 6052 small (2x2 matrix of doubles) datasets.

Whenever the number of datasets per group is larger than 6051, the call to H5Gcreate for the 9th group fails (it returns -1).
The interesting thing is that the failure always occurs when creating the 9th group, no matter how many datasets
I created in the previous 8 groups (6052 or 100000…): as long as the number of datasets is > 6051, the problem appears.
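
A threshold like this can be located on another machine by bisection rather than by trial and error. The following is only a sketch: `first_failing_count` and the `fails` callback are hypothetical names (not part of HDF5). The callback is assumed to run the reproducer with a given number of datasets per group and report whether H5Gcreate failed. The search also assumes that once a count fails, every larger count fails too, which matches the behavior described above but is not proven.

```cpp
#include <cassert>
#include <functional>

// Returns the smallest n in [lo, hi] for which fails(n) is true,
// or hi + 1 if no value in the range fails.
// ASSUMPTION: failure is monotonic in n (if n fails, every larger n fails).
// `fails` is a hypothetical callback that would run the reproducer with
// n datasets per group and return true when a group could not be created.
int first_failing_count(int lo, int hi, const std::function<bool(int)> &fails)
{
    assert(lo <= hi);
    int result = hi + 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   // avoids overflow of lo + hi
        if (fails(mid)) {
            result = mid;               // mid fails; look for a smaller failing count
            hi = mid - 1;
        } else {
            lo = mid + 1;               // mid works; the threshold is above it
        }
    }
    return result;
}
```

With a stub such as `[](int n) { return n > 6051; }` in place of a real run, searching the range 1 to 100000 returns 6052.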

Note that this problem happens if and only if I use the SWMR feature!

The following is a small test that reproduces the problem.
Tested on Windows 10 and Windows 8.1 64 bit, with HDF5 pre-built binaries (version 1.10.1).
Compiled with Visual Studio 2015.

C++ code:

#include <hdf5.h>
#include <iostream>
#include <stdio.h>   /* sprintf */
#include <stdlib.h>

int main(int argc, char *argv[])
{
    /* status flag */
    herr_t status;

    /* file name */
    const char *file_name = "TEST.h5";

    /* buffer for group and dataset names */
    char group_name[1024];
    char dataset_name[1024];

    /* number of groups */
    int num_groups = 10;

    /* number of datasets per group */
    /* NOTE: fails if this number is larger than 6051!!! */
    int num_datasets_per_group = 6052;

    /*
     for test purposes. each dataset will be a 2x2 matrix
     with the following data
    */
    double data[4] = {1.0, 2.0, 3.0, 4.0};
    int num_dims = 2;
    hsize_t dims[2] = {2, 2};

    /*
     create and setup the file
    */

    hid_t h_file_proplist = H5Pcreate(H5P_FILE_CREATE);
    status = H5Pset_link_creation_order(h_file_proplist, H5P_CRT_ORDER_TRACKED | H5P_CRT_ORDER_INDEXED);
    hid_t h_file_acc_proplist = H5Pcreate(H5P_FILE_ACCESS);
    status = H5Pset_libver_bounds(h_file_acc_proplist, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);

    hid_t h_file_id = H5Fcreate(file_name, H5F_ACC_TRUNC, h_file_proplist, h_file_acc_proplist);
    status = H5Fstart_swmr_write(h_file_id);

    hid_t h_group_proplist = H5Pcreate(H5P_GROUP_CREATE);
    status = H5Pset_link_creation_order(h_group_proplist, H5P_CRT_ORDER_TRACKED | H5P_CRT_ORDER_INDEXED);

    /*
     write all groups
    */
    for(int j = 0; j < num_groups; j++)
    {
        /*
         create the j-th group
        */
        sprintf(group_name, "GROUP_%i", j+1);
        hid_t group = H5Gcreate(h_file_id, group_name, H5P_DEFAULT, h_group_proplist, H5P_DEFAULT);
        if(group < 0) {
            std::cerr << "Cannot create group \"" << group_name << "\"\n";
            break;
        }

        /*
         write all datasets for the j-th group
        */
        for(int i = 0; i < num_datasets_per_group; i++)
        {
            sprintf(dataset_name, "DATASET_%i", i+1);

            hid_t h_space = H5Screate_simple(num_dims, dims, NULL);
            if(h_space < 0) {
                std::cerr << "Cannot create the space for dataset \"" << dataset_name << "\"\n";
                break;
            }

            hid_t h_dset = H5Dcreate(group, dataset_name, H5T_IEEE_F64LE, h_space, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
            if(h_dset < 0) {
                H5Sclose(h_space);
                std::cerr << "Cannot create dataset \"" << dataset_name << "\"\n";
                break;
            }

            status = H5Dwrite(h_dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);

            H5Dclose(h_dset);
            H5Sclose(h_space);

            if(status < 0) {
                std::cerr << "Cannot write to dataset \"" << dataset_name << "\"\n";
                break;
            }
        }

        /*
         close the j-th group
        */
        H5Gclose(group);
    }

    /*
     release resources
    */

    H5Pclose(h_group_proplist);
    H5Fclose(h_file_id);
    H5Pclose(h_file_acc_proplist);
    H5Pclose(h_file_proplist);

    return 0;
}

And this is the error I get if the number of groups is larger than 8 and the number of datasets per group is larger than 6051.

HDF5 Error:

HDF5-DIAG: Error detected in HDF5 (1.10.1) thread 0:
  #000: C:\autotest\hdf5110-StdRelease-code-10vs14\build\hdfsrc\src\H5G.c line 323 in H5Gcreate2(): unable to create group
    major: Symbol table
    minor: Unable to initialize object
  #001: C:\autotest\hdf5110-StdRelease-code-10vs14\build\hdfsrc\src\H5Gint.c line 161 in H5G__create_named(): unable to create and link to group
    major: Symbol table
    minor: Unable to initialize object
  #002: C:\autotest\hdf5110-StdRelease-code-10vs14\build\hdfsrc\src\H5L.c line 1695 in H5L_link_object(): unable to create new link to object
    major: Links
    minor: Unable to initialize object
  #003: C:\autotest\hdf5110-StdRelease-code-10vs14\build\hdfsrc\src\H5L.c line 1939 in H5L_create_real(): can't insert link
    major: Symbol table
    minor: Unable to insert object
  #004: C:\autotest\hdf5110-StdRelease-code-10vs14\build\hdfsrc\src\H5Gtraverse.c line 867 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found
  #005: C:\autotest\hdf5110-StdRelease-code-10vs14\build\hdfsrc\src\H5Gtraverse.c line 639 in H5G_traverse_real(): traversal operator failed
    major: Symbol table
    minor: Callback failed
  #006: C:\autotest\hdf5110-StdRelease-code-10vs14\build\hdfsrc\src\H5L.c line 1782 in H5L_link_cb(): unable to create new link for object
    major: Links
    minor: Unable to initialize object
  #007: C:\autotest\hdf5110-StdRelease-code-10vs14\build\hdfsrc\src\H5Gobj.c line 546 in H5G_obj_insert(): unable to delete link messages
    major: Symbol table
    minor: Can't delete message
  #008: C:\autotest\hdf5110-StdRelease-code-10vs14\build\hdfsrc\src\H5Omessage.c line 988 in H5O_msg_remove(): unable to remove object header message
    major: Object header
    minor: Can't delete message
  #009: C:\autotest\hdf5110-StdRelease-code-10vs14\build\hdfsrc\src\H5Omessage.c line 1164 in H5O_msg_remove_real(): error iterating over messages
    major: Object header
    minor: Object not found
  #010: C:\autotest\hdf5110-StdRelease-code-10vs14\build\hdfsrc\src\H5Omessage.c line 1326 in H5O_msg_iterate_real(): can't pack object header
    major: Object header
    minor: Can't pack messages
  #011: C:\autotest\hdf5110-StdRelease-code-10vs14\build\hdfsrc\src\H5Oalloc.c line 2317 in H5O_condense_header(): can't move header messages forward
    major: Object header
    minor: Can't pack messages
  #012: C:\autotest\hdf5110-StdRelease-code-10vs14\build\hdfsrc\src\H5Oalloc.c line 1768 in H5O_move_msgs_forward(): unable to destroy flush dependency
    major: Object header
    minor: Unable to destroy a flush dependency
  #013: C:\autotest\hdf5110-StdRelease-code-10vs14\build\hdfsrc\src\H5AC.c line 1986 in H5AC_destroy_flush_dependency(): H5C_destroy_flush_dependency() failed
    major: Object cache
    minor: Unable to destroy a flush dependency
  #014: C:\autotest\hdf5110-StdRelease-code-10vs14\build\hdfsrc\src\H5C.c line 3989 in H5C_destroy_flush_dependency(): Parent entry isn't pinned
    major: Object cache
    minor: Unable to destroy a flush dependency
Cannot create group "GROUP_9"
HDF5: infinite loop closing library
      L,T_top,P,P,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD

Any ideas? Am I doing something wrong with SWMR?
Best regards

Massimo Petracca

Hi Massimo,
Can you test with the recently released 1.10.2 version?

	Quincey

Hi Koziol,
Thank you for your reply.
I just downloaded version 1.10.2 Win64 and the problem is still there.

Furthermore, I noticed another strange behavior (with both versions, and only when SWMR is ON).
As you can see from the code I posted, I named the groups with a pattern like “GROUP_” + incremental counter.
Now the same thing happens with other patterns:
“A_%i” -> OK
“AA_%i” -> FAILS
“AAA_%i” -> FAILS
“AAAA_%i” -> FAILS
“AAAAA_%i” -> FAILS
“AAAAAA_%i” -> OK
That is, it fails if I pick any pattern with 2 to 5 characters + underscore + incremental counter.
This also happens if I change the underscore to “-” (minus character).
It works, however, if I omit the “_” or “-” separator, or if I use other patterns such as “AAAA[%i]”.
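
To sweep naming patterns like these without editing the reproducer by hand each time, the group names can be generated from a pattern string. This is a minimal sketch, assuming plain ASCII quotes in the code and exactly one `%i` placeholder per pattern. `make_group_name` and `reported_patterns` are hypothetical helper names, not HDF5 functions; each generated name would then be passed to H5Gcreate in the reproducer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Build one group name by substituting the first "%i" in `pattern`
// with `index`. Hypothetical helper for illustration; it does plain
// string replacement and never calls printf-style formatting.
std::string make_group_name(const std::string &pattern, int index)
{
    const std::string placeholder = "%i";
    std::string name = pattern;
    std::string::size_type pos = name.find(placeholder);
    assert(pos != std::string::npos);   // every pattern must contain "%i"
    name.replace(pos, placeholder.size(), std::to_string(index));
    return name;
}

// The patterns listed in this message, in the order they were reported.
std::vector<std::string> reported_patterns()
{
    return {"A_%i", "AA_%i", "AAA_%i", "AAAA_%i",
            "AAAAA_%i", "AAAAAA_%i", "AAAA[%i]"};
}
```

For example, `make_group_name("GROUP_%i", 9)` yields `GROUP_9`, the name of the group that fails in the original test.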

This looks very strange since it seems to depend on multiple factors:

  1. use of SWMR
  2. many datasets with same pattern in a group
  3. many groups with same patterns as described above.

I hope this information is useful for finding the solution.

Thank you.

OK, that helps move us forward in “time” within the code. Are you able to build and test a version of the code from the git repo? There have been a lot of updates to the code in the last month, some of which are SWMR related, so it would be good to check with the most recent changes.

Quincey