How exactly is async I/O made available in HDF5?

Follow-Up to this thread

I am still somewhat confused about how async I/O is actually made available in HDF5.

First up: I'm tasked with evaluating the performance of different HDF5 features, such as subfiling and async I/O, in an HPC context for the DKRZ (Deutsches Klimarechenzentrum GmbH, Hamburg) as part of my internship. My prior knowledge of HDF5 was basically non-existent. For this purpose I am to use an environment that mostly depends on spack.

Through spack there are HDF5 packages available, such as hdf5 itself, which has a configuration option to enable subfiling support by default.

Now, for async I/O, I have followed the guide here and installed the hdf5-vol-async package available through spack with its default configuration.

After I had fixed my previous issue from the aforementioned thread, some questions arose for me while using async I/O. This might all come down to me overlooking the relevant sections that detail the proper usage of vol-async, but I just can't seem to piece it together properly.

In order to use vol-async, it needs to be installed through spack and loaded, and then a couple of environment variables have to be set.
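
For reference, the install-and-load part of that workflow is just the usual spack commands (the environment variables, which come from the vol-async docs, are discussed below):

spack install hdf5-vol-async
spack load hdf5-vol-async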

The docs mention that:
HDF5-1.13 and later versions added a number of new APIs that allow applications to take advantage of the asynchronous I/O feature provided by the asynchronous I/O VOL connector. They are part of the HDF5 public header, so users only need to include the HDF5 header file (hdf5.h) to use them.

I had assumed the HDF5 async APIs and the Async VOL APIs to be one and the same. Is this not the case? Swapping all calls for their _async counterparts is easy and straightforward, but getting vol-async to work properly has been problematic when I load the spack package myself and set the environment variables manually.

Does this mean hdf5-vol-async does not need to be installed separately and can instead be used right away with regular HDF5, since it has been built in and available since HDF5-1.13? Or does it simply mean that while hdf5-vol-async still needs to be installed and loaded separately, adding a different header is not needed?

In the former case all my other questions basically become irrelevant, as that would mean I have been trying to skin the cat two different ways, and I would not need to care about installing hdf5-vol-async with spack, loading it and setting the environment variables at all.

First:
I currently encounter errors again if I set export HDF5_VOL_CONNECTOR="async under_vol=0;under_info={}". What exactly is this for and what does it do? Do I have to change anything for this to work properly, or can I just copy-paste this and export it as-is?

In general, do I even have to set the environment variables when installing hdf5-vol-async with spack? The section in the docs about setting environment variables only exists under the "Building from source code" section, but if you continue in the docs to the implicit/explicit section, setting the environment variables is mentioned as a requirement.

Second:
I have noticed that creating and reading a file with the _async functions works even without loading hdf5-vol-async through spack and without setting its environment variables.

My understanding is that HDF5 generally knows about a VOL and can refer to it, but won't do anything without the VOL being present, and will simply pass the API calls to that VOL. So how is it that creating a file using H5Dcreate_async() works without loading hdf5-vol-async? Does it detect that the VOL is not present and then use the regular synchronous call?

Third:

I feel like this is where I get things mixed up the most.

Since there sadly aren't any real examples that show how one would go about implementing the _async calls, I have started to piece together something for myself, but I'm quite sure this incomplete understanding is the cause of most of my issues. For inspiration I looked at the tests the hdf5-vol-async team has provided on GitHub.

I noticed that the serial tests do not feature any _async functions at all, instead seemingly relying on what the team calls implicit mode; why exactly is that? Also, sometimes a H5Pset_vol_async(async_fapl); call is used, which the VOL documentation mentions as a possible way to set up a VOL, but I cannot find in the hdf5-vol-async docs when exactly you need to call it yourself and when the VOL will handle it for you.

The parallel tests all feature the explicit mode, so what exactly is the difference in setting up a program for either explicit or implicit mode? The docs do mention implicit mode should only be used for testing, but then why is the implicit mode used exclusively with the serial tests while the parallel tests exclusively feature explicit mode?

Some help fully understanding how to actually use and implement hdf5-vol-async would be greatly appreciated.

Thanks

Hi @pphilippllucas,

Several things to discuss here, so I'll go in an order that I think makes sense. First, it may be helpful to review the HDF5 Virtual Object Layer (VOL) documentation, since it has information on what VOL connectors are and how they're generally used. Essentially, they are separate pieces of software from HDF5 which have the ability to completely remap how the HDF5 API works and interacts with some form of storage. The Async VOL is just one instance of a VOL connector which provides async I/O support for HDF5, but other connectors with support for async I/O exist, such as the DAOS VOL connector.

I had assumed the HDF5 async APIs and the Async VOL APIs to be one and the same. Is this not the case?

My understanding is that HDF5 generally knows about a VOL and can refer to it, but won't do anything without the VOL being present, and will simply pass the API calls to that VOL. So how is it that creating a file using H5Dcreate_async() works without loading hdf5-vol-async? Does it detect that the VOL is not present and then use the regular synchronous call?

When we were initially developing the DAOS VOL connector, we needed to implement the ability for an HDF5 application to express async I/O, ideally in a fairly natural-looking manner. So, the various async API functions were added. By default, the library uses what we call the "native" VOL connector, which is internal to the library and essentially just a wrapper over the library's code from before the VOL abstraction layer was introduced. When you make a call to an async API function using "regular" HDF5, the call is passed to the internal "native" VOL and just becomes synchronous, as you mention. When the call is passed to a VOL connector that has async I/O support, that connector is responsible for handing an opaque request object back to the library, which is inserted into the specified event set and is then managed by the VOL connector. The VOL connector is also responsible for handling the other aspects of async I/O, like what happens when waiting on an event set ID.
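
To make the explicit model concrete, here is a minimal sketch of the pattern (error checking omitted; the file and dataset names are placeholders). With plain HDF5 these calls all execute synchronously through the native VOL; with an async-capable connector loaded they are queued on the event set and run in the background:

hid_t es_id = H5EScreate();

/* Queue the operations; each call returns once the operation is queued */
hid_t  file_id = H5Fopen_async("file.h5", H5F_ACC_RDONLY, H5P_DEFAULT, es_id);
hid_t  dset_id = H5Dopen_async(file_id, "/X", H5P_DEFAULT, es_id);
double buf[20];
H5Dread_async(dset_id, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf, es_id);
H5Dclose_async(dset_id, es_id);
H5Fclose_async(file_id, es_id);

/* Block until everything queued on the event set has completed */
size_t  num_in_progress;
hbool_t op_failed;
H5ESwait(es_id, H5ES_WAIT_FOREVER, &num_in_progress, &op_failed);
H5ESclose(es_id);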

I currently encounter errors again if I set export HDF5_VOL_CONNECTOR="async under_vol=0;under_info={}". What exactly is this for and what does it do? Do I have to change anything for this to work properly, or can I just copy-paste this and export it as-is?

Generally, the "correct" way for applications to make use of a VOL connector would be to include some header file that the VOL connector distributes and make use of some function exposed by that header file to set up use of the VOL connector. For the Async VOL, see vol-async/src/h5_async_lib.h at develop · HDFGroup/vol-async · GitHub, for example. An application would call the function H5Pset_vol_async() on a File Access Property List ID that is created within an HDF5 application with a call similar to hid_t fapl_id = H5Pcreate(H5P_FILE_ACCESS);. Then, the FAPL ID would be passed to H5Fcreate() or H5Fopen() and access to that file would go through the Async VOL connector rather than the library's internal "native" VOL. This is listed as step 6 at Building with Spack — HDF5 Asynchronous I/O VOL Connector 0.1 documentation (and is actually out of date, it should be #include "h5_async_lib.h" rather than #include "h5_async_vol.h"), but is listed as optional since the environment variable approach is the preferred method, at least for the Async VOL.
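
In code, the header approach looks roughly like this (sketched from the header linked above; the file name is a placeholder):

#include "hdf5.h"
#include "h5_async_lib.h" /* distributed with the Async VOL connector */

/* Create a FAPL and route access through the Async VOL */
hid_t fapl_id = H5Pcreate(H5P_FILE_ACCESS);
H5Pset_vol_async(fapl_id);

/* Files created/opened with this FAPL now go through the Async VOL */
hid_t file_id = H5Fcreate("file.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl_id);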

To make it quick to test different VOL connectors, we added the environment variable approach, where you set two environment variables and the library essentially handles the previous details for you by setting a particular VOL connector on the library's default File Access Property List, so that H5Fcreate() or H5Fopen() operations implicitly use that VOL connector.

First, you set the HDF5_VOL_CONNECTOR environment variable to the name of the VOL connector to load, where each VOL connector specifies its own unique name (see Registered VOL Connectors | The HDF Group Support Site). You can also include an optional parameters string in this environment variable string, separated from the connector name, which is what the under_vol=0;under_info={} part is. This parameter string is specific to each VOL connector and should be documented by the VOL connector’s author. In this case, the string just tells the Async VOL connector to “stack on top of” the library’s internal connector (under_vol=0) and pass no VOL-specific information down to it (under_info={}). The Async VOL is what we call a “passthrough VOL”, meaning that it basically relies on the library’s internal VOL connector to handle most of the storage details, while the Async VOL connector itself handles the async I/O details.

Second, you set the HDF5_PLUGIN_PATH environment variable to the directory where the VOL connector library was installed. Depending on where a VOL connector is installed, this might not be needed, but it is still good practice to set it anyway. If this environment variable is not set, HDF5 may not be able to find the VOL connector library and will fail at library initialization time if HDF5_VOL_CONNECTOR is set.
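
Putting the two together, the environment setup would look something like this (the plugin path below is just a placeholder for wherever the Async VOL library actually got installed):

export HDF5_VOL_CONNECTOR="async under_vol=0;under_info={}"
export HDF5_PLUGIN_PATH=/path/to/vol-async/install/lib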

While the environment variable approach was mostly meant for testing, it has become the preferred method for many VOL connectors if only due to the convenience. It has limitations though, and VOL connectors may expose more powerful extensions to HDF5 through the header approach.

I noticed that the serial tests do not feature any _async functions at all, instead seemingly relying on what the team calls implicit mode; why exactly is that?

The Async VOL includes the ability to use an implicit form of async I/O to basically test it out without having to change an HDF5 application much or at all. By using that mode, you don’t have to use the _async APIs in HDF5 and the Async VOL can still handle certain operations asynchronously internally. I’m assuming that the implicit mode was used there simply because it makes reusing existing HDF5 test code with the Async VOL very simple. But this form of async I/O is very limited and the programmer has little control over it, so explicitly using the async I/O functions (explicit mode) is very much the preferred way of using the connector.

The docs do mention implicit mode should only be used for testing, but then why is the implicit mode used exclusively with the serial tests while the parallel tests exclusively feature explicit mode?

I can’t be certain, as I didn’t write the tests, but I’d guess this is due to the Async VOL coming from authors with an HPC I/O background, where parallel use cases are more interesting. I’d say it would make sense to test both implicit and explicit mode in both the serial and parallel cases, but it was likely that the parallel case just had higher priority.

Since there sadly aren't any real examples that show how one would go about implementing the _async calls, I have started to piece together something for myself, but I'm quite sure this incomplete understanding is the cause of most of my issues. For inspiration I looked at the tests the hdf5-vol-async team has provided on GitHub.

Yes, unfortunately as the library doesn’t really implement async I/O itself and relies on other connectors to handle it, we didn’t really create examples at the time as we should have. The best resource that I can point you to is hdf5/test/API/H5_api_async_test.c at develop · HDFGroup/hdf5 · GitHub, but this is also testing code so you sort of have to know what parts can be ignored. We should certainly create good examples for async I/O.


First up, thank you so much for all the help, clarifications and explanations you have provided.

That is basically what I have been looking for, especially on some of the less visible aspects of what HDF5 does for the user. This answered some important questions for me and helped me greatly in getting a better grasp of things.

So already, thank you very much, also for going through all my thoughts collected over the time working on this, even though they are probably all over the place.

but is listed as optional since the environment variable approach is the preferred method, at least for the Async VOL.

I see, so it would still be better and preferred to use the environment variables. Good, thank you very much for the clarification. The only reason I started to doubt using them was that they are mentioned under "Building from source code" in the docs, but as you stated it is the preferred and mostly recommended way of using vol-async.

I’m assuming that the implicit mode was used there simply …

Alright, then I was right to assume that it probably doesn't mean anything and was most likely done for convenience. Thank you for clarifying.

The best resource that I can point you to is hdf5/test/API/H5_api_async_test.c at develop · HDFGroup/hdf5 · GitHub

I had not found those tests yet, thank you very much, I will look into them.

Again, thank you very much for the detailed response to my lengthy collection of questions and thoughts. It is very much appreciated, as I think I can now grasp the way this has to work a bit more clearly and can start fixing up my implementation. Just getting these uncertainties out of the way helped a lot.

@jhenderson just one last question. Setting the environment variables and enabling vol-async that way, instead of including a specific header #include "h5_async_vol.h" and calling H5Pset_vol_async(), is kind of hidden in a sense, and requires me or a later user to pay attention that these steps have actually been done during different benchmark runs.

Now that you have clarified that HDF5 will default to the synchronous API calls in case vol-async isn't present or has its variables set incorrectly: is there an easy, straightforward way in HDF5 to check if a VOL is present, or to stop HDF5 from making the conversion and raise an error instead?

In the meantime I will read through the "HDF5 Virtual Object Layer (VOL)" docs again and check if my question has an answer there that I have missed.

The points under Registration might be what I'm looking for, but I'm not certain.

I could check the environment variables myself, though I want to know if there is a different, preferred way of doing so.

Thanks in advance.

Yes, this is one of the awkward issues with the environment variable approach. I wouldn’t say there’s a preferred way, but if you’d like a programmatic way, this should generally work:

#include <stdio.h>

#include "hdf5.h"

int
main(int argc, char **argv)
{
    hid_t default_vol_id = H5I_INVALID_HID;
    hid_t native_vol_id  = H5I_INVALID_HID;
    int   cmp            = -1;

    /* Get the ID of the VOL connector set on the default FAPL (H5P_DEFAULT) */
    if (H5Pget_vol_id(H5P_DEFAULT, &default_vol_id) < 0) {
        fprintf(stderr, "error\n");
        return -1;
    }

    /* Get the ID for the native VOL connector by its name */
    if ((native_vol_id = H5VLget_connector_id_by_name("native")) < 0) {
        fprintf(stderr, "error\n");
        return -1;
    }

    /* Compare the two VOL connectors */
    if (H5VLcmp_connector_cls(&cmp, default_vol_id, native_vol_id) < 0) {
        fprintf(stderr, "error\n");
        return -1;
    }

    if (0 == cmp)
        printf("Using the native VOL connector\n");
    else
        printf("NOT using the native VOL connector\n");

    H5VLclose(default_vol_id);
    H5VLclose(native_vol_id);

    return 0;
}

This checks the connector on the default File Access Property List by passing H5P_DEFAULT. If you have a particular File Access Property List you want to check, replace H5P_DEFAULT with your FAPL ID variable. If you want to check for the presence of a specific connector rather than the absence of the native connector, just replace “native” with the name of the connector you’re looking for (and update the printfs).
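
One more option, if you just want a quick yes/no: H5VLis_connector_registered_by_name() tests whether a connector with a given name is currently registered with the library, which should be the case once the environment variables have caused it to load. A minimal sketch, here checking for the Async VOL's registered name:

htri_t is_registered = H5VLis_connector_registered_by_name("async");

if (is_registered > 0)
    printf("Async VOL connector is registered\n");
else if (is_registered == 0)
    printf("Async VOL connector is NOT registered\n");
else
    fprintf(stderr, "error checking connector registration\n");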


OK, I have created a minimal example with which I am still seeing errors. The error only shows up when trying to create a file asynchronously; reading a regular HDF5 file (created with synchronous methods) using the _async read calls works without issue.

It seems that something within the event set cannot be closed, but this happens after H5ESwait() has already returned without issue. The relevant line is Finish waiting for async, num in progess: 0, failed: 0, status: 0, which is just a printf() of op_failed, num_in_progress and the status H5ESwait() returns.

Using the async VOL connector
Succeed with dset write
Succeed waiting for event set operations
Succeed with closing async_fapl
Succeed with closing dset_id
Succeed with closing file_id
HDF5-DIAG: Error detected in HDF5 (1.14.5) MPI-process 0:
  #000: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5ES.c line 634 in H5ESclose(): unable to decrement ref count on event set
    major: Event Set
    minor: Unable to decrement reference count
  #001: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5Iint.c line 1087 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #002: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5Iint.c line 1042 in H5I__dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #003: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5ESint.c line 194 in H5ES__close_cb(): unable to close event set
    major: Event Set
    minor: Close failed
  #004: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5ESint.c line 989 in H5ES__close(): can't close event set while unfinished operations are present (i.e. wait on event set first)
    major: Event Set
    minor: Can't close object
Error with closing es_id
Finish waiting for async, num in progess: 0, failed: 0, status: 0

Furthermore, the error stack then loops over the following and eventually deadlocks.

HDF5-DIAG: Error detected in HDF5 (1.14.5) MPI-process 0:
  #000: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5VL.c line 892 in H5VLfree_lib_state(): can't free library state
    major: Virtual Object Layer
    minor: Unable to release object
  #001: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5VLint.c line 2201 in H5VL_free_lib_state(): can't free API context state
    major: Virtual Object Layer
    minor: Unable to release object
  #002: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5CX.c line 1122 in H5CX_free_state(): can't decrement refcount on DCPL
    major: API Context
    minor: Unable to decrement reference count
  #003: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5Iint.c line 1010 in H5I_dec_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #004: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5Iint.c line 948 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
  [      ABT ERROR] async_dataset_close_fn H5VLfree_lib_state failed

This was run with mpiexec -n 1 ./a.out. I have sadly not figured out the event set error handling yet.

void create_hdf5_async(int argc, char **argv)
{
    hid_t async_fapl = 0;
    hid_t dxpl_id = 0;
    hid_t file_id = 0;
    hid_t filespace = 0;
    hid_t memspace = 0;
    hid_t dset_id = 0;
    hid_t es_id = 0;
    herr_t status = -1;
    hsize_t dims[1];
    hsize_t count[1];
    hsize_t offset[1];

    int size = 20;

    /*
     * Initialize MPI
     */

    int mpi_size, mpi_rank;
    MPI_Comm comm = MPI_COMM_WORLD;
    MPI_Info info = MPI_INFO_NULL;

    int mpi_thread_required = MPI_THREAD_MULTIPLE;
    int mpi_thread_provided = 0;
    /* Initialize MPI with threading support */
    MPI_Init_thread(&argc, &argv, mpi_thread_required, &mpi_thread_provided);

    MPI_Comm_size(comm, &mpi_size);
    MPI_Comm_rank(comm, &mpi_rank);

    es_id = H5EScreate();
    if (es_id < 0)
    {
        fprintf(stderr, "Error with first event set create\n");
    }

    /*
     * Set up file access property list with parallel I/O access
     */
    async_fapl = H5Pcreate(H5P_FILE_ACCESS);
    status = H5Pset_fapl_mpio(async_fapl, MPI_COMM_WORLD, MPI_INFO_NULL);

    check_vol_async_present();

    /*
     * Create a new file collectively.
     */
    file_id = H5Fcreate_async("data/datasets/test_dataset_hdf5-c_async.h5", H5F_ACC_TRUNC, H5P_DEFAULT, async_fapl, es_id);
    if (file_id < 0)
    {
        fprintf(stderr, "Error with file create\n");
    }

    dims[0] = size;
    filespace = H5Screate_simple(1, dims, NULL);
    memspace = H5Screate_simple(1, dims, NULL);

    /*
     * Create the dataset with default properties and close filespace.
     */
    dset_id = H5Dcreate_async(file_id, "/X", H5T_IEEE_F64LE, filespace, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT, es_id);
    if (dset_id < 0)
    {
        fprintf(stderr, "Error with dset create\n");
    }

    /*
     * Initialize data buffer
     */
    float wbuf[] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0};

    dxpl_id = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl_id, H5FD_MPIO_COLLECTIVE);

    offset[0] = 0;
    count[0] = size;

    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, NULL, count, NULL);

    status = H5Dwrite_async(dset_id, H5T_NATIVE_FLOAT, H5S_BLOCK, filespace, dxpl_id, wbuf, es_id);
    if (status < 0)
    {
        fprintf(stderr, "Error with dset write\n");
    }
    else
        fprintf(stderr, "Succeed with dset write\n");

    /*
     * Close/release resources.
     */
    size_t num_in_progress;
    hbool_t op_failed;
    status = H5ESwait(es_id, H5ES_WAIT_FOREVER, &num_in_progress, &op_failed);
    if (status < 0)
    {
        fprintf(stderr, "Error waiting for event set operations\n");
    }
    else
        fprintf(stderr, "Succeed waiting for event set operations\n");
    printf("Finish waiting for async, num in progess: %ld, failed: %d, status: %d \n", num_in_progress, op_failed, status);

    status = H5Pclose(async_fapl);
    if (status < 0)
    {
        fprintf(stderr, "Error with closing async_fapl\n");
    }
    else
        fprintf(stderr, "Succeed with closing async_fapl\n");

    status = H5Pclose(dxpl_id);

    status = H5Dclose_async(dset_id, es_id);
    if (status < 0)
    {
        fprintf(stderr, "Error with closing dset_id\n");
    }
    else
        fprintf(stderr, "Succeed with closing dset_id\n");

    status = H5Fclose_async(file_id, es_id);
    if (status < 0)
    {
        fprintf(stderr, "Error with closing file_id\n");
    }
    else
        fprintf(stderr, "Succeed with closing file_id\n");

    status = H5ESclose(es_id);
    if (status < 0)
    {
        fprintf(stderr, "Error with closing es_id\n");
    }
    else
        fprintf(stderr, "Succeed with closing es_id\n");
}

full stacktrace:

HDF5-DIAG: Error detected in HDF5 (1.14.5) MPI-process 0:
  #000: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5VL.c line 892 in H5VLfree_lib_state(): can't free library state
    major: Virtual Object Layer
    minor: Unable to release object
  #001: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5VLint.c line 2201 in H5VL_free_lib_state(): can't free API context state
    major: Virtual Object Layer
    minor: Unable to release object
  #002: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5CX.c line 1122 in H5CX_free_state(): can't decrement refcount on DCPL
    major: API Context
    minor: Unable to decrement reference count
  #003: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5Iint.c line 1010 in H5I_dec_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #004: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5Iint.c line 948 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
  [      ABT ERROR] async_dataset_close_fn H5VLfree_lib_state failed
HDF5-DIAG: Error detected in HDF5 (1.14.5) MPI-process 0:
  #000: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5P.c line 1468 in H5Pclose(): can't close
    major: Property lists
    minor: Unable to free object
  #001: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5Iint.c line 1087 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #002: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5Iint.c line 1042 in H5I__dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #003: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5Iint.c line 948 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
HDF5-DIAG: Error detected in HDF5 (1.14.5) MPI-process 0:
  #000: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5VL.c line 892 in H5VLfree_lib_state(): can't free library state
    major: Virtual Object Layer
    minor: Unable to release object
  #001: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5VLint.c line 2201 in H5VL_free_lib_state(): can't free API context state
    major: Virtual Object Layer
    minor: Unable to release object
  #002: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5CX.c line 1122 in H5CX_free_state(): can't decrement refcount on DCPL
    major: API Context
    minor: Unable to decrement reference count
  #003: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5Iint.c line 1010 in H5I_dec_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #004: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5Iint.c line 948 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)
  [      ABT ERROR] async_file_close_fn H5VLfree_lib_state failed
HDF5-DIAG: Error detected in HDF5 (1.14.5) MPI-process 0:
  #000: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5P.c line 1468 in H5Pclose(): can't close
    major: Property lists
    minor: Unable to free object
  #001: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5Iint.c line 1087 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #002: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5Iint.c line 1042 in H5I__dec_app_ref(): can't decrement ID ref count
    major: Object ID
    minor: Unable to decrement reference count
  #003: /tmp/dev/spack-stage/spack-stage-hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/spack-src/src/H5Iint.c line 948 in H5I__dec_ref(): can't locate ID
    major: Object ID
    minor: Unable to find ID information (already closed?)

I think I know what is happening: I need two separate H5ESwait() calls, correct?

As the first H5ESwait() will wait for

H5Fopen_async(es_id);
H5Dopen_async(es_id);
H5Dread_async(es_id);

but won't wait anymore for

H5Dclose_async(es_id);
H5Fclose_async(es_id);

which causes issues, as the event set tries to close immediately while both closes are still to be executed.

Making it so that the first call to H5ESwait() waits for the file operations to finish, and then a second H5ESwait() waits for the closing operations to finish, works. That does make a lot of sense, as both closes get added to the event set again.

I then completely misunderstood your remark from the other thread about:

Looking back at your program, you should make sure that the calls to H5Dclose_async() and H5Fclose_async() come after the call to H5ESwait(). Otherwise, H5ESclose() should fail if there are any active operations going on in the event set being closed.

I had not thought about the fact that those async close operations need their own wait as well.

Yes, or you could likely rewrite things so you only need one H5ESwait(), depending on what you want to do in your application. The way your example is written now, you would need an H5ESwait() call in between H5Fclose_async() and H5ESclose() to make sure the async close operations complete before closing the event set. If you were to move the two synchronous H5Pclose() calls to the very end of the program, past H5ESclose(), you should only need one H5ESwait() call right before H5ESclose(), after you've queued up all the async operations.
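
In outline, the single-wait ordering would look like this (reusing the handles from your example):

/* ... create the file/dataset and queue all async operations first ... */
H5Dwrite_async(dset_id, H5T_NATIVE_FLOAT, H5S_BLOCK, filespace, dxpl_id, wbuf, es_id);
H5Dclose_async(dset_id, es_id);
H5Fclose_async(file_id, es_id);

/* One wait, after everything (including the closes) has been queued */
H5ESwait(es_id, H5ES_WAIT_FOREVER, &num_in_progress, &op_failed);
H5ESclose(es_id);

/* Synchronous cleanup can come after the event set is closed */
H5Pclose(async_fapl);
H5Pclose(dxpl_id);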

I then completely misunderstood your remark from the other thread about:

Sorry, that was probably a typo and before was meant! Especially since you already had the calls after H5ESwait(). It really depends on how you want to organize the application, though. The main thing to keep in mind is that any time you call an async function, you'll need to wait on the event set ID you passed into it before you close the event set. It might be useful to get in the habit of checking for events before closing the event set. Something like:

size_t  count = 0;
size_t  num_in_progress;
hbool_t op_failed;

H5ESget_count(es_id, &count);
if (count > 0)
    H5ESwait(es_id, H5ES_WAIT_FOREVER, &num_in_progress, &op_failed);
H5ESclose(es_id);

Another question came up for me: I seem to be running into an Argobots deadlock on a case-by-case basis. I can go 10 or even 100 runs without encountering the ASYNC VOL ERROR, but when I do encounter it, I seem to deadlock hard.

My script does the following:

  1. mpiexec -n 4 ./a.out -b 5 -i 10 -f 1 -s 134217728 takes its inputs and runs the async benchmark (-b 4) on 4 ranks, for example
  2. creates a 1 GB file using async methods on the 4 ranks, each rank writing 1/4
  3. loops over reading the 1 GB file back, each rank reading 1/4, and records the time taken
  4. exits once done

Also, after each loop the script waits for all processes via MPI_Barrier(), just to be safe.

The error only happens during reading, never during the creation of the dataset. It can also happen if the dataset is created by one rank and read in full by each rank, which rules out me messing up the distribution logic; it just becomes much slower as a result. It is also much easier to force this to happen when increasing the number of iterations.

Error raised by the VOL:
[ASYNC VOL ERROR] get_n_running_task_in_queue_obj with ABT_mutex_lock

There are also some runs that show the following, which might be related but simply doesn't deadlock as hard: one MPI process at a time fails during its H5ESwait() call. It simply crashes and never finishes its execution, visible by all other ranks ending with Finish waiting for async, num in progess: 0, failed: 0, status: 0, which is just a printf of op_failed, num_in_progress and the status of H5ESwait().

[leucht:13064] *** Process received signal ***
[leucht:13064] Signal: Segmentation fault (11)
[leucht:13064] Signal code: Address not mapped (1)
[leucht:13064] Failing at address: 0x120
[leucht:13064] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7fb0678dd520]
[leucht:13064] [ 1] /home/dev/spack/opt/spack/linux-ubuntu22.04-x86_64_v4/gcc-11.4.0/hdf5-vol-async-1.7-vbevlbxizecl24eswhgtycejo4lbfvft/lib/libh5async.so(get_n_running_task_in_queue_obj+0x118)[0x7fb06483a758]
[leucht:13064] [ 2] /home/dev/spack/opt/spack/linux-ubuntu22.04-x86_64_v4/gcc-11.4.0/hdf5-vol-async-1.7-vbevlbxizecl24eswhgtycejo4lbfvft/lib/libh5async.so(+0x2306a)[0x7fb06485406a]
[leucht:13064] [ 3] /home/dev/spack/opt/spack/linux-ubuntu22.04-x86_64_v4/gcc-11.4.0/hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/lib/libhdf5.so.310(H5VL_request_wait+0x40)[0x7fb067e261b0]
[leucht:13064] [ 4] /home/dev/spack/opt/spack/linux-ubuntu22.04-x86_64_v4/gcc-11.4.0/hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/lib/libhdf5.so.310(+0x12ac9f)[0x7fb067bf4c9f]
[leucht:13064] [ 5] /home/dev/spack/opt/spack/linux-ubuntu22.04-x86_64_v4/gcc-11.4.0/hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/lib/libhdf5.so.310(H5ES__list_iterate+0x31)[0x7fb067bf5b41]
[leucht:13064] [ 6] /home/dev/spack/opt/spack/linux-ubuntu22.04-x86_64_v4/gcc-11.4.0/hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/lib/libhdf5.so.310(H5ES__wait+0x54)[0x7fb067bf5904]
[leucht:13064] [ 7] /home/dev/spack/opt/spack/linux-ubuntu22.04-x86_64_v4/gcc-11.4.0/hdf5-1.14.5-5tlcfqadprf4tpamuxv4tvn67bcastlj/lib/libhdf5.so.310(H5ESwait+0xc9)[0x7fb067bf35a9]
[leucht:13064] [ 8] ./a.out(+0x5216)[0x5623657d2216]
[leucht:13064] [ 9] ./a.out(+0x5c88)[0x5623657d2c88]
[leucht:13064] [10] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7fb0678c4d90]
[leucht:13064] [11] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7fb0678c4e40]
[leucht:13064] [12] ./a.out(+0x2aa5)[0x5623657cfaa5]
[leucht:13064] *** End of error message ***
Finish waiting for async, num in progess: 0, failed: 0, status: 0 
Finish waiting for async, num in progess: 0, failed: 0, status: 0 
Finish waiting for async, num in progess: 0, failed: 0, status: 0 
--------------------------------------------------------------------------
prterun noticed that process rank 0 with PID 13064 on node leucht exited on
signal 11 (Segmentation fault).
--------------------------------------------------------------------------

What version of Argobots are you using, and what version of the async VOL?

I would also report this as an issue on the vol-async GitHub repository.

Built from source with spack. The installed spack package version for hdf5-vol-async is 1.7, and Argobots in spack is at version 1.2.

Will do; I wasn't sure if this was my fault.

Can you build it with the develop version of the VOL, or at least 1.8.1? Develop is very stable; we were waiting until HDF5 2.0 was released for a newer release of the VOL, but it can be pushed out earlier.

Will try. I'm currently stuck with spack, as the DKRZ somewhat requires it for this project, so it might take me some time to verify. I will file the issue now and then try to build develop, or at least 1.8.1.

Spack supports building the "develop" version:
https://packages.spack.io/package.html?name=hdf5-vol-async

Spack also supports building the “main” version of argobots. The current testing of the VOL uses the “main” version.

Oh right, I missed that; will install now. Sorry, I wanted to get the information to you as quickly as possible and completely missed that that was even an option.

@brtnfld OK, I have swapped it to develop now, recompiled, and tried to be careful not to mess anything up.

I can't seem to reproduce it right now, but that might be due to develop being much slower in comparison to 1.7, so I can't run through as many iterations to force it, which I guess is a good thing. I will keep you up to date if I get a run to fail.

Or rather, performance is now highly variable, which makes sense in an async context, because some runs are on par with 1.7 while others are far behind.

vol-async 1.7
{"hdf5-c-async-read":{"0":0.051528,"1":0.051664,"2":0.047780,"3":0.056604,"4":0.044307,"5":0.043161,"6":0.049353,"7":0.045077,"8":0.042852,"9":0.049883}}

vol-async develop
{"hdf5-c-async-read":{"0":2.045801,"1":0.053664,"2":0.050616,"3":2.047102,"4":0.052851,"5":0.044249,"6":0.048339,"7":0.055475,"8":2.044778,"9":2.047081}}

My initial issue does seem fixed by the version change, though. Thank you very much for the reminder. Since my results seem valid regardless of the slowdown, I can work with this for now and present them as such.

Thank you very much for the swift answer and help.