H5Fclose hangs when using parallel HDF5

Hi,

We are trying to use parallel HDF5 to write data to a file.

The execution seems to go well, but H5Fclose hangs. That is, the message
"All parallel hdf5 content written, closing file...\n"
is printed, but
"Done."
never appears.

Any suggestions or amendments to the following code?
Thanks

Bruno Magalhaes, Blue Brain Project, EPFL, Lausanne CH [+41 2169 31805]

//------------------- BEGINNING OF SOURCE CODE --------------------

#include <stdio.h>
#include <cstdlib>

#include <mpi.h>

#define H5_USE_16_API 1
#include "hdf5.h"

#define NUMVALUES 19

using namespace std;

int main(int argc, char** argv) {

     MPI_Init(&argc, &argv);

     int mpiRank=-1, mpiSize=-1;
     MPI_Comm_rank(MPI_COMM_WORLD, &mpiRank);
     MPI_Comm_size(MPI_COMM_WORLD, &mpiSize);

     if (mpiRank==0) printf("Writing header...\n");

     int numOfNeurons = 100;

     hid_t file_id = 0;
     herr_t status;
     hsize_t dims[2];
     char filename[1024] = "/bgscratch/bmagalha/delete.h5";

     //Set up file access property list with parallel I/O access
     hid_t plist_id = H5Pcreate(H5P_FILE_ACCESS);
     H5Pset_fapl_mpio(plist_id, MPI_COMM_WORLD, MPI_INFO_NULL);

     //Create a new file collectively and release property list identifier.
     file_id = H5Fcreate(filename, H5F_ACC_TRUNC, H5P_DEFAULT, plist_id);
     H5Pclose(plist_id);

     if (mpiRank == 0) {
         //Initialize the attribute data.
         int info_version_data = 3;
         int info_numberOfFiles_data = 1;

         hsize_t info_dims = 1;
         hid_t filespace_h = H5Screate_simple(0,&info_dims, NULL);
         hid_t dset_id_info = H5Dcreate(file_id, "/info", H5T_NATIVE_FLOAT, filespace_h, H5P_DEFAULT);

         //Create the data space for the attribute.
         hid_t info_dataspace_id = H5Screate_simple(1, &info_dims, NULL);

         //Create a dataset attribute and count of files.
         hid_t info_version_id = H5Acreate(dset_id_info, "version", H5T_STD_I32BE, info_dataspace_id, H5P_DEFAULT);
         hid_t info_numberOfFiles_id = H5Acreate(dset_id_info, "numberOfFiles", H5T_STD_I32BE, info_dataspace_id, H5P_DEFAULT);

         //Write the attribute and files Countdata.
         H5Awrite(info_version_id, H5T_NATIVE_INT, &info_version_data);
         H5Awrite(info_numberOfFiles_id, H5T_NATIVE_INT, &info_numberOfFiles_data);

          //Create property list for independent dataset write (default).
          plist_id = H5Pcreate(H5P_DATASET_XFER);
          H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_INDEPENDENT);

          //To write data collectively, use instead:
          //plist_id = H5Pcreate(H5P_DATASET_XFER);
          //H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE);

         //Close the attributes, dataspace and dataset.
         status = H5Aclose(info_version_id);
         status = H5Aclose(info_numberOfFiles_id);
         status = H5Dclose(dset_id_info);
         status = H5Sclose(info_dataspace_id);
     }

     if (mpiRank==0) printf("Writing body...\n");

     char name[512];
     float *allSynapsesSendBuff;
     for (int i = 0; i < numOfNeurons; i++) {

          //Create the dataspace dimensions and the send buffer.
          dims[0] = 2000;
          dims[1] = NUMVALUES;
          allSynapsesSendBuff = new float[NUMVALUES * dims[0]];
          hid_t filespace = H5Screate_simple(2, dims, NULL);

          sprintf(name, "a.%d.%d", mpiRank, i);

          //Create dataset with default properties.
          hid_t dset_id = H5Dcreate(file_id, name, H5T_NATIVE_FLOAT, filespace, H5P_DEFAULT);

          //Create property list for independent dataset write.
          hid_t plist_id = H5Pcreate(H5P_DATASET_XFER);
          H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_INDEPENDENT);

          H5Dwrite(dset_id, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL, plist_id, allSynapsesSendBuff);

          //End access to the dataset, release resources, and free the send buffer.
          status = H5Dclose(dset_id);
          status = H5Sclose(filespace);
          status = H5Pclose(plist_id);
          delete[] allSynapsesSendBuff;
     }

     MPI_Barrier(MPI_COMM_WORLD);
     if (mpiRank==0) printf("All parallel hdf5 content written, closing file...\n");
     fflush(stdout);

     // Terminates access to the file.
     status = H5Fclose(file_id);
     MPI_Barrier(MPI_COMM_WORLD);

     if (mpiRank==0) printf("Done.\n");
     MPI_Finalize();
     return 0;
}

//------------------- END OF SOURCE CODE --------------------

Hi Bruno,

On Sep 26, 2011, at 7:11 AM, Bruno Magalhaes wrote:

We are trying to use parallel HDF5 to write data to a file. The execution
seems to go well, but H5Fclose hangs: the message
"All parallel hdf5 content written, closing file...\n"
is printed, but "Done." never appears. Any suggestions or amendments to the
code above?

  Your problem is that you are modifying metadata independently (in the "if (mpiRank == 0)" block), which is not allowed (currently). You'll need to change that so that all of the H5A/H5D calls in that block are executed collectively, so that all processes have the same "view" of the file's state. (We are working to change this, but it will be a while before it gets implemented.)
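
  For illustration, here is a minimal sketch of that header block with the rank-0 guard removed, so that every process makes the same metadata calls. It reuses the identifiers from your code (and the 1.6 API selected by H5_USE_16_API); treat it as a sketch rather than a drop-in replacement:

     //Executed by ALL ranks: object creation and attribute writes modify
     //file metadata and must be performed collectively.
     int info_version_data = 3;
     int info_numberOfFiles_data = 1;
     hsize_t info_dims = 1;

     hid_t filespace_h = H5Screate_simple(1, &info_dims, NULL);
     hid_t dset_id_info = H5Dcreate(file_id, "/info", H5T_NATIVE_FLOAT, filespace_h, H5P_DEFAULT);

     hid_t info_dataspace_id = H5Screate_simple(1, &info_dims, NULL);
     hid_t info_version_id = H5Acreate(dset_id_info, "version", H5T_STD_I32BE, info_dataspace_id, H5P_DEFAULT);
     hid_t info_numberOfFiles_id = H5Acreate(dset_id_info, "numberOfFiles", H5T_STD_I32BE, info_dataspace_id, H5P_DEFAULT);

     //Every rank passes the same attribute values.
     H5Awrite(info_version_id, H5T_NATIVE_INT, &info_version_data);
     H5Awrite(info_numberOfFiles_id, H5T_NATIVE_INT, &info_numberOfFiles_data);

     H5Aclose(info_version_id);
     H5Aclose(info_numberOfFiles_id);
     H5Sclose(info_dataspace_id);
     H5Dclose(dset_id_info);
     H5Sclose(filespace_h);

  The raw dataset data can still be written independently or collectively afterwards; it is only the metadata-modifying calls (creates, attribute writes, closes) that every rank has to issue identically.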

  Quincey


I have a similar problem, with H5Dread hanging during collective I/O. Each MPI
process tries to read a fraction of the dataset. If the dataset size is a
multiple of the number of processes (so every process reads the same amount of
data), all is fine; otherwise the code below hangs in H5Dread.

For collective I/O, is it required that all processes read the same amount
of data?

Thanks.

//------------------- BEGINNING OF SOURCE CODE --------------------

    MPI_Comm comm = MPI_COMM_WORLD;
    MPI_Info info = MPI_INFO_NULL;

    int mpi_size, mpi_rank;
    MPI_Comm_size(comm, &mpi_size);
    MPI_Comm_rank(comm, &mpi_rank);

    herr_t status;

    hid_t plist_id = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(plist_id, comm, info);
    hid_t file_id = H5Fopen(filename.c_str(), H5F_ACC_RDWR, plist_id);
    CHECK_ERROR(file_id,"Error opening hdf5 file.");
    H5Pclose(plist_id);
    hid_t dataset_id;
#if H5Dopen_vers == 2
    dataset_id = H5Dopen2(file_id, name.c_str(), H5P_DEFAULT);
#else
    dataset_id = H5Dopen(file_id, name.c_str());
#endif
    CHECK_ERROR(dataset_id,"Error opening dataset in file.");

    hid_t space_id = H5Dget_space(dataset_id);
    hsize_t dims[2];
    H5Sget_simple_extent_dims(space_id, dims, NULL);

    hsize_t count[2];
    hsize_t offset[2];

    hsize_t item_cnt = dims[0]/mpi_size+(dims[0]%mpi_size==0 ? 0 : 1);
    hsize_t cnt = (mpi_rank<mpi_size-1 ? item_cnt : dims[0]-item_cnt*(mpi_size-1));

    count[0] = cnt;
    count[1] = dims[1];
    offset[0] = mpi_rank*item_cnt;
    offset[1] = 0;

    hid_t memspace_id = H5Screate_simple(2,count,NULL);

    H5Sselect_hyperslab(space_id, H5S_SELECT_SET, offset, NULL, count, NULL);

    plist_id = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE);
    status = H5Dread(dataset_id, get_hdf5_type<T>(), memspace_id, space_id, plist_id, ptr);

//------------------- END OF SOURCE CODE --------------------


Hi Marius,

On Oct 12, 2011, at 2:30 AM, Marius Muja wrote:

Each MPI process tries to read a fraction of the dataset. If the dataset size
is a multiple of the number of processes, all is fine; otherwise the code hangs
in H5Dread. Is it required for collective I/O that all processes read the same
amount of data?

  No, and your code snippet looks reasonable. What version of HDF5 are you using?
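
  One related pattern that can matter with uneven decompositions (a hypothetical sketch, not tied to your exact code): when the transfer property list is collective, every rank still has to call H5Dread, and a rank that ends up with nothing to read can participate with an empty selection:

    //Hypothetical sketch for a rank with no rows to read: select nothing in
    //both the memory and file dataspaces so the element counts match (0 == 0),
    //then join the collective H5Dread anyway. space_id, dataset_id and
    //plist_id are assumed to be set up as in your snippet; the float type is
    //just an example.
    hsize_t one = 1;
    hid_t memspace_id = H5Screate_simple(1, &one, NULL);
    H5Sselect_none(memspace_id);   //no elements selected in memory
    H5Sselect_none(space_id);      //no elements selected in the file
    float dummy = 0.0f;
    H5Dread(dataset_id, H5T_NATIVE_FLOAT, memspace_id, space_id, plist_id, &dummy);
    H5Sclose(memspace_id);

  With that in place, ranks reading different numbers of rows (including zero) can all join the same collective call.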

  Quincey


I'm using version 1.8.4 (the one that comes with Ubuntu Natty:
libhdf5-openmpi-1.8.4). I'm no longer able to reproduce the problem,
so for now the issue is solved.

Thanks,
Marius

On Thu, Oct 13, 2011 at 8:16 AM, Quincey Koziol <koziol@hdfgroup.org> wrote:

No, and your code snippet looks reasonable. What version of HDF5 are you using?