Hi, HDF users,
I am trying to write a 1D array from several processes in an unstructured way:
each proc has a subset of the array to write, but its elements are neither
contiguous nor sorted. Each proc knows the positions where it should
write each element.
With some help from the thread
http://hdf-forum.184993.n3.nabble.com/HDF5-Parallel-write-selection-using-hyperslabs-slow-write-tp3935966.html
I tried to implement it.
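(To make the setup concrete, here is an illustrative sketch of what each proc
holds and how the `coords` array used further down is built from it. The names
and values are made up; the coordinates are {row, column} pairs because the
file space created below has rank 2.)
// Illustrative only (made-up values): each proc has its values and the
// global positions where they must land in the 1D array.
unsigned int local_nElements = 4;
unsigned int data[4]      = { 10, 11, 12, 13 };   // values to write
hsize_t      positions[4] = { 42, 7, 1900, 13 };  // unsorted, non-contiguous
// Build the flat coordinate list expected by H5Sselect_elements:
// one {row, column} pair per element; the row is always 0 because the
// file space has shape {1, global_nElements}.
hsize_t *coords = new hsize_t[ 2 * local_nElements ];
for( unsigned int i = 0; i < local_nElements; i++ ) {
    coords[2*i    ] = 0;
    coords[2*i + 1] = positions[i];
}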
First, the master proc (only that one) creates the file:
// create file
hid_t fid = H5Fcreate( name.c_str(), H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT );
// prepare file space
hsize_t dims[2]     = {1, global_nElements};
hsize_t max_dims[2] = {H5S_UNLIMITED, global_nElements}; // not really needed, but for future use
hid_t file_space = H5Screate_simple( 2, dims, max_dims );
// prepare dataset creation properties (chunking is required because of the unlimited dimension)
hid_t plist = H5Pcreate( H5P_DATASET_CREATE );
H5Pset_layout( plist, H5D_CHUNKED );
hsize_t chunk_dims[2] = {1, global_nElements};
H5Pset_chunk( plist, 2, chunk_dims );
// create dataset
hid_t did = H5Dcreate( fid, "Id", H5T_NATIVE_UINT, file_space, H5P_DEFAULT, plist, H5P_DEFAULT );
// close everything
H5Dclose( did );
H5Pclose( plist );
H5Sclose( file_space );
H5Fclose( fid );
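(For completeness, a minimal sketch of how the creation phase and the parallel
write phase could be sequenced; this is not my exact code. It assumes
MPI_COMM_WORLD and adds a barrier so that no proc opens the file before rank 0
has closed it.)
// Sketch: only the master proc runs the creation code above, then everyone
// synchronizes before the parallel open/write shown next.
int rank;
MPI_Comm_rank( MPI_COMM_WORLD, &rank );
if( rank == 0 ) {
    // ... H5Fcreate / H5Dcreate / H5Fclose as above ...
}
MPI_Barrier( MPI_COMM_WORLD );  // make sure the file exists before anyone opens it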
Then, all procs open the file and write their subset:
// define MPI file access
hid_t file_access = H5Pcreate( H5P_FILE_ACCESS );
H5Pset_fapl_mpio( file_access, MPI_COMM_WORLD, MPI_INFO_NULL );
// define MPI transfer mode (left at the default, i.e. independent, for now)
hid_t transfer = H5Pcreate( H5P_DATASET_XFER );
// Open the file
hid_t fid = H5Fopen( name.c_str(), H5F_ACC_RDWR, file_access );
// Open the existing dataset
hid_t did = H5Dopen( fid, dataset.c_str(), H5P_DEFAULT );
// Get the file space
hid_t file_space = H5Dget_space( did );
// Define the memory space for this proc
hsize_t count[2] = {1, (hsize_t) local_nElements};
hid_t mem_space = H5Screate_simple( 2, count, NULL );
// Select the elements for this particular proc (the `coords` array has been
// filled before; it holds one {row, column} pair per element, i.e.
// 2*local_nElements values, because the file space has rank 2)
H5Sselect_elements( file_space, H5S_SELECT_SET, local_nElements, coords );
// Write the previously generated `data` array
H5Dwrite( did, H5T_NATIVE_UINT, mem_space, file_space, transfer, data );
// Close everything
H5Sclose( mem_space );
H5Sclose( file_space );
H5Dclose( did );
H5Pclose( transfer );
H5Pclose( file_access );
H5Fclose( fid );
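(In case it is relevant, a small sanity check can be placed just before the
H5Dwrite call; this is only a sketch, confirming that the memory and file
selections contain the same number of points, which H5Dwrite requires.)
// Sketch: both selections must contain the same number of elements,
// otherwise H5Dwrite fails.
hssize_t n_mem  = H5Sget_select_npoints( mem_space );
hssize_t n_file = H5Sget_select_npoints( file_space );
if( n_mem != n_file )
    fprintf( stderr, "selection mismatch: %lld vs %lld\n",
             (long long) n_mem, (long long) n_file );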
This version works but is VERY SLOW: more than 10 times slower than writing
with 1 proc without H5Sselect_elements.
Is this to be expected? Is there a way to make it faster?
Using H5Pget_mpio_actual_io_mode, I realized that the writes were not
collective, so I tried to force collective transfers with the following:
H5Pset_dxpl_mpio( transfer, H5FD_MPIO_COLLECTIVE);
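(For reference, a sketch of how the transfer property list and the
H5Pget_mpio_actual_io_mode check fit around the write; the check only gives a
meaningful answer after H5Dwrite has returned.)
// Sketch: force collective transfer, write, then ask what HDF5 actually did.
hid_t transfer = H5Pcreate( H5P_DATASET_XFER );
H5Pset_dxpl_mpio( transfer, H5FD_MPIO_COLLECTIVE );
// ... H5Dwrite( did, H5T_NATIVE_UINT, mem_space, file_space, transfer, data ) ...
H5D_mpio_actual_io_mode_t io_mode;
H5Pget_mpio_actual_io_mode( transfer, &io_mode );
if( io_mode == H5D_MPIO_NO_COLLECTIVE )
    printf( "the write was not done collectively\n" );
H5Pclose( transfer );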
But unfortunately, I get tons of the following error:
HDF5-DIAG: Error detected in HDF5 (1.8.14) MPI-process 0:
  #000: H5Dio.c line 271 in H5Dwrite(): can't prepare for writing data
    major: Dataset
    minor: Write failed
  #001: H5Dio.c line 352 in H5D__pre_write(): can't write data
    major: Dataset
    minor: Write failed
  #002: H5Dio.c line 788 in H5D__write(): can't write data
    major: Dataset
    minor: Write failed
  #003: H5Dmpio.c line 757 in H5D__chunk_collective_write(): write error
    major: Dataspace
    minor: Write failed
  #004: H5Dmpio.c line 685 in H5D__chunk_collective_io(): couldn't finish linked chunk MPI-IO
    major: Low-level I/O
    minor: Can't get value
  #005: H5Dmpio.c line 881 in H5D__link_chunk_collective_io(): couldn't finish shared collective MPI-IO
    major: Data storage
    minor: Can't get value
  #006: H5Dmpio.c line 1401 in H5D__inter_collective_io(): couldn't finish collective MPI-IO
    major: Low-level I/O
    minor: Can't get value
  #007: H5Dmpio.c line 1445 in H5D__final_collective_io(): optimized write failed
    major: Dataset
    minor: Write failed
  #008: H5Dmpio.c line 297 in H5D__mpio_select_write(): can't finish collective parallel write
    major: Low-level I/O
    minor: Write failed
  #009: H5Fio.c line 171 in H5F_block_write(): write through metadata accumulator failed
    major: Low-level I/O
    minor: Write failed
  #010: H5Faccum.c line 825 in H5F__accum_write(): file write failed
    major: Low-level I/O
    minor: Write failed
  #011: H5FDint.c line 246 in H5FD_write(): driver write request failed
    major: Virtual File Layer
    minor: Write failed
  #012: H5FDmpio.c line 1802 in H5FD_mpio_write(): MPI_File_set_view failed
    major: Internal error (too specific to document in detail)
    minor: Some MPI function failed
  #013: H5FDmpio.c line 1802 in H5FD_mpio_write(): MPI_ERR_ARG: invalid argument of some other kind
    major: Internal error (too specific to document in detail)
    minor: MPI Error String
The same happens with both HDF5 1.8.14 and 1.8.15.
Any ideas how to fix this?
Thank you
Fred