I am attempting to install HDF5 (latest stable release, v1.14.4-3) with --enable-parallel, but the testpar/t_mpi test in ‘make check’ fails.
This is with Intel oneAPI v2021.9.0 on an AWS instance. My configure line is:

```bash
--with-zlib=$ZLIB --enable-fortran --enable-hl --enable-parallel --disable-shared
```
where ZLIB is where I installed the zlib compression library.
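For reference, the full invocation looks roughly like this (the CC=mpiicc/FC=mpiifort wrapper names and the zlib path are illustrative assumptions, since the post only shows the flags; parallel HDF5 must be configured with MPI compiler wrappers):

```bash
# $ZLIB points at the local zlib install (path illustrative)
export ZLIB=$HOME/local/zlib

# mpiicc/mpiifort are Intel MPI's compiler wrappers for icc/ifort;
# assumed here, since the post does not show which compilers were used
CC=mpiicc FC=mpiifort ./configure --with-zlib=$ZLIB \
    --enable-fortran --enable-hl --enable-parallel --disable-shared
```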
I currently don’t have the option of updating the Intel oneAPI version.
Would building a more recent version of MPI make any difference? I have read that ROMIO in an older version of Intel’s MPI may be to blame. But if I install a newer MPI in a local directory (I’m unable to override system-level installs), does that matter?
Thanks for that info. Please excuse my possible blunders, but I am a scientist trying to build an application that can greatly benefit from HDF5’s parallel I/O. My knowledge of “system administration” or software stack installation is seriously low.
For clarification: if I build MPICH v4.3.0 from scratch using the Intel icc/ifort compilers, then point CPPFLAGS and LDFLAGS at the resulting include and lib folders when I build HDF5, does that achieve an effective replacement for the pre-installed oneAPI MPI?
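Concretely, what I have in mind is something like this (paths hypothetical):

```bash
# Hypothetical local MPICH install prefix
MPICH_DIR=$HOME/local/mpich-4.3.0

# Point HDF5's configure at the local MPICH headers and libraries
export CPPFLAGS="-I$MPICH_DIR/include"
export LDFLAGS="-L$MPICH_DIR/lib"
```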
Yes, you can replace Intel’s default MPI with MPICH built from scratch using the oneAPI compiler. Here’s a breakdown of how to do it and key considerations:
Why Replace Intel MPI?
Flexibility and Customization: Building MPICH from scratch allows you to fine-tune it for your specific hardware and workload. You can enable or disable features, optimize for particular interconnects, and potentially achieve better performance.
Specific Features: MPICH might offer features or versions that are not yet available or prioritized in Intel’s MPI implementation.
Open Source: MPICH is open-source, which can be a preference for some users.
Steps to Replace Intel MPI with MPICH
Install oneAPI Base Toolkit: Ensure you have the Intel oneAPI Base Toolkit installed, as it provides the necessary compilers (icx, icpx) and libraries.
Download MPICH Source Code: Download the latest MPICH source code from the official website or a trusted mirror.
Configure MPICH: This is a crucial step. You’ll need to configure MPICH to use the oneAPI compilers. Here’s a general example:

```bash
./configure --prefix=/path/to/your/mpich/installation \
    CC=icx CXX=icpx FC=ifort \
    --with-device=ch4:ofi
```
Replace /path/to/your/mpich/installation with your desired installation directory.
CC=icx, CXX=icpx, and FC=ifort tell MPICH to use the oneAPI C, C++, and Fortran compilers, respectively.
--with-device=ch4:ofi enables support for the OpenFabrics Interface (OFI), which is commonly used for high-performance interconnects. Adjust this based on your network.
Build and Install MPICH:

```bash
make
make install
```
Set Environment Variables: After installation, set the necessary environment variables to point to your MPICH installation:

```bash
export PATH=/path/to/your/mpich/installation/bin:$PATH
export LD_LIBRARY_PATH=/path/to/your/mpich/installation/lib:$LD_LIBRARY_PATH
```
Test Your Installation: Compile and run a simple MPI program to verify that MPICH is working correctly (see the sketch below).
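As a minimal sketch of that verification step (assuming the install prefix used above; cpi.c ships in the examples directory of the MPICH source tree):

```bash
# Confirm the shell now resolves to the freshly built MPICH
which mpicc          # expect /path/to/your/mpich/installation/bin/mpicc
mpichversion         # prints the version of the MPICH you just built

# Build and run the cpi example bundled with the MPICH sources
cd /path/to/mpich-4.3.0/examples
mpicc -o cpi cpi.c
mpirun -np 4 ./cpi   # should print an approximation of pi computed on 4 ranks
```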
Important Considerations
Compatibility: Ensure that the MPICH version you build is compatible with the other libraries and tools you are using.
Performance: While MPICH can be highly optimized, achieving the best performance might require careful configuration and tuning.
Support: If you encounter issues, you’ll need to rely on the MPICH community for support, as Intel might not provide direct support for MPICH.
Intel MPI Features: Be aware that some features specific to Intel MPI might not be available in MPICH.
Additional Tips
Compiler Flags: You might need to add specific compiler flags during the configuration step to optimize for your target architecture (an example follows this list).
Documentation: Refer to the official MPICH documentation for detailed instructions and advanced configuration options.
Community Support: The MPICH community is a valuable resource for troubleshooting and getting help with your installation.
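For example, a hedged sketch using Intel’s -xHost flag to target the build host’s CPU (treat the exact flags as assumptions to verify for your architecture, not tested settings):

```bash
# Illustrative only: architecture tuning via Intel compiler flags
./configure --prefix=/path/to/your/mpich/installation \
    CC=icx CXX=icpx FC=ifort \
    CFLAGS="-O2 -xHost" CXXFLAGS="-O2 -xHost" FCFLAGS="-O2 -xHost" \
    --with-device=ch4:ofi
```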
By following these steps and considering the important points, you can successfully replace Intel’s default MPI with MPICH built from scratch using the oneAPI compiler.
What is the actual error that the test returns? This most likely indicates an MPI set-up/configuration issue since this test is, for the most part, checking MPI functionality.
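If the failure isn’t obvious from the console output, it helps to capture a full log of just the parallel tests; a minimal sketch, assuming the Autotools build tree:

```bash
# Re-run only the parallel tests and keep a complete log for diagnosis
cd testpar
make check 2>&1 | tee testpar-check.log
```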
I successfully installed the newer MPICH and then built HDF5 against it.
Now when I switch to the testpar directory and run ‘make check’, the t_mpi test succeeds! But the next one, t_bigio, fails with this (abbreviated) error message:
```
Single Rank Independent I/O
HDF5-DIAG: Error detected in HDF5 (1.14.4-3) MPI-process 0:
  #000: H5D.c line 1371 in H5Dwrite(): can't synchronously write data
    major: Dataset
    minor: Write failed
```
I don’t understand what you are telling me. Is it broken? Then why is it part of the test suite? I see the same line in my CMakeTests.cmake file. Should it be commented out? How do I resolve this so that ‘make check’ inside the testpar subdirectory works?
The test isn’t broken, but for certain combinations of platforms and MPI implementations/versions, the test may hang. In general, we have no issues with recent MPI versions on the platforms we support, but there are various known issues on platforms we don’t support, or with specific versions of MPICH or Open MPI, that we either document or work around.
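One way to tell a hang from a hard failure, as a rough sketch (the rank count and timeout are arbitrary assumptions), is to run the failing test directly under a timeout:

```bash
# Run the failing test on its own; a timeout distinguishes a hang
# from an outright error (300 s and 2 ranks are arbitrary choices)
cd testpar
timeout 300 mpirun -np 2 ./t_bigio
echo "exit status: $?"   # 124 from GNU timeout indicates a hang
```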
OK, thanks for the replies. I think (and hope) that I’m now able to build the remaining software stack (netCDF) and accomplish the end goal. I appreciate the help that got it to work!