CMake parallel HDF5 config scripts reference hdf5::hdf5-shared

Hi,

Apologies if this is not the right place to post this question. I am trying to build the OP2 library with HDF5 support on a PPC64le system. I have tried both gcc-openmpi and xl-spectrum_mpi, and I hit the same problem with each: the applications do not link against HDF5. I am insisting on using cmake for these builds due to my largely religious belief in reproducibility and unwavering trust in modularity and creating long-term fixes, sorry!

I use Lmod to switch between variants, and the cmake version is 3.12.

This is how I compile parallel HDF5:

cmake \
    -DCMAKE_INSTALL_PREFIX=${INSTALL_PATH} \
    -DBUILD_STATIC_EXECS:BOOL=OFF \
    -DHDF5_BUILD_CPP_LIB:BOOL=OFF \
    -DHDF5_ENABLE_PARALLEL:BOOL=ON \
    ../hdf5-${VERSION} && make -j20 && make install

HDF5 compilation goes smoothly, but I am not sure whether setting BUILD_STATIC_EXECS=OFF is necessary.

The OP2 cmake config may be a bit outdated, but I would like to help them fix it, if at all possible. The library needs to be compiled first, and then the applications have to be linked against it. I don’t think there’s anything unusual in my cmake invocation. I run:

cmake \
    -DCMAKE_INSTALL_PREFIX=${INSTALLDIR} \
    -DOP2_WITH_PARMETIS=ON \
    -DPARMETIS_DIR=${INSTALLDIR} \
    -DPARMETIS_TEST_RUNS=1 \
    -DOP2_WITH_PTSCOTCH=OFF \
    -DCUDA_NVCC_FLAGS='-arch=sm_60;-Xptxas;-dlcm=ca;-Xptxas=-v' \
    -DINSTALLATION_APPS_DIR='' \
    .. && make -j10 && make install

And everything goes fine. Side note, I am cheating on PARMETIS_TEST_RUNS. I never got cmake to detect it properly.

At this stage my superficial knowledge of cmake makes it difficult to understand what’s happening. I noticed that in ${INSTALL_DIR}/shared/cmake there’s something like a configuration file. This gets picked up when I compile the applications, and I get the following error:

Linking CXX executable airfoil_sp_mpi_cuda
cd /home/rrs59-sxa03/build/op2/apps/c/build-dev/airfoil/airfoil_plain && /home/shared/apps/core/cmake/3.12.0/bin/cmake -E cmake_link_script CMakeFiles/airfoil_sp_mpi_cuda.dir/link.txt --verbose=2
/shared/opt/xlC/16.1.0.0-180411/xlC/16.1.0/bin/xlc++_r  -qthreaded -qthreaded -O2 -g -Wall  -Wno-long-long  -Wl,-rpath=/opt/ibm/spectrum_mpi/lib -Wl,-export-dynamic CMakeFiles/airfoil_sp_mpi_cuda.dir/sp/airfoil_mpi_op.cpp.o CMakeFiles/airfoil_sp_mpi_cuda.dir/sp/cuda/airfoil_sp_mpi_cuda_generated_airfoil_kernels.cu.o  -o airfoil_sp_mpi_cuda -Wl,-rpath,/home/shared/apps/mpi/xl/16.1/spectrum_mpi/10.2/op2/dev/lib: -lcudart /home/shared/apps/mpi/xl/16.1/spectrum_mpi/10.2/op2/dev/lib/libop2_mpi_cuda.so /usr/local/cuda/lib64/libcudart_static.a -ldl -lrt -lhdf5::hdf5-shared -lmpiprofilesupport -lmpi_ibm /home/shared/apps/mpi/xl/16.1/spectrum_mpi/10.2/op2/dev/lib/libparmetis.so /home/shared/apps/mpi/xl/16.1/spectrum_mpi/10.2/op2/dev/lib/libmetis.a 
/usr/bin/ld: cannot find -lhdf5::hdf5-shared
make[2]: *** [airfoil/airfoil_plain/airfoil_sp_mpi_cuda] Error 1

I do not understand what -lhdf5::hdf5-shared is, but I noticed that references to *-static and *-shared files appear in many places, yet I do not seem to have any .so files with names like this. There’s something I do not understand here at a pretty fundamental level.

Please advise,
Robert

The files created by HDF5 in the ${INSTALL_DIR}/shared/cmake folder are for use by the find_package cmake command:
find_package (HDF5 NAMES hdf5 COMPONENTS C shared) to use the shared libraries
find_package (HDF5 NAMES hdf5 COMPONENTS C static) to use the static libraries
The following variables will be set: ${HDF5_FOUND} if that type of component is found, plus ${HDF5_static_C_FOUND} for the static component and ${HDF5_shared_C_FOUND} for the shared component.
See the HDF5Examples project (a version should be included with the HDF5 source downloaded from the website) for an implementation of using these files in a CMakeLists.txt file.

The hdf5::hdf5-shared and hdf5::hdf5-static are CMake target names.

Allen
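
For anyone landing here later, a minimal sketch of how a consuming project might use those config files. The project name `consumer`, the target `my_app`, and `main.c` are hypothetical; the find_package call, the HDF5_shared_C_FOUND variable, and the hdf5::hdf5-shared target name are the ones mentioned in this thread. It assumes the HDF5 install prefix is on CMAKE_PREFIX_PATH, or that HDF5_DIR points at the folder holding the config file.

```cmake
cmake_minimum_required(VERSION 3.12)
project(consumer C)  # hypothetical project name

# NAMES is a config-mode option, so this looks for the config file that
# HDF5 installed rather than CMake's bundled FindHDF5 module.
find_package(HDF5 NAMES hdf5 COMPONENTS C shared)

if(NOT HDF5_shared_C_FOUND)
  message(FATAL_ERROR "Shared HDF5 C library not found")
endif()

add_executable(my_app main.c)  # hypothetical target and source

# Link against the imported target, not a bare library name. CMake resolves
# the target to the real file on disk and applies whatever usage
# requirements (such as include directories) the package exports.
target_link_libraries(my_app PRIVATE hdf5::hdf5-shared)
```

The point of linking the target rather than a file path is that the config package, not the consuming project, decides where the library lives.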

@byrn, thanks a lot for the reply. I have found HDF5Examples in my HDF5 src directory, but I cannot see the use of find_package in the CMakeLists.txt there. Everything is about linking.

But I have now read a bit more about how find_package works, and I understand what NAMES and COMPONENTS are doing. Thanks, that was useful.

In OP2 cmake configuration I found this passage:

if(OP2_WITH_HDF5)
  if(NOT BUILD_SHARED_LIBS)
    set(HDF5_USE_STATIC_LIBRARIES)
  endif()
  find_package(HDF5)
  if(HDF5_FOUND)
    message(STATUS "HDF5 found")
    set(OP2_HDF5_DEFINITIONS ${HDF5_DEFINITIONS})
    set(OP2_HDF5_INCLUDE_DIRS ${HDF5_INCLUDE_DIRS})
    set(OP2_HDF5_LIBRARIES ${HDF5_LIBRARIES})
    # If HDF5 is build with parallel support it requires MPI, which needs
    # to be enabled
    if (HDF5_IS_PARALLEL)
      if (OP2_WITH_MPI)
        set(OP2_HDF5_INCLUDE_DIRS ${OP2_HDF5_INCLUDE_DIRS} ${MPI_INCLUDE_PATH})
        set(OP2_HDF5_LIBRARIES ${OP2_HDF5_LIBRARIES} ${MPI_LIBRARIES})
        # MPI is always built with HDF5 if available
        set(OP2_MPI_INCLUDE_DIRS ${HDF5_INCLUDE_DIRS} ${OP2_MPI_INCLUDE_DIRS})
        set(OP2_MPI_LIBRARIES ${HDF5_LIBRARIES} ${OP2_MPI_LIBRARIES})
      else()
        message(STATUS "HDF5 is built with parallel support requiring MPI, but MPI is disabled")
        message(STATUS "Disabling HDF5 support")
        set(OP2_WITH_HDF5 OFF)
      endif()
    endif()
  else()

I am guessing that I need to change the find_package line so that it finds the static or shared libraries according to the setup.

Correct.
Likely change the following to set the COMPONENTS part of find_package:
if(NOT BUILD_SHARED_LIBS)
set(HDF5_USE_STATIC_LIBRARIES)
endif()

Use message(STATUS "${HDF5_xxx}") to check the variables you need.

Allen
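
A small sketch of that kind of check, placed right after find_package. The variable names are ones that come up in this thread; which of them are actually set will depend on the HDF5 version, so the loop reports undefined ones rather than assuming they exist.

```cmake
find_package(HDF5 NAMES hdf5 COMPONENTS C shared)

# Report each variable of interest, distinguishing "defined but empty"
# from "not defined at all".
foreach(_var
    HDF5_FOUND
    HDF5_shared_C_FOUND
    HDF5_static_C_FOUND
    HDF5_INCLUDE_DIR
    HDF5_LIBRARIES)
  if(DEFINED ${_var})
    message(STATUS "${_var}=${${_var}}")
  else()
    message(STATUS "${_var} is not defined")
  endif()
endforeach()
```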

Right, I am beginning to see how outdated this CMakeLists.txt is. The variables were all wrong. In particular, I changed the references to HDF5_INCLUDE_DIRS to HDF5_INCLUDE_DIR and HDF5_LIBRARIES to HDF5_EXPORT_LIBRARIES. That moved things forward a bit. I’ve also done what I think you suggested with components, by doing the following:

if(NOT BUILD_SHARED_LIBS)
  # set(HDF5_USE_STATIC_LIBRARIES)
  find_package(HDF5 NAMES hdf5 COMPONENTS C static)
else()
  find_package(HDF5 NAMES hdf5 COMPONENTS C shared)
endif()

I am not sure what to do with HDF5_DEFINITIONS as I can’t find a corresponding variable. Should I just delete it?

But I am still getting stuck on more or less the same error:

/shared/apps/gcc-6/6.4.0/bin/gcc -fPIC -std=c99 -O2 -g -DNDEBUG -Wl,-rpath=/home/shared/apps/compiler/gcc/6.4/openmpi/3.1.0/lib -shared -Wl,-soname,libop2_hdf5.so -o libop2_hdf5.so CMakeFiles/op2_hdf5.dir/__/core/op_lib_core.c.o CMakeFiles/op2_hdf5.dir/op_util.c.o CMakeFiles/op2_hdf5.dir/op_hdf5.c.o -lhdf5-static -lhdf5-shared -lhdf5_tools-static -lhdf5_tools-shared -lhdf5_hl-static -lhdf5_hl-shared -lmpi 
/usr/bin/ld: cannot find -lhdf5-static
/usr/bin/ld: cannot find -lhdf5-shared
/usr/bin/ld: cannot find -lhdf5_tools-static
/usr/bin/ld: cannot find -lhdf5_tools-shared
/usr/bin/ld: cannot find -lhdf5_hl-static
/usr/bin/ld: cannot find -lhdf5_hl-shared
collect2: error: ld returned 1 exit status

And the -static/-shared puzzle is still there, as I cannot see any corresponding .a/.so files:

ls ${HDF_ROOT}/lib/
libhdf5.a      libhdf5_hl.so.100.2.0  libhdf5.so          libhdf5_tools.a           libhdf5_tools.so.101
libhdf5_hl.a   libhdf5_hl.so.101      libhdf5.so.100.2.0  libhdf5_tools.so          pkgconfig
libhdf5_hl.so  libhdf5.settings       libhdf5.so.101      libhdf5_tools.so.100.2.0

The find_package changes are correct.
Not sure what to do with HDF5_DEFINITIONS, since cmake knows the defines from the find_package.
The hdf5-static and hdf5-shared references are cmake targets, not library names.
The cmake file should use the targets in target_link_libraries commands.
Here is the text from the tools CMakeLists.txt:

add_library (${HDF5_TOOLS_LIB_TARGET} STATIC ${H5_TOOLS_LIB_SOURCES} ${H5_TOOLS_LIB_HDRS})
target_include_directories(${HDF5_TOOLS_LIB_TARGET}
    PRIVATE "${HDF5_TOOLS_LIB_SOURCE_DIR};${HDF5_SRC_DIR};${HDF5_BINARY_DIR};$<$<BOOL:${HDF5_ENABLE_PARALLEL}>:${MPI_C_INCLUDE_DIRS}>"
    INTERFACE "$<INSTALL_INTERFACE:$<INSTALL_PREFIX>/include>"
)
TARGET_C_PROPERTIES (${HDF5_TOOLS_LIB_TARGET} STATIC)
target_link_libraries (${HDF5_TOOLS_LIB_TARGET}
    PUBLIC ${HDF5_LIB_TARGET}
    PRIVATE $<$<BOOL:${HDF5_ENABLE_PARALLEL}>:${MPI_C_LIBRARIES}>
)

${HDF5_TOOLS_LIB_TARGET} is the tools library to be created (all target names are defined in the root CMakeLists).
The PUBLIC ${HDF5_LIB_TARGET} line is where the hdf5 library is linked; likely you would use: PUBLIC ${hdf5-static} (or hdf5-shared)
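
To make the target-versus-library distinction concrete, here is a small sketch. It assumes find_package has succeeded and defined the imported target hdf5::hdf5-shared; `demo_lib` and `demo.c` are hypothetical. As far as I understand it, target_link_libraries treats an item as a target only if a target of that name exists in the current build; otherwise the item can end up on the link line as a plain library name, which is how a flag like -lhdf5-shared appears.

```cmake
find_package(HDF5 NAMES hdf5 COMPONENTS C shared)

add_library(demo_lib SHARED demo.c)  # hypothetical library

if(TARGET hdf5::hdf5-shared)
  # The name is a known imported target, so CMake links the actual file
  # the target points to instead of emitting a -l flag with this name.
  target_link_libraries(demo_lib PUBLIC hdf5::hdf5-shared)
else()
  # Without this guard, an unknown name may be passed through to the
  # linker verbatim and fail with "cannot find -l...".
  message(FATAL_ERROR "Imported target hdf5::hdf5-shared is not defined")
endif()
```

Guarding with if(TARGET ...) turns a confusing link-time failure into a clear configure-time one.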

Thanks @byrn and sorry for being a bit slow. So I think I understand the difference between target name and library name. What I seem to be missing is why OP2 library picks one and doesn’t translate it into the other. Clearly the makefiles that are generated use target names.

The op2_hdf5 library is compiled in the externlib directory. The CMakeLists.txt there looks like this:

include_directories(${OP2_HDF5_INCLUDE_DIRS})
#add_definitions(${OP2_HDF5_DEFINITIONS})
add_library(op2_hdf5 ${COMMON_SRC} ${UTIL_SRC} op_hdf5.c)
target_link_libraries(op2_hdf5 ${OP2_HDF5_LIBRARIES})

# Add target to the build-tree export set
export(TARGETS op2_hdf5 APPEND
  FILE "${PROJECT_BINARY_DIR}/${OP2_TARGETS_EXPORT_SET}.cmake")

# Install
install(TARGETS op2_hdf5
  EXPORT ${OP2_TARGETS_EXPORT_SET}
  LIBRARY DESTINATION ${INSTALLATION_LIB_DIR} COMPONENT RuntimeLibraries
  ARCHIVE DESTINATION ${INSTALLATION_LIB_DIR} COMPONENT Development
)

When I print out ${OP2_HDF5_LIBRARIES}, which is set in the master file, I get:

-- Configuring OP2 HDF5 library
-- In externlib OP2_HDF5_LIBRARIES: hdf5-static;hdf5-shared;hdf5_tools-static;hdf5_tools-shared;hdf5_hl-static;hdf5_hl-shared;/home/shared/apps/compiler/gcc/6.4/openmpi/3.1.0/lib/libmpi.so

So target names! I am worried it picked both -static and -shared even though I requested only the C and shared components in find_package. Also, it did not translate it into library names. Is it meant to?

target_link_libraries(op2_hdf5 ${OP2_HDF5_LIBRARIES})

That line looks correct, as ${OP2_HDF5_LIBRARIES} should be CMake targets.

So how does OP2_HDF5_LIBRARIES get initialized? (It should contain only static or only shared.)

Allen

These are, I think, the relevant initializations:

find_package(HDF5 NAMES hdf5 COMPONENTS C shared)
...
set(OP2_HDF5_LIBRARIES ${HDF5_EXPORT_LIBRARIES})

if (HDF5_ENABLE_PARALLEL)
    if (OP2_WITH_MPI)
        ...
        set(OP2_HDF5_LIBRARIES ${OP2_HDF5_LIBRARIES} ${MPI_LIBRARIES})

When I print out HDF5_EXPORT_LIBRARIES I get everything, i.e.

HDF5_EXPORT_LIBRARIES=hdf5-static;hdf5-shared;hdf5_tools-static;hdf5_tools-shared;hdf5_hl-static;hdf5_hl-shared

What is really puzzling me is that you say target_link_libraries is meant to use the target name, so hdf5-shared instead of just hdf5. If I try to manually overwrite it, it comes up with the same error, saying cannot find -lhdf5.

Okay, I can see that part of the problem is HDF5_EXPORT_LIBRARIES, I think it was meant to be just HDF5_LIBRARIES. I changed it as I thought that it was the only relevant variable from the variables I printed out. I am using this to print everything:

find_package(HDF5 NAMES hdf5 COMPONENTS C shared)
get_cmake_property(_variableNames VARIABLES)
list (SORT _variableNames)
foreach (_variableName ${_variableNames})
    message(STATUS "${_variableName}=${${_variableName}}")
endforeach()

If I change it back to HDF5_LIBRARIES, then op2_hdf5 gets built, but the applications built against it complain about undefined symbols, and I can see that libop2_hdf5.so is not linked against hdf5. The only -l flag I can see in make VERBOSE=2 is -lmpi. I think this is because HDF5_LIBRARIES is empty.

HDF5_EXPORT_LIBRARIES is a list of everything packaged. From before:
find_package (HDF5 NAMES hdf5 COMPONENTS C shared) to use the shared libraries
find_package (HDF5 NAMES hdf5 COMPONENTS C static) to use the static libraries
The following variables will be set: ${HDF5_FOUND} if that type of component is found, plus ${HDF5_static_C_FOUND} for the static component and ${HDF5_shared_C_FOUND} for the shared component.
See the HDF5Examples project (a version should be included with the HDF5 source downloaded from the website) for an implementation of using these files in a CMakeLists.txt file.

What I didn’t say was that the libraries found will be in the following variables:
if (BUILD_SHARED_LIBS AND HDF5_shared_C_FOUND)
  set (LINK_LIBS ${LINK_LIBS} ${HDF5_C_SHARED_LIBRARY})
else ()
  set (LINK_LIBS ${LINK_LIBS} ${HDF5_C_STATIC_LIBRARY})
endif ()
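
Putting the pieces from this thread together, the HDF5 block in the OP2 master file might be restructured roughly as below. This is only a sketch: the variable names OP2_HDF5_LIBRARIES and OP2_WITH_HDF5 come from the OP2 snippet quoted earlier, HDF5_C_SHARED_LIBRARY and HDF5_C_STATIC_LIBRARY from the snippet just above, and the fallback branch that disables HDF5 is an assumption about the desired behaviour.

```cmake
# Ask only for the component that matches the kind of library being built.
if(BUILD_SHARED_LIBS)
  find_package(HDF5 NAMES hdf5 COMPONENTS C shared)
else()
  find_package(HDF5 NAMES hdf5 COMPONENTS C static)
endif()

# Pick exactly one HDF5 C library instead of the full export list, which
# mixes static and shared targets.
if(BUILD_SHARED_LIBS AND HDF5_shared_C_FOUND)
  set(OP2_HDF5_LIBRARIES ${HDF5_C_SHARED_LIBRARY})
elseif(NOT BUILD_SHARED_LIBS AND HDF5_static_C_FOUND)
  set(OP2_HDF5_LIBRARIES ${HDF5_C_STATIC_LIBRARY})
else()
  # Assumed fallback: switch HDF5 support off rather than fail the build.
  message(STATUS "No matching HDF5 C library found, disabling HDF5 support")
  set(OP2_WITH_HDF5 OFF)
endif()
```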

Okay, I got the point about HDF5_EXPORT... and I now have a structure which links either the static or the shared library depending on what was found. The verbose make now runs the following link command:

[ 32%] Linking C shared library libop2_hdf5.so
cd /home/rrs59-sxa03/build/op2/op2/c/build-ompi-gcc/src/externlib && /home/shared/apps/core/cmake/3.12.0/bin/cmake -E cmake_link_script CMakeFiles/op2_hdf5.dir/link.txt --verbose=2
/shared/apps/gcc-6/6.4.0/bin/gcc -fPIC -std=c99 -O2 -g -DNDEBUG -Wl,-rpath=/home/shared/apps/compiler/gcc/6.4/openmpi/3.1.0/lib -shared -Wl,-soname,libop2_hdf5.so -o libop2_hdf5.so CMakeFiles/op2_hdf5.dir/__/core/op_lib_core.c.o CMakeFiles/op2_hdf5.dir/op_util.c.o CMakeFiles/op2_hdf5.dir/op_hdf5.c.o -Wl,-rpath,/home/shared/apps/mpi/gcc/6.4/openmpi/3.1/hdf5/1.10.2/lib: /home/shared/apps/mpi/gcc/6.4/openmpi/3.1/hdf5/1.10.2/lib/libhdf5.so.100.2.0 -lmpi -ldl

This looks correct to me, as I can see a direct link to libhdf5. The library builds, but now if I build an application against that library I get errors about missing symbols such as H5open. I am afraid there’s a lot of misattribution going on on my side, as I wasn’t sure whether the lib or the app was built properly, but I think I will focus now on the app, assuming the lib is fine.

When I am building the apps there seems to be some interference with libop2_mpi.

/home/shared/apps/mpi/gcc/6.4/openmpi/3.1/op2/dev/lib/libop2_mpi.so: undefined reference to `H5T_NATIVE_FLOAT_g'
collect2: error: ld returned 1 exit status

By the way, OP2 is a DSL for making performance-portable unstructured mesh applications for different types of parallelism. Most libs wrap some parallel functionality provided by either OpenMP, MPI or CUDA. HDF5 integration is an important but not the main function.

Missing symbols usually mean a missing library on the link line. If I read the previous post correctly, you need to link the app/lib with libop2_hdf5 and libhdf5.so? Building shared libraries requires linking with all the shared libraries involved, if the app/lib being produced calls a function in the dependent library.

Allen
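
One way to act on that, sketched under assumptions: if the MPI variant of the OP2 library itself calls HDF5 functions, it could declare that dependency so that anything linking against it inherits HDF5 on its link line. The target name `op2_mpi` is assumed from the libop2_mpi.so file name, and OP2_WITH_HDF5 and OP2_HDF5_LIBRARIES are the variables from the OP2 snippet quoted earlier; whether OP2 intends this coupling is not established in this thread.

```cmake
# Only relevant when OP2 was configured with HDF5 support.
if(OP2_WITH_HDF5)
  # PUBLIC records HDF5 both as a link dependency of op2_mpi itself and as
  # a usage requirement for consumers, so applications linking op2_mpi
  # also get HDF5 without listing it themselves.
  target_link_libraries(op2_mpi PUBLIC ${OP2_HDF5_LIBRARIES})
endif()
```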

I think I know what happened. HDF5 functions got lumped together with the OP2 MPI libs. I am not sure if this is intentional or not. Probably what’s happening is that the library always gets linked against HDF5, whereas applications are linked against it only if they need it, but as soon as they link against any MPI library I run into the above problems.

This is the output of nm.

nm -u /home/shared/apps/mpi/gcc/6.4/openmpi/3.1/op2/dev/lib/libop2_mpi.so | grep H5open
                 U H5open

@byrn I think I am nearly there, but still may need a bit of help. At this point, I have managed to compile some HDF-enhanced OP2 applications, but I needed to do some manual fixes and I am trying to get rid of it now.

There are two issues. One is that the application cmake doesn’t account for MPI-HDF option. If the library is compiled with HDF support then libop2_mpi.so contains undefined symbols for HDF5. I need to check with the developers if that’s intentional or not.

Secondly, the library build process generates OP2LibraryDepends-relwithdebinfo.cmake, which is later picked up by the app build process. It contains the target name rather than a specific library file. I manually changed that to point to the shared library, but there must be a better way. Shouldn’t the build process generate a link to the specific .so? Or should the app perform a find_package for the specific component?

set_target_properties(op2_hdf5 PROPERTIES
  IMPORTED_LINK_INTERFACE_LIBRARIES_RELWITHDEBINFO "hdf5::hdf5-shared;/home/shared/apps/compiler/gcc/6.4/openmpi/3.1.0/lib/libmpi.so"
  IMPORTED_LOCATION_RELWITHDEBINFO "/home/shared/apps/mpi/gcc/6.4/openmpi/3.1/op2/dev/lib/libop2_hdf5.so"
  IMPORTED_SONAME_RELWITHDEBINFO "libop2_hdf5.so"
  )

I would think (not knowing the OP2 app) that if it did a find_package it shouldn’t need to set_target_properties like that for the HDF5 info. The imported properties should be transitive - or maybe that is the old style cmake??

Allen
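
A sketch of what that could look like on the OP2 side, assuming OP2 ships (or could ship) a package config file that the application build loads. The file names OP2Config.cmake and OP2LibraryDepends.cmake are hypothetical, the latter guessed from the OP2LibraryDepends-relwithdebinfo.cmake file mentioned above. The idea is that the config file re-runs the HDF5 lookup in the application's build, so that the hdf5::hdf5-shared name stored in the exported op2_hdf5 properties resolves to a real imported target instead of being passed to the linker.

```cmake
# OP2Config.cmake (hypothetical file name)
include(CMakeFindDependencyMacro)

# Recreate the imported HDF5 targets in the consuming project. The extra
# arguments are forwarded to find_package.
find_dependency(HDF5 NAMES hdf5 COMPONENTS C shared)

# Then load the exported OP2 targets, which refer to hdf5::hdf5-shared.
# The file name is an assumption, see the note above.
include("${CMAKE_CURRENT_LIST_DIR}/OP2LibraryDepends.cmake")
```

With something like this in place, the generated export file would not need to be edited by hand to point at a specific .so.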

@byrn sorry, I had to drop this for a while, but I am trying to fix it again. What do you mean by “old style cmake”? There’s definitely some legacy here I am struggling with.

Also I am not sure if I explained it well enough. This is a two step process. I call cmake and make the library and then separately I call cmake+make to build the apps. The latter process ingests OP2LibraryDepends-relwithdebinfo.cmake.