Can HDF5 parallel be used as a superset of the serial version?

Hi,

On a multi-node cluster, is it necessary to have both an installation built with "--enable-parallel" and one built without it?

In our current installation we use Lmod modules. We have a module named "hdf5" for serial use that points to a build without the "--enable-parallel" flag, and a module named "hdf5parallel" for use with parallel applications.

The question is: are these two distinct installations necessary, or can we have just a parallel version for use with both serial and parallel applications?

Are there any downsides to having just one installation with "--enable-parallel"?

Thank you!

@raghu.reddy, building HDF5 with --enable-parallel creates a library with, in some cases, very different code paths. For one, you take on a dependence on MPI. In addition, certain feature combinations are not currently supported in parallel and might produce confusing error messages. In other words, a single-rank MPI application linked against a parallel build of HDF5 is not the same as a sequential build of the library running in the context of a single-rank MPI job. You might think that you aren't using any parallel features, but code guarded by #ifdef H5_HAVE_PARALLEL sets the tone even for a single rank with a parallel build. If your users aren't trying to exercise features that aren't fully supported in parallel, then everything should behave as in the sequential case, even with a parallel build.

In that sense, the HDF5 parallel version is NOT a superset of the serial version. They are close, and certainly share the file format, but the relationship is not that of a superset to a subset.
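If it helps to see which kind of build a given module actually exposes, here is a minimal sketch that tests the H5_HAVE_PARALLEL macro mentioned above at compile time. The function names and the SKETCH_HAVE_HDF5_HEADERS macro are made up for illustration, and the __has_include guard is only there so the file still compiles on a machine without HDF5 headers. With a parallel build, hdf5.h pulls in mpi.h, so you would most likely need to compile with the MPI or HDF5 compiler wrapper from the loaded module.

```c
/* Sketch: report whether the HDF5 headers on the include path come from a
 * parallel or a serial build. Only the preprocessor is consulted; no HDF5
 * function is called, so nothing has to be linked. */

/* __has_include is supported by GCC and Clang and standardized in C23.
 * If it is unavailable, or hdf5.h is not found, the result is "unknown". */
#if defined(__has_include)
#  if __has_include(<hdf5.h>)
#    include <hdf5.h>  /* brings in H5pubconf.h, which defines H5_HAVE_PARALLEL */
#    define SKETCH_HAVE_HDF5_HEADERS 1  /* hypothetical macro, local to this sketch */
#  endif
#endif

/* Hypothetical helper: 1 = parallel build, 0 = serial build,
 * -1 = HDF5 headers were not found when this file was compiled. */
int hdf5_build_is_parallel(void)
{
#if !defined(SKETCH_HAVE_HDF5_HEADERS)
    return -1;
#elif defined(H5_HAVE_PARALLEL)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: human-readable form of the result above. */
const char *hdf5_build_description(void)
{
    switch (hdf5_build_is_parallel()) {
    case 1:
        return "parallel (H5_HAVE_PARALLEL is defined)";
    case 0:
        return "serial (H5_HAVE_PARALLEL is not defined)";
    default:
        return "unknown (hdf5.h not found at compile time)";
    }
}
```

Printing hdf5_build_description() from a small main(), once with each module loaded, should show which code paths an application compiled against that module will get. The answer is fixed when the application is compiled and does not depend on how many ranks it later runs with.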

OK?

Best, G.

@gheber Thank you so much for that response! That is very helpful to know the difference! Very much appreciated!