Can we retire the Autotools?

Got it. Yeah, my only concern is that important, older machines can often be in use long after most people have moved on. I feel like this is especially true in the sciences. I recall being at a nuclear physics experiment where the magnet control program ran on a 486 running Windows 3.1 because the developer wrote low-level DOS code for that platform and then the source code was lost.

1 Like

If you must switch to CMake, please don’t require a version more recent than that in Debian stable (or even oldstable), else you’ll require people to compile it which, believe me, is not a pleasant experience.

We don’t check the generated files into maintenance branches

Why not? You can mark them as generated in .gitattributes so they don’t clog up PRs, you can put them in LFS if they’re largish (and configure is what, a few hundred KB?)
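For reference, a sketch of what those `.gitattributes` entries could look like (the `linguist-generated` attribute is a GitHub convention that collapses the marked files in PR diffs; `-diff` suppresses textual diffs for them — exact file list here is illustrative):

```
# Hypothetical .gitattributes entries marking Autotools outputs as generated.
# linguist-generated collapses these files in GitHub PR views;
# -diff suppresses textual diffs for them.
configure       linguist-generated=true -diff
Makefile.in     linguist-generated=true -diff
aclocal.m4      linguist-generated=true -diff
```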

+1 for retiring autotools in favour of CMake.

+1 for @w.benger's suggestions too. It's much better not to do configuration-time introspection whenever possible (though of course there may sometimes be a need to fall back on it as a last resort). It's just much friendlier for cross-compilation in general and Universal Binaries on macOS specifically. I'm talking about things like introspecting the size of int, whether the system is big/little endian, or whether some header or function exists. It's much better to use __has_include like he said, or things like #ifdef __APPLE__
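As an illustrative sketch (not HDF5's actual code): checks like these happen entirely at compile time, so they need no configure step and behave identically under cross-compilation. `__has_include` is standard in C23 but has long been a GCC/Clang extension, and `__BYTE_ORDER__` is a GCC/Clang predefined macro, so both are guarded.

```c
/* Compile-time introspection: no configure step, cross-compiles cleanly.
   __has_include is standard C23 but a long-standing GCC/Clang extension,
   so its use is guarded. */
#if defined(__has_include)
#  if __has_include(<unistd.h>)
#    define HAVE_UNISTD_H 1
#  endif
#endif

/* Detect endianness from predefined macros instead of running a
   configure-time test program (which breaks cross-compilation). */
static int is_little_endian(void)
{
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
    return 1;
#else
    return 0;
#endif
}
```

The same answer could be computed at runtime by inspecting the bytes of an integer, which is exactly the kind of probe-program step this avoids.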

@mike.jackson if he’s saying what I’m saying, it’s not that we don’t still need CMake, it’s just that we shouldn’t use it for some of what it’s being used for currently.

1 Like

At NOAA we use the autotools build, but if HDF5 switches to CMake exclusively, we can deal with it. (How does the HDF5 spack build work?)

I understand the tension of maintaining two build systems - it’s what’s done on netcdf-c, netcdf-fortran, and PIO, all of which I contribute to. If only one build system were used, that would be a huge win.

I don’t understand how a package like HDF5 or netCDF could build without configuration-time introspection. Nor do I think that platform-dependent pre-processor symbols should be (re-)introduced into our codebases. We got away from that once, and should not go back.

Furthermore, shared library builds are very common and need to be supported, as well as static builds. Autotools is good at automatically supporting both, CMake can support both but needs a little more effort. Both are essential.

I should have started by thanking you, Dana Robinson @derobins, for this idea and your efforts! :wink:

HDF5 is vital software for so many systems! Keep up the good work!

2 Likes

The spack hdf5 package was switched to use CMake at least a year ago.

1 Like

I don’t understand how a package like HDF5 or netCDF could build without configuration-time introspection.

Probably it can’t, fully.

I just mean to say that where there is the choice of configuration-time introspection vs compile-time customization, that the latter should be preferred. A concrete example is bug #566 where a configuration-time program is built and run to look at what sizeof(long double) is. It’s just fundamentally at odds with cross-compilation. Better would be to have code like #if __x86_64__ ... #elif __arm64__, etc.

Of course it’s even better still to not have such conditionalizations at all where possible. This is easier than in the past, now that we have things like standard fixed-size integer types, C++11 STL threads, etc.
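To make that concrete (a sketch, not code from #566): `sizeof` is an integer constant expression, so even a cross-compiler can evaluate it at compile time without running anything on the target, and C99 fixed-width types remove the question entirely.

```c
#include <stdint.h>

/* sizeof is a compile-time constant, so a cross-compiler can check it
   without building and running a configure-time probe program. */
_Static_assert(sizeof(long double) >= sizeof(double),
               "long double must be at least as wide as double");

/* With C99 fixed-width types there is nothing left to introspect:
   int64_t is exactly 64 bits wherever it exists. */
typedef int64_t my_offset_t;   /* hypothetical typedef for illustration */
```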

Nor do I think that platform-dependent pre-processor symbols should be (re-)introduced into our codebases. We got away from that once, and should not go back.

Do you have some example of where configuration-time introspection is preferable to compile-time customization?

1 Like

+1 from me on removing Autotools support. I know what a pain it can be to maintain these parallel structures.

CMake is slowly increasing its footprint in the scientific community, so it's a safe choice. For those systems where a recent CMake is not available out of the box in the repos (e.g. RHEL 7 plus its compatible siblings such as CentOS, which only carry very old CMake versions by default), it is a great advantage that Kitware ships very good, compatible CMake binary tarballs. The only required installation step is tar -xf cmake-${CMAKE_VER}-linux-x86_64.tar.gz -C /usr/local --strip-components=1 and you have a perfectly working, up-to-date CMake installation. Not all software packages are built that way, but CMake is, and that makes it really easy to manage in automated build environments, pipelines, and workflows.
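For completeness, the whole install could look something like this (a sketch: the version number is only an example, and the URL follows Kitware's usual GitHub release naming, which you should verify for the release you pick):

```shell
# Example: install a Kitware binary CMake tarball into /usr/local.
# CMAKE_VER is an example value -- pick the release you actually need.
CMAKE_VER=3.28.3
curl -LO "https://github.com/Kitware/CMake/releases/download/v${CMAKE_VER}/cmake-${CMAKE_VER}-linux-x86_64.tar.gz"
sudo tar -xf "cmake-${CMAKE_VER}-linux-x86_64.tar.gz" -C /usr/local --strip-components=1
cmake --version   # should now report the freshly installed version
```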

1 Like

I think you mean it's a safe choice for that particular community. HDF5 is used much more broadly than just the scientific community, though, and I think that is part of the conundrum here.

I keep hoping it would be possible to deploy the HDF5 source with the requisite CMake embedded within it, such that when some developer out in the wild typed cmake ../hdf5/src, the first thing it would do is check whether that source tree included a newer CMake, and then build that CMake before proceeding with the CMake of HDF5 using the just-built CMake. For that to work, the older CMake versions already out there in the wild would have to behave that way, and they don't.

CMake is a compiled binary application, so distributing a working CMake with the HDF5 source for all major platforms will never be feasible.

My point is that a working and updated CMake is extremely easily available, though.

To be clear…I wasn’t proposing that HDF5 distribute binary CMake. I agree…that wouldn’t be feasible.

Yes, CMake is available everywhere. The challenge is that if HDF5 uses bleeding-edge CMake functionality, that functionality will almost certainly not be available everywhere (maybe even anywhere), and the first thing any package builder will be confronted with when they try to CMake HDF5 is that their CMake is too old.

If, however, CMake were designed to look for CMake source embedded in the to-be-built package, then when the already-installed CMake is too old, it could opt to build the embedded CMake (from source) and, once done, launch that CMake on the package.

But already-deployed CMake versions would have to be designed to do that, and they aren't. Still, it's certainly something Kitware could add, and it would potentially go a long way toward mitigating these kinds of concerns.

Another nail in the Autotools coffin is their minimal, deprecated support for Java

https://www.gnu.org/software/automake/manual/html_node/Java.html

The one area where the Autotools may have an advantage is cross-compilation. They have some facilities for extracting values from compiled test binaries, so that test programs do not need to be run on the target hardware.

There was a fork by MIT Professor Steven G. Johnson attempting to make progress on that. I wonder if there is a way to incorporate those techniques into a CMake-based build process.
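The classic Autotools trick here (in the style of AC_COMPUTE_INT; sketched from memory, not taken from that fork) is to turn "what is this value?" into a series of "does this compile?" questions, which a cross-compiler can answer without ever executing a target binary:

```c
/* conftest.c-style probe: this translation unit compiles if and only if
   the tested condition holds, because a negative array size is a
   compile error.  A driver script binary-searches over CANDIDATE,
   re-invoking the (cross-)compiler each time, and never runs anything
   on the target.  The array itself is never used. */
#ifndef CANDIDATE
#define CANDIDATE 16
#endif

static char probe_ok[1 - 2 * !(sizeof(long double) <= CANDIDATE)];
```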

Well, as an R package developer like @grimbough, I also prefer to keep the Autotools, and I agree with @j.j.green that the configure script should always be pushed, even to maintenance branches. You already use an .sh script for running the Autotools; why not also create an .sh script for checking the generated files? It is not that much trouble, plus you would be testing that the Autotools files are working.

As @grimbough said, R, especially on the Windows platform, already expects users to have bash (through MSYS) installed on their machines, so the configure scripts work fine out of the box, but CMake is another story. I can name some really active and important scientists using R on Windows who don't have any clue how to compile, or what compiling even means, so removing the Autotools will just create another barrier to open science. I don't know, though, how important open science is to The HDF Group.

I think there is some confusion about CMake. The HDF5 developers could, by policy, support only a specific version of CMake that is known to be supported and usable on the various platforms and systems in production use. By going with bleeding-edge CMake, the HDF5 developers would most certainly break a LOT of existing build systems. One could select a CMake version old enough to be supported across the community but new enough to have the bare minimum of features needed to compile HDF5.

Maybe the R package developers could drum up the funding to keep the Autotools code maintained and in place, if they are the only folks demanding its inclusion in the HDF5 source?

@caiohamamura Wouldn't R hide the CMake bits from the developer anyway? Why would someone using R need to know about compiling anything? Isn't it R's job to do that for me? I would argue that R not moving to CMake for its internal compilation is the barrier to open science. Look at something like vcpkg, which is basically built on top of CMake. The first thing it does is validate that a new enough version of CMake is available; if not, it pulls a version, compiles it, and makes it available.

We do not automatically require the latest CMake version; we specify the minimum we require, which is currently 3.18 in develop. We do try to move the minimum after considering the advantages a newer CMake has (and this can be platform-specific) and what looks to be available and in use at a few known customer sites. Yes, some features may only be available in later CMake (presets, for example), but those are optional.
When we released the 1.14 branch, the minimum was changed from 3.12 to 3.18; the other release branches stayed at 3.12. We do plan to add CI for testing the minimum CMake; it is on our to-do list.
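In CMakeLists.txt terms, that policy is just the usual first line (3.18 as stated above; the range form, available since CMake 3.12, additionally records the newest version the project is known to work with):

```cmake
# Require the stated minimum; older CMake refuses to configure with a clear error.
cmake_minimum_required(VERSION 3.18)

# Alternatively, a version range (min...policy-max form):
# cmake_minimum_required(VERSION 3.18...3.28)
```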

@caiohamamura - We don’t push generated Autotools files like configure and Makefile.in to non-release branches because they generate a lot of repository churn when people create PRs on different platforms. It was a huge problem before we switched to not checking in generated files. Non-release branches aren’t for public consumption, anyway. Anyone puzzled by the lack of a configure file probably shouldn’t be building that branch in the first place.

1 Like

R package development is community-based; although there are some companies behind some packages, I am not in any of them. R packages are like Python packages: they require that the package developers handle the compilation of a dynamic library which R will rely on, so that is not hidden from the developers at all. Actually, the R core maintainers prefer that package developers not rely on any build system besides raw configure scripts and makefiles. They do provide some containers and VMs that test whether the packages under development can be installed within them, and some of those do not have the CMake build system installed, nor do they provide any means of installing it.

I do agree, though, that they could/should assume CMake is a must-have on most systems, and so they should support it as a build system for R packages.

The most recommended way of linking external libraries now is using pkg-config, and it works even on Windows systems because they provide their own flavor of MinGW with some preinstalled tools. Unfortunately, HDF5 won't work there because it does not ship the *.pc files.

Linking to HDF5 on Windows is now working within R; there is actually an hdf5.pc file in the pkgconfig directory now. So my vote is to keep up at least the support for linking through pkg-config. I think it is okay to drop the Autotools option for compiling the HDF5 library itself, but support for hdf5.pc and pkg-config is a must.
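For anyone following along, consuming that hdf5.pc with pkg-config looks roughly like this (a generic sketch; the module name hdf5 assumes the file is named hdf5.pc and is on PKG_CONFIG_PATH, and the flag values shown are only examples):

```shell
# Query compile and link flags from hdf5.pc (name assumed; adjust
# PKG_CONFIG_PATH if the file is in a non-default location).
pkg-config --cflags hdf5        # e.g. -I/usr/include/hdf5/serial
pkg-config --libs hdf5          # e.g. -L/usr/lib -lhdf5

# Typical use in a compile line:
cc myprog.c $(pkg-config --cflags --libs hdf5) -o myprog
```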