Announcing development of the Enterprise Support Edition of HDF5

Hey, Sean.

Security patches go to ESE first and then to CE “periodically”? What does that
mean exactly?

It’s actually 2 things:

  1. Between each major release to Community Edition, there are interim quarterly + hotfix releases focused on maintenance, bugs, and security.
  2. The next major release to Community Edition then rolls up ALL the Enterprise Support Edition code (not just security patches).

So yes, the net effect is that security patches first flow to ESE and then eventually to CE.

Why? Because it takes a ton of work: isolating, duplicating, identifying, prioritizing, fixing, patching, testing, building a new release, etc. For those who care about this predictable pace of security-focused work, Enterprise Support Edition provides the desired solution. For those who don’t need that, Community Edition will always be there.

Hi,
I understand that people are concerned. I presume the HDF Group also needs some money to continue development. A reasonable model is that HPC centers, or whoever really needs tight support (because this means money to them), ask their vendor to provide the support, and the vendor in turn may purchase it from the HDF Group.

In the best case, that does not take away any resources from the community version but instead increases the overall development effort, accelerating it. In the past, support has always been best effort for them, as for most open source developers.

This won’t stop in the future, just as it never did when they were awarded a grant from a public body, but that model no longer scales because the funding structure has changed. A fraction of resources will still be used for free support in the future. If many users benefit so much that they can pay, even people who rarely use HDF benefit. In that sense, I’m not worried a bit as long as the code is open, and I hope users ask their vendors for support if they really need it, and that vendors are clever enough to buy it if they can’t compete on price. Alternatively, you can always become a developer or hire someone who works on your behalf on such open codes. The question is, will that be as cost efficient as outsourcing it to the HDF Group?

That is my limited understanding of the rationale, at least. David, feel free to correct me here. I wish you luck to sustain the development. I also hope it can be well communicated that this will mean a chance for power users while it does not reduce development for the community - it will remain open source.

Regards

Julian

Option #1: all this means is that the Enterprise Support Edition subscription includes end-user assistance not just for HDF5 but also related key tools in the Python stack: h5py, PyTables, pandas, and xarray.

OK, great :). My reading of the word “support” was a little ambiguous since there are two meanings of that term.

I hope that this new initiative helps the long-term sustainability of the HDF group, which will of course benefit all downstream users.

I do have concerns about an intentionally slower release cycle for the community edition. A slow release cycle is not very encouraging for outside contributors to open source software – why would I contribute a bug report or patch when it will take up to 1-2 years to make it into a release for non-enterprise users? It is also quite common (for software in general, not HDF5 in particular) that initial major releases with new features are unusable due to new bugs. Without follow-on bug fix releases, the community edition might not be usable at all.

Fortunately HDF5 is pretty stable at this point, but I do think the stick of no bug/security fixes will encourage the scientific community to switch to alternative file-formats – or maybe even consider forking HDF5. This pace encourages workarounds in downstream libraries that use HDF5 rather than true fixes, which is not great for the ecosystem. No new features from the HDF group would be fine, but not releasing bug/security fixes sends a message to the scientific community that you don’t care about the value they provide.

Most of the responses I’ve seen on Twitter are along the same lines:
https://twitter.com/hdf5/status/993902684884426758
https://twitter.com/njgoldbaum/status/993915312843165696

Finally, about the work of making a release. Yes, I agree that maintaining software is a ton of work, and largely a thankless task. But in my experience, the pain of making a release is mostly proportional to the number of changes/bug-fixes that go into it. With appropriate automation (e.g., for the build process), the incremental work of issuing a release is pretty minimal. And if you’re not testing that build process on a frequent basis, you are going to have a very painful time when you discover everything in your tool-chain that has broken over the past 1-2 years all at once, rather than incrementally as it arises.

Hi, Julian… thanks for the thoughtful comments.

A reasonable model is that HPC centers or whoever really needs tight support
(because this means money to them) ask their vendor to provide the support,
which in turn may purchase this from the HDF group.

I think that’s a great suggestion: HDF5 is standard software that ships on most HPC systems, largely because it’s considered “critical software” and just needs to run and run well (e.g. no one wants to buy an amazing laptop that can’t run a browser).

But the paradox is that while the HDF Group bears the full cost of making great software and supporting users when it doesn’t work, most HPC system integrators contribute nothing back to the HDF community: code, assisting users, investment, etc. In some cases, the HPC system integrator will actually charge the client for HDF support, even though they obviously can’t support or fix anything in the code.

In my opinion, HPC buyers and users can help by driving the following:

  • Does my HPC system include support for critical software like HDF?
  • If yes, then does my vendor actually work with the HDF Group or is this an empty promise?

This won’t stop in the future, just as it never did when they were awarded a grant
from a public body, but that model no longer scales because the funding structure changed.

In general, we don’t get any grants. What we do have are wonderful and amazing clients who pay us for our consulting expertise in 3 key software areas: 1. HPC, 2. Big Data, 3. Metadata. In rare cases, our consulting work actually includes work inside the library, often to extend functionality. But typically, it’s to do work outside the library, which means that all the hard work inside the library has to be funded by us. We really are passionate about the work we do with our clients to support their mission: it’s just challenging to align to our mission of maintaining and extending HDF5.

I wish you luck to sustain the development. I also hope it can be well communicated
that this will mean a chance for power users while it does not reduce development
for the community - it will remain open source.

HDF5 Community Edition will remain free and open source, and it is my sincere belief that folks like you in the community will help advocate for us to get on a sustainable path.

Thanks!
– Dave

I’m also a bit worried about the split between ESE and CE.

So far the discussion has mostly been about bug fixing, but I’m wondering how new features will be made available. Probably they will first appear in ESE and only (much) later in CE.

I assume that some features added to ESE might cause changes in the file format. If such features are only added to CE later, such HDF5 files will not be readable by CE for some time. This would be very bad in cases where an organisation using ESE exports HDF5 files to clients using CE.

In particular I’m thinking of the SWMR feature which is being worked on and has quite some impact on the file format. I assume that the full SWMR feature will only appear in ESE and only later in CE. Most likely in the future more such cases will occur.

I’m also wondering about the cooperation between HDF5 and ADIOS. I understood that the two systems will be made able to read each other’s file formats. Will that also appear only in ESE and later in CE?

Regards,

Ger van Diepen

Hello!

Thank you for bringing up this issue!

The HDF Group is committed to providing full file format compatibility between CE and ESE.

Features that require file format extensions will be released in CE first. All file format compatibility issues triggered by the addition of new features will be addressed in CE in a timely manner to ensure full compatibility.

Thank you!

Elena

Elena,

Thank you for clarifying that the HDF5 file format will always be compatible between ESE and CE.

However, new features will introduce new bugs, sometimes making a new feature unworkable. Since bug fixes will appear much later in CE, there is still quite a chance that a new feature (with changes in the file format) can be used in ESE, but not in CE. How will the HDF5 group address such issues?

Regards,

Ger

Ger,

Thank you for the good questions!

Before any new feature is released, there is an extended period during which it is tested internally and is available for testing by the community. Our group also publishes design documents (RFCs, Requests for Comments) and user documentation during the development cycle to assist with feature development, testing, and adoption.

During the 20 years of HDF5’s existence we have had 5 major releases (HDF5 1.2.0, 1.4.0, 1.6.0, 1.8.0, 1.10.0), and we have rarely had issues with the new features per se. Software always has bugs and issues. During all these years we have relied on the community to help us find issues in the source and in the documentation. We hope that the community will be even more involved now to make every CE release as stable as possible. We are also committed to resolving any critical bugs, such as file format issues, data corruption issues, and backward/forward compatibility issues, since the mission of our company is to provide free and open access to data stored in HDF5.

If any critical issue emerges, The HDF Group developers will work with the community on a CE release that addresses the issue. This is our commitment.

All,

Please don’t hesitate to ask questions about CE and ESE. This model is new to us and we definitely didn’t think through all “use cases”. Having an open and frank discussion and setting expectations will help with a smooth transition.

Thanks again for your support!

Elena

As promised in our initial announcement to the community, The HDF Group will continue to provide more details on the coming launch of HDF5 Enterprise Support Edition (ESE) and HDF5 Community Edition (CE). Please join us this Friday, May 18th, from 1:00 PM - 2:00 PM CT as David Pearah (@david.pearah), CEO of The HDF Group, presents an overview of our new, open source CE and ESE models.

Register now for the HDF5 Enterprise Support Edition (ESE) and HDF5 Community Edition (CE) Webinar on May 18, 2018, 1:00 PM CT at:

https://attendee.gotowebinar.com/register/7802714344088511233

After registering, you will receive a confirmation email containing information about joining the webinar.

If you are unable to attend or would like to join the discussion, please feel free to continue the conversation here on the forum. We appreciate your participation in these important discussions about the future of HDF5.

The HDF Group

The HDF Group CEO Dave Pearah presented the new HDF5 Enterprise Support Edition and HDF5 Community Edition in a community webinar on May 18, 2018; that video is attached below. In the interest of efficiency, we clipped the live Q&A session from the webinar and have instead compiled the questions we’ve received on Twitter, on the forum, and during the webinar into one continually evolving FAQ on our site: https://www.hdfgroup.org/faq-enterprise-support-edition/

We would love to have your contribution to the discussion. Please feel free to chime in here on the forum; it’s the best avenue for the community and The HDF Group to respond and keep track of the discussion.

Currently the only way for someone who is not HDF Group staff to submit a patch to HDF5 is via email or the forum. Is the plan to allow external user accounts on your Bitbucket server so we can submit proper pull requests to HDF5 CE? I imagine this would happen in conjunction with opening up JIRA to external users.

Regards,
Christian

This is my number one pet peeve with HDF5. It’s immensely annoying with all these mail threads that end with “We’ve filed this as issue XXXX”, which is essentially a black hole to an outsider. There’s no way to subscribe to bugs, take part in or observe the process of their resolution or anything.

Elvis,

We have been working on opening HDF5 JIRA (20 years of issues :wink:) to the public. We understand your disappointment, but we need time to accomplish the job. I hope the work will be completed by September 2018, but I cannot promise.

Thank you for your patience!

Elena

Yep Elena, I know you’re working on it, which is great. And I understand it probably takes some work considering the long history, established processes, etc.

All,

I would like to remind everyone that HDF5 is on GitHub and we encourage you to contribute to it. If you find a problem please feel free to open an issue on GitHub.

Our JIRA HDF5 tickets are open to the community. You need to register on our website to get access. If you know that an issue exists but you cannot see it, please send an email to help@hdfgroup.org and we will fix the permissions on the issue.

And finally, please remember that there is only one HDF5 core library and it stays open and free of charge.

Thank you!

Elena

Security patches go to ESE first and then to CE “periodically”? What does that mean exactly? Sounds horrible.

I’m not sure what you are talking about, and maybe you aren’t either. There is no ESE. Nor is there a CE. There is only one HDF5 code base with multiple release branches. OK? G.