Pure Go HDF5 Library - Production-Ready Write Support (v0.11.4-beta)
Project: scigolib/hdf5 on GitHub (github.com/scigolib/hdf5) - a modern pure Go implementation of the HDF5 file format
Latest Release: v0.11.4-beta (November 2, 2025)
Status: Production-ready for read operations, ~90% write support complete
Overview
I’m excited to share a major milestone for the HDF5 Go library - a pure Go implementation of the HDF5 file format with no CGo dependencies. After intensive development, the library is production-ready for read operations, and write support is roughly 90% complete with broad feature coverage.
Notably, this implementation reads all three superblock versions it targets (v0, v2, v3) and writes v0 and v2. Future format versions, including HDF5 2.0 when it arrives, can be added as library updates without breaking the API.
What Makes This Implementation Unique
This is likely the most modern pure Go HDF5 implementation available today, featuring:
Future-Proof Architecture
- All HDF5 Format Versions Supported:
- Superblock v0 (HDF5 1.0-1.6) - Legacy format
- Superblock v2 (HDF5 1.8+) - Modern streamlined format
- Superblock v3 (HDF5 1.10+) - SWMR support
- Ready for HDF5 2.0 - Future format versions will be added in v1.x updates
- Backward & Forward Compatible - Read files in Superblock v0, v2, or v3; write Superblock v0 (legacy) or v2 (modern)
- Ultra-Modern Library - All formats supported from day one!
This means files created today will remain compatible when HDF5 2.0 arrives, and the library will support the new format through simple updates rather than major rewrites.
Advanced Performance Optimization
- Smart B-tree Rebalancing - Automatic optimization with 4 modes:
- Default - No rebalancing (like reference C library)
- Lazy - Batch processing (10-100x faster deletions)
- Incremental - Background rebalancing (zero pause time)
- Smart - Auto-tuning with workload detection
Manual and automatic rebalancing strategies allow users to optimize for their specific workloads - from append-heavy to deletion-intensive patterns.
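To make the idea behind the Lazy mode concrete, here is a minimal sketch of batched rebalancing. It is not the library's implementation: the `lazyRebalancer` type, its `threshold` field, and the counter standing in for the real tree work are all hypothetical names chosen for illustration.

```go
package main

import "fmt"

// lazyRebalancer is a hypothetical illustration of lazy (batched) rebalancing:
// deletions are only counted, and the expensive rebalance step runs once per
// batch instead of once per deletion.
type lazyRebalancer struct {
	threshold  int // pending deletions that trigger a rebalance (assumed value)
	pending    int // deletions recorded since the last rebalance
	rebalances int // how many times the expensive step has run
}

// recordDelete notes one deletion and rebalances only when the batch is full.
func (r *lazyRebalancer) recordDelete() {
	r.pending++
	if r.pending >= r.threshold {
		r.flush()
	}
}

// flush runs the (simulated) rebalance if any deletions are pending.
func (r *lazyRebalancer) flush() {
	if r.pending == 0 {
		return
	}
	r.rebalances++ // stand-in for the real tree restructuring work
	r.pending = 0
}

func main() {
	r := &lazyRebalancer{threshold: 100}
	for i := 0; i < 1000; i++ {
		r.recordDelete()
	}
	r.flush() // handle any leftover deletions before closing
	fmt.Println("deletions: 1000, rebalance passes:", r.rebalances)
}
```

The point of the sketch is the ratio: the restructuring cost is paid once per batch, which is where a batched strategy can save time on deletion-heavy workloads.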
Production-Grade Quality
- 86.1% test coverage overall, 77.8% for core package
- 8,000+ lines of professional integration tests
- 57 reference HDF5 files validated (across all format versions)
- Zero linter issues (34+ linters)
- Cross-platform - Linux, macOS, Windows
Comprehensive Documentation
- 2,700+ lines of detailed guides
- 4 working examples demonstrating different rebalancing strategies
- Complete API reference
- Architecture documentation following HDF5 C library patterns
Current Feature Support
Read Support (100% Complete)
- All HDF5 format versions (Superblock v0, v2, v3)

- All datatypes (integers, floats, strings, compounds, arrays, enums, references, opaque)
- All layouts (compact, contiguous, chunked)
- All storage types (compact, dense with fractal heap + B-tree v2)
- Compression (GZIP/Deflate)
- Object headers v1 (legacy HDF5 < 1.8) and v2 (modern HDF5 >= 1.8)
- Both traditional (symbol table) and modern (object header) groups
- Attributes (compact and dense storage)
Write Support (~90% Complete)
- Multiple format versions (Superblock v0 for legacy, v2 for modern)

- File creation (Truncate/Exclusive modes)
- Dataset writing (contiguous, chunked layouts, all datatypes)
- Group creation (symbol table, dense with automatic transition)
- Attribute writing (compact 0-7, dense 8+ with automatic transition)
- Attribute modification/deletion (complete lifecycle)
- Compression (GZIP/Deflate, Shuffle filter, Fletcher32 checksum)
- Advanced datatypes (arrays, enums, references, opaque)
- Dense storage Read-Modify-Write (full RMW cycle)
- Smart B-tree rebalancing with auto-tuning
- Legacy compatibility (Superblock v0 + Object Header v1)
- Free space management
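The automatic compact-to-dense transition for attributes mentioned in the list above can be pictured with a small sketch. It only encodes the "compact 0-7, dense 8+" split stated in this post; the `storageKind` type, the `maxCompactAttrs` constant, and the `storageFor` function are hypothetical names, not part of the library's API.

```go
package main

import "fmt"

// maxCompactAttrs is the largest attribute count kept in compact storage,
// taken from the "compact 0-7, dense 8+" split described in this post.
const maxCompactAttrs = 7

// storageKind names where an object's attributes live (hypothetical type).
type storageKind string

const (
	compact storageKind = "compact" // stored inline in the object header
	dense   storageKind = "dense"   // stored via fractal heap + B-tree v2
)

// storageFor returns the storage kind for a given number of attributes.
func storageFor(attrCount int) storageKind {
	if attrCount <= maxCompactAttrs {
		return compact
	}
	return dense
}

func main() {
	for _, n := range []int{0, 7, 8, 20} {
		fmt.Printf("%d attributes -> %s\n", n, storageFor(n))
	}
}
```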
Remaining Work (~10%)
- Soft/external links (hard links fully supported)
- Compound datatype writing (read works perfectly)
- Indirect blocks for fractal heap (direct blocks cover most use cases)
Why This Matters for the HDF Community
Ultra-Modern Foundation
Unlike implementations locked to specific HDF5 versions, this library was built with all format versions in mind from the start. When HDF5 2.0 arrives, support will be added through regular library updates (v1.x releases) without breaking existing code or requiring major API changes.
Performance Innovation
The Smart Rebalancing API is a unique feature not found in other HDF5 implementations. It addresses a common real-world problem: B-tree degradation under deletion-heavy workloads. Users can now:
- Let the library auto-detect their workload patterns
- Choose manual strategies for specific use cases
- Achieve 10-100x performance improvements for deletions
- Maintain compact, efficient file structures
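As a rough illustration of what workload auto-detection could look like, the sketch below picks a mode from the share of deletions in an observation window. The 50% cut-off, the `workloadStats` type, and the mode names are assumptions made for this example; the library's actual heuristics may differ.

```go
package main

import "fmt"

// mode is a hypothetical label for a rebalancing strategy.
type mode string

const (
	modeDefault mode = "default" // no rebalancing
	modeLazy    mode = "lazy"    // batch rebalancing for deletion-heavy work
)

// workloadStats counts operations seen in the current observation window.
type workloadStats struct {
	inserts int
	deletes int
}

// suggestMode returns modeLazy when deletions make up more than half of the
// observed operations (an assumed threshold), and modeDefault otherwise.
func (w workloadStats) suggestMode() mode {
	total := w.inserts + w.deletes
	if total == 0 {
		return modeDefault
	}
	if float64(w.deletes)/float64(total) > 0.5 {
		return modeLazy
	}
	return modeDefault
}

func main() {
	appendHeavy := workloadStats{inserts: 900, deletes: 100}
	deleteHeavy := workloadStats{inserts: 100, deletes: 900}
	fmt.Println("append-heavy:", appendHeavy.suggestMode())
	fmt.Println("delete-heavy:", deleteHeavy.suggestMode())
}
```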
Pure Go Benefits
No CGo dependencies means:
- Easy cross-compilation (compile for any platform from any platform)
- No C toolchain required (simpler deployment)
- Memory safety (Go’s garbage collector)
- Better debugging (no C/Go boundary issues)
- Easier maintenance (single language codebase)
Development Journey
The journey from concept to production took over a year - from studying the HDF5 format specification to implementing all the intricate details of the binary format for multiple format versions. Previous attempts using C bindings and other approaches were unsuccessful.
However, modern development technologies played a crucial role in the successful completion of this project. The combination of:
- Comprehensive HDF5 format specification (excellent documentation!)
- Reference C library implementation (invaluable reference)
- Modern development tools and practices
- AI-assisted development for testing and documentation
…made it possible to achieve what would otherwise have been extremely challenging, perhaps impossible, within a year. We’re grateful for the tools and technologies that made this possible.
Technical Implementation
Architecture
The implementation follows HDF5 C library patterns (H5Adense.c, H5Aint.c, H5Oattribute.c) while maintaining idiomatic Go code:
- Pure Go (no CGo dependencies)
- Version-aware parsing (automatic format detection)
- Buffer pooling for memory efficiency
- Context-rich error handling
- Signature-based format dispatch
- Table-driven tests with comprehensive scenarios
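Signature-based dispatch and version-aware parsing can be sketched as follows. The 8-byte HDF5 format signature and the superblock version byte that follows it come from the HDF5 file format specification; the function names are hypothetical, and the sketch only inspects a buffer that already starts at the superblock (a real reader also has to probe the alternative superblock offsets).

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// hdf5Signature is the 8-byte format signature that starts every superblock.
var hdf5Signature = []byte{0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n'}

// superblockVersion checks the signature and returns the version byte that
// immediately follows it. The function name is hypothetical.
func superblockVersion(header []byte) (uint8, error) {
	if len(header) < len(hdf5Signature)+1 {
		return 0, errors.New("header too short for an HDF5 superblock")
	}
	if !bytes.HasPrefix(header, hdf5Signature) {
		return 0, errors.New("missing HDF5 signature")
	}
	return header[len(hdf5Signature)], nil
}

// layoutFor dispatches on the version, mirroring the versions named in this
// post (v0, v2, v3). The returned labels are illustrative only.
func layoutFor(version uint8) (string, error) {
	switch version {
	case 0:
		return "legacy superblock", nil
	case 2, 3:
		return "modern superblock", nil
	default:
		return "", fmt.Errorf("unsupported superblock version %d", version)
	}
}

func main() {
	// Build a fake header: signature followed by version byte 2.
	header := append(append([]byte{}, hdf5Signature...), 2)
	v, err := superblockVersion(header)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	layout, err := layoutFor(v)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("superblock v%d: %s\n", v, layout)
}
```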
Format Version Support Strategy
// Read: the superblock version is detected automatically
// (errors are ignored here for brevity; check them in real code)
legacy, _ := hdf5.Open("legacy_v0.h5") // Superblock v0
modern, _ := hdf5.Open("modern_v2.h5") // Superblock v2
swmr, _ := hdf5.Open("swmr_v3.h5")     // Superblock v3
// Write: choose the format version to match compatibility needs
fwLegacy, _ := hdf5.CreateForWrite("legacy.h5", hdf5.CreateTruncate,
    hdf5.WithSuperblockVersion(0), // For HDF5 1.0-1.6 compatibility
)
fwModern, _ := hdf5.CreateForWrite("modern.h5", hdf5.CreateTruncate,
    hdf5.WithSuperblockVersion(2), // Modern format (default)
)
Validation
- Round-trip testing (Go write → C library read → verify)
- h5dump compatibility validation across all format versions
- Real HDF5 reference files from various sources (v0, v2, v3)
- Integration tests with actual file I/O
Example Usage
Smart Rebalancing (Auto-Tuning)
package main
import (
    "log"
    "github.com/scigolib/hdf5"
)
func main() {
    // Create a file with smart auto-tuning rebalancing
    fw, err := hdf5.CreateForWrite("data.h5", hdf5.CreateTruncate,
        hdf5.WithSmartRebalancing(
            hdf5.SmartAutoDetect(true), // Detect workload patterns
            hdf5.SmartAutoSwitch(true), // Switch modes automatically
        ),
    )
    if err != nil {
        log.Fatal(err)
    }
    defer fw.Close()
    // The library now tunes rebalancing to the observed workload
    // (the project reports 10-100x faster deletion-heavy operations)
}
Full Read-Modify-Write Cycle
// Create and write (assumes "log" and the hdf5 package are imported)
fw, err := hdf5.CreateForWrite("data.h5", hdf5.CreateTruncate)
if err != nil {
    log.Fatal(err)
}
fw.WriteAttribute("/", "version", "1.0")
fw.Close()
// Reopen and modify: the full RMW cycle
fw2, err := hdf5.OpenForWrite("data.h5", hdf5.OpenReadWrite)
if err != nil {
    log.Fatal(err)
}
fw2.ModifyDenseAttribute("/", "version", "2.0")
fw2.Close()
// Verify with h5dump: it shows the updated value ✓
Multi-Version Compatibility
// Read any version automatically
oldFile, _ := hdf5.Open("1999_hdf5_v0.h5") // Superblock v0
modernFile, _ := hdf5.Open("2015_hdf5_v2.h5") // Superblock v2
swmrFile, _ := hdf5.Open("2020_hdf5_v3.h5") // Superblock v3
// All work seamlessly! Format version auto-detected.
Resources
- GitHub Repository: github.com/scigolib/hdf5
- Full Announcement: 🎉 Major Milestone: Write Support Nearly Complete! (v0.11.3-beta + v0.11.4-beta) · scigolib/hdf5 · Discussion #4 · GitHub
- Community Discussion: https://www.reddit.com/r/SciGoLib/comments/1omevqt/hdf5_go_library_write_support_90_complete/
- Installation: go get github.com/scigolib/hdf5@v0.11.4-beta
- Documentation: GitHub - scigolib/hdf5: Modern Pure Go implementation of the HDF5 file format
Roadmap to v1.0.0
v0.11.4-beta (Current) - Write support ~90% complete
- All current HDF5 versions supported ✨
↓
v0.11.5-beta - Soft/external links, indirect blocks
↓
v0.12.0-rc.1 - FEATURE COMPLETE (API frozen)
↓
v0.12.x-rc.x - Community testing (2-3 months)
↓
v1.0.0 STABLE - Production release (Late 2026)
- ALL HDF5 formats supported (v0, v2, v3)
- Ready for HDF5 2.0 (future updates)
- Long-term API stability guarantee
Our v2.0.0 will only happen if we need to change the Go API - not because of HDF5 format changes. When HDF5 2.0 arrives, it will be supported in our v1.x updates!
The path to v1.0.0 is clear - remaining work is primarily “finishing touches” rather than fundamental features.
Community & Feedback
I’d greatly appreciate feedback from the HDF community:
- Are there specific use cases or edge cases we should prioritize?
- What additional validation scenarios would be valuable?
- Interest in contributing or testing with real-world files across different format versions?
- Thoughts on the multi-version support strategy?
The project welcomes contributions, bug reports, and suggestions. This is an active open-source project with regular releases and improvements.
Special Thanks
Professor Ancha Baranova - This project would not have been possible without her invaluable support and assistance throughout the development process.
The HDF Group - Thank you for the excellent format specification, comprehensive documentation, and the C library reference implementation that made this pure Go implementation possible. The clear specification for multiple format versions was essential for building a future-proof library!
Project Status: Beta - Production-ready for read operations, write support advancing rapidly
Latest Version: v0.11.4-beta (November 2, 2025)
License: MIT
Platform: Cross-platform (Linux, macOS, Windows)
Format Support: HDF5 Superblock v0, v2, v3 (all current versions + ready for HDF5 2.0)
Looking forward to feedback and collaboration with the HDF community!
