Hermes 1.0.0 has been released. Hermes is a multi-tiered I/O buffering platform which can be used to accelerate data access for large-scale scientific applications. This represents the first feature-complete release of Hermes.
For applications that produce data, Hermes intelligently makes initial data placement decisions using the Data Placement Engine. Hermes supports various data placement policies, each of which weighs hardware characteristics and application I/O patterns differently. For high-bandwidth checkpoint-restart workloads, for example, Hermes can place data in the fastest available tiers. Data can be placed locally on the node producing it, remotely, or both. After initial data placement, data can be reshuffled in the hierarchy using the buffer organizer (BORG) and the prefetcher. The BORG demotes data based on its observed access frequency and time of last access. For checkpoint-restart workloads, demoting data can free space in high-performing tiers to accelerate future checkpoints.
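To make the placement and demotion ideas above concrete, here is a minimal Python sketch. It is not the Hermes API: the `Tier` class, the `place` and `demote_coldest` functions, and the coldness score (access count divided by time since last access) are all hypothetical, chosen only to illustrate "fastest tier first" placement followed by demotion of rarely and not recently used data.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    """A hypothetical storage tier; not a Hermes type."""
    name: str
    capacity: int                               # bytes
    used: int = 0
    blobs: dict = field(default_factory=dict)   # blob name -> size in bytes

    def free(self):
        return self.capacity - self.used


def place(tiers, name, size):
    """Place a blob in the fastest tier with room (tiers are ordered fastest first)."""
    for tier in tiers:
        if tier.free() >= size:
            tier.blobs[name] = size
            tier.used += size
            return tier.name
    raise RuntimeError("no tier has enough free space")


def demote_coldest(tiers, src_index, stats, now):
    """Move the coldest blob in tiers[src_index] to the next slower tier with room.

    stats maps blob name -> (access_count, last_access_time). The score below is
    an assumed heuristic: few accesses and an old last access mean "cold".
    """
    src = tiers[src_index]
    if not src.blobs:
        return None

    def score(blob):
        count, last = stats.get(blob, (0, 0.0))
        return count / (1.0 + (now - last))

    victim = min(src.blobs, key=score)
    size = src.blobs[victim]
    for dst in tiers[src_index + 1:]:
        if dst.free() >= size:
            del src.blobs[victim]
            src.used -= size
            dst.blobs[victim] = size
            dst.used += size
            return victim, dst.name
    return None  # no slower tier had room


# Usage: a small RAM tier in front of a larger NVMe tier.
tiers = [Tier("ram", 100), Tier("nvme", 1000)]
first = place(tiers, "ckpt0", 60)     # fits in RAM
second = place(tiers, "ckpt1", 60)    # RAM has only 40 bytes left, so it spills
moved = demote_coldest(tiers, 0, {"ckpt0": (1, 0.0)}, now=10.0)
third = place(tiers, "ckpt2", 60)     # RAM has room again after the demotion
```

In this toy run, demoting the old checkpoint is what lets the next checkpoint land in the fast tier, which mirrors the checkpoint-restart scenario described above.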
For workloads that read data, Hermes can accelerate I/O through prefetching and data staging. Hermes provides a policy-based Prefetcher component that promotes data expected to be accessed in the near future. Prefetchers are policy-based so that they can represent diverse application behaviors. We currently provide a prefetcher tailored for deterministic I/O workloads, which are fairly common. Deep learning applications, for example, use random seeds that can make I/O behavior completely reproducible. Many scientific analysis codes predictably read a batch of data and then perform analysis. For these cases, Hermes comes equipped with an Apriori Prefetcher, which parses and executes a user-defined schema file indicating when and where to prefetch data. In addition to prefetching, Hermes also provides data staging, which can import entire datasets from services external to Hermes (e.g., a PFS) and place them in the hierarchy for analysis.
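The idea of a schema-driven prefetcher can be sketched in a few lines of Python. The schema format below (a JSON list of rules with `after_op`, `blob`, and `tier` keys) is an assumption made for illustration and is not the actual Hermes schema; `load_schema`, `run`, and the `promote` callback are likewise hypothetical names. The sketch only shows the general pattern: parse the rules up front, then trigger promotions at known points in a deterministic I/O sequence.

```python
import json

# Hypothetical schema: after I/O operation N completes, promote a blob to a tier.
SCHEMA = """
[
  {"after_op": 0, "blob": "batch_1", "tier": "ram"},
  {"after_op": 1, "blob": "batch_2", "tier": "ram"},
  {"after_op": 1, "blob": "batch_3", "tier": "nvme"}
]
"""


def load_schema(text):
    """Parse the schema into a map: operation index -> list of (blob, tier)."""
    rules = {}
    for entry in json.loads(text):
        rules.setdefault(entry["after_op"], []).append((entry["blob"], entry["tier"]))
    return rules


def run(rules, num_ops, promote):
    """Walk a deterministic sequence of operations and fire matching prefetch rules."""
    for op in range(num_ops):
        # The application's own read for step `op` would happen here.
        for blob, tier in rules.get(op, []):
            promote(blob, tier)


# Usage: record the promotions instead of moving real data.
log = []
run(load_schema(SCHEMA), num_ops=3, promote=lambda blob, tier: log.append((blob, tier)))
```

Because the access pattern is known in advance, the prefetcher needs no runtime prediction; it simply replays the schema alongside the application.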
We have evaluated Hermes 1.0.0 under various benchmarks and real applications. The Gray-Scott model, for example, is a reaction-diffusion code that simulates the chemical reaction between two substances diffusing over time. We found that Hermes can improve I/O performance by 3x by intelligently buffering data in faster tiers and asynchronously flushing during checkpoints. A detailed summary of our benchmarks is located here.
We would like to thank the NSF for supporting our research. We invite the community to try Hermes and contribute. We would love to hear about use cases, desired features, and any improvements that we can make.