Performance & Cost benchmarks on AWS?

Hello all,
I am new to HSDS, but since it has been available for a few years now, I am wondering whether any benchmarks have been published for AWS use.
I have seen the "Local HSDS performance vs local HDF5 files" post on the HDF Forum (hdfgroup.org); however, I was hoping for a comparison of traditional h5py read times on AWS EC2/EBS against HSDS on AWS EC2/S3.

In addition to speed, it would be nice to see a comparison of the AWS service costs for the two scenarios.

Hi, @keith.knuth !

Although the following video does not discuss HSDS directly, you may find it quite useful because I mostly talk about “Cloud-nomics” by comparing EC2 instances, S3/EBS/EFS, and, most importantly, cost ($). You can jump to 6m and watch until 16m 30s for real use cases that I experienced.

Thank you for that link Joe. Your research was helpful.
However, I am still looking for feedback from real-world HSDS users who moved from reading HDF5 directly on EC2/EBS to HSDS using Lambda/S3. What did you see in terms of performance decrease?

Hi Keith,

It’s hard to give any general answers. A lot depends on your specific application and how the data is structured. In this week’s Call the Doctor session I showed an example where HSDS performed much better than the HDF5 library, but that was a scenario that took particular advantage of HSDS’ parallelism. On the other hand, each HSDS request involves more latency than the corresponding HDF5 library call, so that can hinder performance.

Besides performance and cost (yes, these are important), there can be other reasons to look into HSDS:

  1. Data is automatically replicated on S3
  2. HSDS can be accessible from anywhere
  3. HSDS enables multiple writer/multiple reader applications
  4. A software crash will never leave the data in an unreadable state

Anyway, since the REST VOL (for C/C++) and h5pyd (for Python) use the same APIs as the HDF5 library and h5py, it should be fairly easy to test it out with your current application. I’d be interested to hear what the results are.
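
If it helps, here is a rough sketch of how such a side-by-side test could be set up in Python. It is not an official benchmark, just a small timing helper that relies on h5py.File and h5pyd.File both being opened the same way and indexed the same way. The helper name `time_reads`, the file path, the domain path, and the dataset name are all made up for illustration, so substitute your own.

```python
import statistics
import time


def time_reads(open_file, path, dataset, selection, repeats=5):
    """Return the median wall-clock seconds for reading `selection` from `dataset`.

    `open_file` is any callable with the h5py-style signature
    open_file(path, "r") that returns a context manager supporting
    f[dataset][selection], for example h5py.File or h5pyd.File.
    The timing includes opening and closing the file on every repeat.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        with open_file(path, "r") as f:
            _ = f[dataset][selection]  # force the actual read
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Hypothetical usage (paths and dataset name are placeholders):
#
#   import h5py
#   import h5pyd
#
#   sel = slice(0, 1000)
#   local = time_reads(h5py.File, "/data/example.h5", "temperature", sel)
#   remote = time_reads(h5pyd.File, "/home/user/example.h5", "temperature", sel)
#   print(f"h5py: {local:.3f}s  h5pyd: {remote:.3f}s")
```

Running it with a few different selections (small slices, large contiguous reads, strided reads) should give a feel for where the per-request latency dominates and where it does not. Repeated reads may be served from a cache on either side, so the first timing can differ noticeably from the rest.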