Great! I can see the dataset on HDF Lab now.
I notice the chunk layout is (97657, 4). If you are always going to be reading entire rows, I’d suggest something like (500, 1000). Basically, for the type of selections you are doing, you want the number of chunks hit per selection to be relatively small (but one chunk per selection is not optimal either - you want enough chunks that the HSDS parallelism has a chance to work).
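To make the chunk-count reasoning concrete, here is a small sketch (stdlib only). The column count of 4000 is a hypothetical shape just for illustration - substitute your dataset's actual dimensions:

```python
import math

def chunks_hit_by_row(shape, chunks):
    """Number of chunks a full-row selection dset[i, :] intersects.

    A single row falls in exactly one chunk along axis 0, so the count
    is driven entirely by how many chunks tile axis 1.
    """
    return math.ceil(shape[1] / chunks[1])

# hypothetical dataset shape for illustration
shape = (97657, 4000)

# current layout: tall, narrow chunks - a row read touches many chunks
print(chunks_hit_by_row(shape, (97657, 4)))    # 1000 chunks per row read

# suggested layout: a row read touches only a handful of chunks
print(chunks_hit_by_row(shape, (500, 1000)))   # 4 chunks per row read
```

A handful of chunks per selection is the sweet spot: each chunk request can be serviced by a different HSDS node in parallel, without the overhead of fetching hundreds of chunks per read.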
I don’t think 2D vs. 3D will really matter in the end - do whatever best fits your application.
If the “hot spot” of data accessed is reasonably sized (say, less than the total amount of memory available in the cluster), HDD vs. SSD vs. S3 shouldn’t matter once the chunk cache is warmed up. At that point all the hot data can live in RAM, and access should be much faster.
BTW, I’ll be hosting “Call the Doctor” on Sept 13 (Call the Doctor - Weekly HDF Clinic - #31 by lori.cooper). If that fits your schedule, it would be great to have you register and we can have an informal discussion (and perhaps other participants will have some ideas as well).