Hi,
For a project I need both a Python and a Fortran API for reading datasets written with h5py. For the Python API, I create an array:
# Approximate sizes: Nstations=200, Ncomp=3, Ndisp=3, Ncoor=~1000, Ntime=3600
dim = (Nstations, Ncomp, Ndisp, Ncoor, Ntime)
ds.create_dataset("displacement", dim, dtype="f")
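Then I populate the stations iteratively like so:
for i, station in enumerate(stations):
    # stationdisplacement holds the (Ncomp, Ndisp, Ncoor, Ntime) array for this station
    ds['displacement'][i, :, :, :, :] = stationdisplacement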
This works totally fine within Python for writing and reading, and performance is in fact much better than I expected. But since Fortran is column-major, I tried to transpose the arrays and write them like so:
dim = (Nstations, Ncomp, Ndisp, Ncoor, Ntime)
# Transpose the dimensions and the station-wise array
if fortran:
    dim = dim[::-1]
    stationdisplacement = stationdisplacement.transpose((3, 2, 1, 0))
# Create the full dataset:
ds.create_dataset("displacement", dim, dtype="f")
for i, station in enumerate(stations):
    # The station index is now the last (fastest-varying) dimension
    ds['displacement'][:, :, :, :, i] = stationdisplacement
The real problem is writing along the last index, which is slow. I assume this has to do with how HDF5 traverses the disk when writing (I tested putting the station index first).
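To make that assumption concrete, here is a minimal sketch in plain Python (no h5py involved). It assumes the dataset is stored contiguously in row-major order, with no chunking. The helper name slab_runs is made up for illustration. It counts how many separate contiguous pieces one station occupies when its index is fixed along a given axis:

```python
from math import prod

def slab_runs(shape, axis):
    """For a row-major (C-order) array of the given shape, describe the
    slab obtained by fixing one index along `axis`.

    Returns (number_of_contiguous_runs, elements_per_run).
    Hypothetical helper; assumes a contiguous, unchunked layout.
    """
    run_length = prod(shape[axis + 1:])  # elements that stay adjacent
    n_runs = prod(shape[:axis])          # separate pieces to visit
    return n_runs, run_length

# Approximate sizes quoted in the post
dim = (200, 3, 3, 1000, 3600)

# Station index first: one station is a single contiguous block
print(slab_runs(dim, axis=0))

# Station index last (reversed dims): one station is scattered
print(slab_runs(dim[::-1], axis=len(dim) - 1))
```

Under these assumptions, a station written by the first index is one block of about 32 million values, while a station written by the last index is about 32 million isolated values, each separated by Nstations elements.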
Is there a way to optimize this? Or is there a way in the Python API to write in Fortran order, or in the Fortran API to read in C order?
Preferably, I would like to avoid transposing on the Fortran side, because that would require transposing the entire array with all stations.
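The layout relationship behind this question can be sketched in plain Python. This is only index arithmetic, not h5py or HDF5 code, and the function names c_offset and f_offset are made up for illustration. With 0-based indices, an element of a row-major array sits at the same linear offset as the element with reversed indices in a column-major array of reversed shape:

```python
def c_offset(index, shape):
    """Linear offset of `index` in a row-major (C-order) array."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset

def f_offset(index, shape):
    """Linear offset of `index` in a column-major (Fortran-order) array."""
    offset = 0
    for i, n in zip(reversed(index), reversed(shape)):
        offset = offset * n + i
    return offset

# Small stand-in for (Nstations, ..., Ntime); indices are 0-based here
shape = (4, 3, 2)
index = (2, 1, 0)

# Same memory location, seen with reversed shape and reversed index
same = c_offset(index, shape) == f_offset(index[::-1], shape[::-1])
print(same)
```

In other words, the same bytes can be described either as a C-order array of shape (Nstations, Ncomp, Ndisp, Ncoor, Ntime) or as a Fortran-order array of shape (Ntime, Ncoor, Ndisp, Ncomp, Nstations), without moving any data. Whether the two APIs can be made to agree on that view is exactly what I am asking.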
Thank you!
Cheers,
Lucas