Hello all,
I have been using HDF5 to process and analyze large datasets, but I have run into problems importing very large files. In my experience, once a dataset exceeds a certain size, loading it either crashes, takes an extremely long time, or sometimes does not complete at all.
So far I have done the following:
I have tried importing the files with both h5py and the HDF5 tools (roughly as in the sketch after this list).
I have checked that the files are valid and consistent with the HDF5 format specification.
My machine has plenty of RAM, and I have confirmed there is adequate free disk space.
I also came across this thread: https://forum.hdfgroup.org/t/recovering-dataset-cissp-training-in-hdf5-file-deleted-with-h5py/4349 but I am still facing issues.
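For reference, this is roughly how I am loading the data with h5py (the file name "large_data.h5" and dataset name "data" below are just placeholders for my actual ones):

```python
import h5py

FILENAME = "large_data.h5"  # placeholder for my actual file
DATASET = "data"            # placeholder for my actual dataset name

with h5py.File(FILENAME, "r") as f:
    dset = f[DATASET]
    print("shape:", dset.shape, "dtype:", dset.dtype)
    # Reading with [...] pulls the entire dataset into memory at once;
    # this is the step that crashes or hangs on the really large files.
    data = dset[...]
```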
I'm wondering whether this is a memory-management issue, or whether there is a more efficient way to handle very large files with HDF5. Could it be an issue with chunking or compression? Any suggestions for optimizing read/write performance on large HDF5 datasets would also be much appreciated.
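In case it helps, this is the kind of piecewise, chunk-by-chunk read I have been considering instead of loading everything in one go (again with placeholder names); I am not sure whether this is the recommended pattern:

```python
import h5py
import numpy as np

with h5py.File("large_data.h5", "r") as f:  # placeholder file name
    dset = f["data"]                        # placeholder dataset name
    # Use the dataset's own chunk length along axis 0 as the block size,
    # falling back to 1024 rows if the dataset is contiguous (unchunked).
    block = dset.chunks[0] if dset.chunks else 1024
    total = 0.0
    for start in range(0, dset.shape[0], block):
        slab = dset[start:start + block]    # only one slab in memory at a time
        total += float(np.sum(slab, dtype=np.float64))
    print("running sum over the whole dataset:", total)
```

Would processing the data in slabs like this, together with an appropriate chunk shape and compression filter when the file is written, be the right direction?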
Has anyone else had similar experiences or insight on this?
Thank you for any advice!
Regards,
Megancharlotte