reading many hyperslabs of varying size in parallel / h5repack question

I need to read a large number of hyperslabs from a large dataset in parallel. The dataset is compressed with the SZIP filter. The hyperslabs are non-overlapping and cover only a small part of the total dataset. The number of hyperslabs per processor and the size of each hyperslab vary (although the total amount of data to be read by each processor should be roughly equal). What is the best way to do this? Right now I build one big combined selection by having each processor loop over the blocks it needs to read and call:

  H5Sselect_hyperslab(selection, H5S_SELECT_OR, start, NULL, count, NULL);

Then I read the selection with the H5FD_MPIO_COLLECTIVE transfer property enabled. This seems very slow, though. Would it be better to match up the blocks on each processor by size and read only one block per processor at a time? Or to use a different transfer setting? Or to disable compression?
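
For reference, here is a minimal sketch of what I'm doing (not my actual code: it assumes a 2-D dataset of doubles, and the per-rank block lists starts[i] / counts[i] are placeholder names). Each rank ORs its blocks into one file-space selection and then issues a single collective H5Dread:

  /* Minimal sketch: 2-D dataset of doubles, placeholder block lists
     starts[i] / counts[i] for the blocks owned by this rank. */
  #include <hdf5.h>
  #include <stdlib.h>

  double *read_my_blocks(hid_t dset, int nblocks,
                         hsize_t starts[][2], hsize_t counts[][2])
  {
    hid_t fspace = H5Dget_space(dset);

    /* Start from an empty selection, then OR in every block this rank owns. */
    H5Sselect_none(fspace);
    for (int i = 0; i < nblocks; i++)
      H5Sselect_hyperslab(fspace, H5S_SELECT_OR,
                          starts[i], NULL, counts[i], NULL);

    /* Flat memory dataspace with exactly as many elements as are selected
       in the file; the selected values arrive in row-major file order. */
    hssize_t npts = H5Sget_select_npoints(fspace);
    hsize_t  dim  = (hsize_t)npts;
    hid_t  mspace = H5Screate_simple(1, &dim, NULL);
    double *buf   = malloc((size_t)npts * sizeof(double));

    /* Collective transfer, as described above. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

    H5Dread(dset, H5T_NATIVE_DOUBLE, mspace, fspace, dxpl, buf);

    H5Pclose(dxpl);
    H5Sclose(mspace);
    H5Sclose(fspace);
    return buf;  /* caller frees */
  }

The H5Sselect_none call matters: OR-ing into the default "all" selection would leave the whole dataset selected.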

BTW, I noticed that running h5repack without any extra arguments really helps performance. However, h5repack itself is incredibly slow: it took almost a week to defragment a 60 GB file. The top command says h5repack is using only 5% CPU, but the load is still at 100%, which probably means it is spending most of its time in system calls (seeking all over the file?). Is this “normal” behavior?


--
Mark

Re. reading hyperslabs: this is only slow if I use 1 core. When using 2 to 8 cores (on an 8-core machine), I seem to get super-linear speedup and a runtime that is roughly in the right ballpark (I have a previous, non-HDF5 implementation of this code as a point of comparison). This is on an Ubuntu Linux machine with the Intel compiler and the MPICH-shmem implementation. Perhaps this is an odd caching issue. I tried this on another dual quad-core Linux box with OpenMPI and I don't observe this behavior there. Disabling SZIP compression turns out to be a bad idea, BTW.


--
Mark