Hi, I'm running into slow performance when selecting many
(>1000) non-consecutive rows from a 2-dimensional matrix, typically
~500,000 x 100. The bottleneck is the for loop where each row vector index
is OR'ed into the hyperslab selection, i.e.:
LOG4CXX_INFO(logger, "TIME begin hyperslab building"); // print out with time stamp

// select file buffer hyperslabs
H5Sselect_hyperslab(fileSpaceId, H5S_SELECT_SET, (const hsize_t*) fileOffset,
                    NULL, (const hsize_t*) fileBlockCount, selectionDims);
for (hsize_t id = 1; id < numVecsToRead; ++id) {
    LOG4CXX_INFO(logger, id << "/" << numVecsToRead);
    fileOffset[0] = fileLocs1Dim[id];
    H5Sselect_hyperslab(fileSpaceId, H5S_SELECT_OR, (const hsize_t*) fileOffset,
                        NULL, (const hsize_t*) fileBlockCount, selectionDims);
}
LOG4CXX_INFO(logger, "TIME end hyperslab building");
One interesting thing is that the time per iteration grows as the loop
proceeds, e.g. no noticeable time between iterations 1-2-3-4-5, but seconds
between 1000-1001-1002. So the time to build the selection is worse than
linear in the number of rows, and can become amazingly time consuming, e.g.
>10 minutes (!) for a few thousand rows. The read itself is very quick.
My current workaround is to check whether the number of vectors to select
exceeds a heuristically determined threshold, above which reading the entire
file (half a million row vectors) and copying out the requested vectors takes
less time than running the hyperslab selection. Generally the threshold works
out to ~500 vectors, which corresponds to about 0.5 seconds.
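To make the workaround concrete, here is a minimal sketch of the decision and the copy step. It is not my actual code: the names kReadAllThreshold, shouldReadAll and copyRows are made up for illustration, the element type is assumed to be float, and the full dataset is assumed to already sit in a row-major buffer after one plain read.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical threshold: the ~500-vector crossover measured above.
constexpr std::size_t kReadAllThreshold = 500;

// True when reading everything and copying is expected to beat
// building the hyperslab selection row by row.
inline bool shouldReadAll(std::size_t numVecsToRead) {
    return numVecsToRead > kReadAllThreshold;
}

// Copy the requested rows out of a row-major buffer holding the whole
// dataset. 'all' is assumed to contain (number of rows) * rowLen floats,
// and every entry of 'rows' is assumed to be a valid row index.
std::vector<float> copyRows(const std::vector<float>& all,
                            std::size_t rowLen,
                            const std::vector<std::size_t>& rows) {
    std::vector<float> out(rows.size() * rowLen);
    for (std::size_t i = 0; i < rows.size(); ++i) {
        // Source row rows[i] goes to output row i, preserving request order.
        std::copy_n(all.begin() + rows[i] * rowLen, rowLen,
                    out.begin() + i * rowLen);
    }
    return out;
}
```

The copy itself is cheap; the cost of this path is holding the whole dataset in memory for the duration of the call.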
While poking around the code, I found a similar function,
H5Scombine_hyperslab(), that is only compiled if NEW_HYPERSLAB_API is
defined. Using this significantly reduced the selection time; in
particular, the time for each OR operation seemed constant, so 2000 vectors
took twice as long as 1000, not many times as long as with
H5Sselect_hyperslab(). However, it still takes tens of seconds to select a
few thousand vectors, so it's still much quicker to read all and copy
(~1/2 second).
Reading all and copying is not an ideal solution, as it requires allocating
and freeing ~250MB unnecessarily. If I use H5Scombine_hyperslab(), the
crossover number goes up, i.e. to more than 500, so the read-all fallback is
less likely to be needed. I'm a bit nervous, however, about using this
undocumented code.
So...am I doing something wrong? Is there a speedy way to select a
hyperslab consisting of 100s or 1000s of non-consecutive vectors?
Is NEW_HYPERSLAB_API safe?
Thanks,
Ken