Severe slowdown when using chunks

Hi,

I noticed a serious issue this morning while tracking down some
performance problems in my code, which might affect other people here.
It seems that writing performance deteriorates significantly (roughly
a factor of 6) when the following conditions are met:

(1) the dataset is stored in a chunked layout, and
(2) the memory dataspace and file dataspace have different ranks.

I'm not aware of any warnings in the end-user documentation about
this. My application frequently performs writes from a single buffer
(say 10000 floats) to one "row" of a dataset (e.g. of dataspace (500,
10000)). If, in chunked mode, I use a rank-2 memory dataspace with
shape (1, 10000), it takes about 6 seconds to write, but if I use a
rank-1 memory dataspace with shape (10000), it takes 34 seconds (!).
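As a workaround, the memory shape can be padded with leading unit dimensions so its rank matches the file dataspace. The following is a minimal sketch of that idea in plain C, not part of the attached test program: the helper name `pad_shape_to_rank` is hypothetical, and `unsigned long long` is used as a stand-in for HDF5's `hsize_t` so the snippet compiles without the library.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not an HDF5 function).
 * Copies a buffer shape of rank `mem_rank` into `out`, prepending
 * dimensions of extent 1 until the result has `file_rank` entries.
 * `out` must have room for `file_rank` values.
 * Returns 0 on success, or -1 if an argument is NULL or the buffer
 * already has a higher rank than the file dataspace.
 */
int pad_shape_to_rank(const unsigned long long *mem_dims, size_t mem_rank,
                      size_t file_rank, unsigned long long *out)
{
    if (mem_dims == NULL || out == NULL || mem_rank > file_rank)
        return -1;

    size_t pad = file_rank - mem_rank;

    /* Leading unit dimensions do not change the element order. */
    for (size_t i = 0; i < pad; i++)
        out[i] = 1;

    /* The original extents follow unchanged. */
    for (size_t i = 0; i < mem_rank; i++)
        out[pad + i] = mem_dims[i];

    return 0;
}
```

For the row write described above, a buffer shape of (10000) and a file rank of 2 yield (1, 10000), which could then be passed as the dims argument of H5Screate_simple(2, dims, NULL) to build the memory dataspace. In my timings that rank-2 shape was the fast case.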

I've verified this behavior occurs with HDF5 1.6.6 and 1.8.3. An
example C file is attached.

Andrew Collette

ctest.c (1.34 KB)

Hi Andrew,

On Jun 9, 2009, at 4:53 PM, Andrew Collette wrote:

[...]

Unfortunately, this is because the HDF5 library's "selection shape
comparison" routine isn't smart enough to detect this situation and
decide that the shapes are actually the same. We've just started some
work on improving the comparison routine, though, so hopefully future
releases will see the same level of performance for both of these
cases.
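To illustrate what treating the two shapes as "the same" could mean, here is a minimal sketch of a rank-tolerant comparison. It is purely illustrative and is not the library's actual routine: the function names are hypothetical, `unsigned long long` stands in for `hsize_t`, and the rule it applies (ignore leading dimensions of extent 1) is an assumption chosen to cover the (1, 10000) versus (10000) case from the report.

```c
#include <stddef.h>

/* Return the index of the first dimension whose extent is not 1. */
static size_t skip_leading_ones(const unsigned long long *dims, size_t rank)
{
    size_t i = 0;
    while (i < rank && dims[i] == 1)
        i++;
    return i;
}

/*
 * Hypothetical comparison: returns 1 if the two shapes are equal once
 * leading dimensions of extent 1 are ignored, and 0 otherwise.
 */
int shapes_equivalent(const unsigned long long *a, size_t rank_a,
                      const unsigned long long *b, size_t rank_b)
{
    size_t ia = skip_leading_ones(a, rank_a);
    size_t ib = skip_leading_ones(b, rank_b);

    /* The remaining ranks must agree. */
    if (rank_a - ia != rank_b - ib)
        return 0;

    /* The remaining extents must agree one by one. */
    for (; ia < rank_a; ia++, ib++) {
        if (a[ia] != b[ib])
            return 0;
    }
    return 1;
}
```

Under this rule a (1, 10000) selection and a (10000) selection compare as equivalent, while (500, 10000) and (10000) do not.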

  Quincey