Making my data set chunked (instead of contiguous) has slowed down writing massively!!
When defining the dataspace for the memory buffer being written, make sure its rank matches that of the target storage array, even if that means having a depth of 1 in a particular direction.
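A minimal sketch of that advice, matching the 128 x 128 x 1 frames discussed below (variable names are illustrative):

    /* Memory data space with rank 3 to match the file data set,
     * with a depth of 1 in the frame direction, instead of a
     * rank-2 128 x 128 data space. */
    hsize_t mem_dims[3] = {128, 128, 1};
    hid_t mem_space = H5Screate_simple(3, mem_dims, NULL);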
Furthermore, aim for a chunk size of approximately 1 MiB to match the default chunk cache size of 1 MiB. If it is expedient to use larger chunks, the cache size should be altered to accommodate that.
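A hedged sketch of altering the cache via H5Pset_chunk_cache on a dataset access property list (the 4 MiB size, the slot count, and the data set name are illustrative, not from the original thread):

    hid_t dapl = H5Pcreate(H5P_DATASET_ACCESS);
    /* nslots should be a prime number roughly 100x the number of
     * chunks that fit in the cache; 12421 follows the example in
     * the reference manual. */
    H5Pset_chunk_cache(dapl, 12421, 4 * 1024 * 1024, H5D_CHUNK_CACHE_W0_DEFAULT);
    hid_t dset = H5Dopen2(file_id, "frames", dapl);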
Example: chunky.cpp (2.4 KB)
(From the final post, courtesy of @gheber.)
I have a chunked data set with dims 128 x 128 x 4; the elements are uchar.
I would like to write a series of 128x128 images into the data set.
Initially I used a contiguous data set and wrote only 4 images, which yielded a write speed of approximately 200 µs per image.
I wish to record a variable number of frames, so I need to make the data set extendable. As part of that I need to make the data set chunked. The chunks are 128 x 128 x 1.
This is the only change I make to my code to enable chunking:
    // dcpl_id is the Dataset Creation Property List, used later by H5Dcreate2 to create the dataset.
    dcpl_id = H5Pcreate(H5P_DATASET_CREATE);
    ret = H5Pset_chunk(dcpl_id, RANK, chunk_dims);
    assert(ret >= 0);
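For context, a sketch (not the poster's exact code) of how such a dcpl is used to create an extendable chunked data set, assuming RANK is 3, file_id is an open file, and the frame axis is the last dimension:

    hsize_t dims[3]     = {128, 128, 4};
    hsize_t max_dims[3] = {128, 128, H5S_UNLIMITED}; /* extendable along the frame axis */
    hid_t file_space = H5Screate_simple(3, dims, max_dims);
    hid_t dset = H5Dcreate2(file_id, "frames", H5T_NATIVE_UCHAR,
                            file_space, H5P_DEFAULT, dcpl_id, H5P_DEFAULT);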
Here is my problem:
With this chunking, the write speed drops to approximately 2500 µs per image.
Note: I am not extending the data set; I am still only filling an existing data set.
Here is where it gets weird:
When I use the H5S_ALL flag to define the memory data space for H5Dwrite, the speed returns, even improving slightly to approximately 150 µs.
Defining the memory data space using H5S_ALL uses the data set's data space and hyperslab for the memory data space. This seems to make it sufficiently fast, but obviously this is impractical when the planned data set is larger than memory (TB rather than GB).
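For clarity, a sketch of the two call patterns being compared (variable names are illustrative). Per the H5Dwrite documentation, passing H5S_ALL as the memory data space makes the library reuse the file data space and its hyperslab selection to describe the memory buffer:

    H5Dwrite(dset, H5T_NATIVE_UCHAR, H5S_ALL,   file_space, H5P_DEFAULT, frame); /* ~150 us  */
    H5Dwrite(dset, H5T_NATIVE_UCHAR, mem_space, file_space, H5P_DEFAULT, frame); /* ~2500 us */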
I have tried:
Defining the memory data space hyperslab using the count parameter or using the block parameter (no change detected between the two; see the sketch after this list).
I have tried not defining a hyperslab within the memory data space. The memory data space already defines the entire memory buffer I wish to write:
- 2500 µs becomes 2300 µs per image, so only an incremental improvement.
- This is what I was doing when using the C++ high-level API.
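The sketch referenced in the first item above: two equivalent ways to select one 128 x 128 x 1 frame in the file data space (frame_index and file_space are illustrative names):

    hsize_t start[3] = {0, 0, frame_index};
    /* via count: select 128 x 128 x 1 individual elements */
    hsize_t count[3] = {128, 128, 1};
    H5Sselect_hyperslab(file_space, H5S_SELECT_SET, start, NULL, count, NULL);
    /* via block: select a single 128 x 128 x 1 block */
    hsize_t one[3]   = {1, 1, 1};
    hsize_t block[3] = {128, 128, 1};
    H5Sselect_hyperslab(file_space, H5S_SELECT_SET, start, NULL, one, block);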
This is made more confusing because I have this working using the C++ high-level API, where I am able to attain speeds of 150-200 µs per image.
What is the C++ API doing that I am not?? (probably quite a lot…)
I am using the default property lists for everything apart from defining the chunk size.
Is there some property which could potentially be changed to restore my desired speeds?
In particular, I have modified the data set creation property list to set the storage space allocation time to early.
This reduces the write time, but only slightly, to approximately 2200 µs per frame…
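For reference, a sketch of that modification, assuming the same dcpl_id as above:

    /* Allocate storage for the full data set at creation time
     * rather than incrementally as chunks are written. */
    H5Pset_alloc_time(dcpl_id, H5D_ALLOC_TIME_EARLY);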