Hi Martin,
Have you set the chunk cache sufficiently large? Otherwise HDF5 will
reread the same chunks again and again. Although the system file cache
might hold all those data, I think it's better to size the chunk cache
correctly because of the lookups HDF5 is doing.
E.g. in the case of (*,y,*,*) you'll need a cache of 601*8*61*1501
floats (1.64 GB). I assume you have sufficient memory; otherwise you
could adjust the chunk size, especially in z and w.
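As a quick check of that figure (plain-Python arithmetic using the dimensions from this thread, assuming 4-byte single-precision floats and 8x8x8x8 chunks):

```python
# Chunk cache needed so a (*, y, *, *) slice never rereads a chunk:
# every chunk whose y-range contains the requested y is touched, i.e.
# all of x, one chunk width of y, and all of z and w.
dims  = (601, 482, 61, 1501)   # (x, y, z, w)
chunk = (8, 8, 8, 8)

elements    = dims[0] * chunk[1] * dims[2] * dims[3]
cache_bytes = elements * 4     # assuming 4-byte floats
print(cache_bytes)             # 1760901152 bytes, i.e. ~1.64 GiB
```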
Your chunks are not particularly large (16384 bytes), leading to a lot
of I/O operations and a large B-tree to index the chunks. On the other
hand, when enlarging the chunks, you'll need more memory for the chunk
cache.
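To put numbers on that (a rough count, assuming the 8x8x8x8 chunks and 4-byte floats discussed in this thread):

```python
import math

dims  = (601, 482, 61, 1501)   # (x, y, z, w)
chunk = (8, 8, 8, 8)

chunk_bytes = math.prod(chunk) * 4   # 8*8*8*8 floats = 16384 bytes per chunk
n_chunks = math.prod(math.ceil(d / c) for d, c in zip(dims, chunk))
print(chunk_bytes, n_chunks)         # 16384 bytes, 6972544 chunks to index
```

Nearly seven million 16 KiB chunks is a lot of B-tree entries and a lot of small reads.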
What is the pattern when accessing the data as (*,*,z,w)? First w, and
thereafter all z? You'll need a much smaller cache when accessing it
like

  for w in 0:nw/ncw      (nw is the length of the w-axis; ncw is the chunk size in w)
    for z in 0:nz/ncz
      for w1 in 0:ncw
        for z1 in 0:ncz

In this way you handle a full z,w chunk before moving to the next one,
so your cache size needs to be only 601*482*8*8 floats (about 74 MB).
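A runnable sketch of that traversal order (plain Python with the z/w extents from this thread; the actual HDF5 reads are left out):

```python
nz, nw   = 61, 1501    # extents of the z- and w-axes
ncz, ncw = 8, 8        # chunk extents in z and w

visited = []
for wc in range(0, nw, ncw):                        # loop over w-chunks
    for zc in range(0, nz, ncz):                    # then over z-chunks
        for w1 in range(wc, min(wc + ncw, nw)):     # positions inside the chunk
            for z1 in range(zc, min(zc + ncz, nz)):
                visited.append((z1, w1))            # read slice (*, *, z1, w1) here

# Every (z, w) position is visited exactly once, and all positions of one
# (z, w) chunk are finished before the next chunk is touched, so each
# chunk only has to stay in the cache while it is being consumed.
print(len(visited))    # 91561 == 61 * 1501
```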
I have a program for testing 3-D data sets of arbitrary size and chunk
size, using a cache size that depends on the chunk size and access
pattern. If you like, I can send it.
Cheers,
Ger
Matthieu Brucher <matthieu.brucher@gmail.com> 6/12/2014 10:56 PM
Hi,
Unfortunately, this is indeed the worst case you can have. It's
completely normal that you get the worst performance when slicing in
these dimensions. Even with a parallel filesystem, you would need to
read EVERYTHING from the dataset, and then the library would pick out
the pieces you need.
One solution would be to agglomerate several z,w values per chunk in
those two dimensions, so that you still get some performance, but it
will be worse than directions 1 or even 2.
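One back-of-the-envelope way to compare chunk shapes across the three directions (illustrative arithmetic, not a recommendation from this thread: it assumes 4-byte floats, that every touched chunk is read whole, and that a single orthogonal slice is requested):

```python
def amplification(chunk):
    """Bytes read per byte actually used, for each slicing direction.

    A slice that fixes an axis still reads full chunks, so the wasted
    factor is the chunk extent along the fixed axes.
    """
    cx, cy, cz, cw = chunk
    return {
        "(x,*,*,*)": cx,        # direction 1 fixes x
        "(*,y,*,*)": cy,        # direction 2 fixes y
        "(*,*,z,w)": cz * cw,   # direction 3 fixes both z and w
    }

print(amplification((8, 8, 8, 8)))    # direction 3 wastes 64x per slice
print(amplification((64, 64, 8, 8)))  # hypothetical balanced shape: 64x everywhere
```

Reading several adjacent (z,w) slices per chunk, as in the traversal above, amortizes that waste away; the table only describes the cost of one isolated slice.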
Cheers,
Matthieu
Hi all,
I'm working with floating-point data, building up a very large dataset,
typically >100 GB, of four dimensions (x, y, z, w).
The dimensions are (x, y, z, w) = (601, 482, 61, 1501) in my example.
The aim is to slice (READING ONLY) this dataset in orthogonal directions:
1) (x, *, *, *)
2) (*, y, *, *)
3) (*, *, z, w)
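(For reference, those dimensions do put the dataset just over 100 GB, assuming 4-byte single-precision floats:)

```python
dims = (601, 482, 61, 1501)          # (x, y, z, w)
total_bytes = 601 * 482 * 61 * 1501 * 4
print(total_bytes)                   # 106094294408 bytes, i.e. ~106 GB
```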
When using a contiguous layout I naturally get good performance for
directions (1) and (2); however, it is very poor for (3).
Using a chunked layout of (8,8,8,8) seems to give the best balance so
far, with reasonable access times in all directions, but still not as
fast as I was hoping for. My tests also show that compression improves
the read performance slightly.
I'm looking for advice on possible optimization techniques for this
problem, other than what has been mentioned.
Otherwise, is my only option to move to some (expensive?) parallel
solution?
Thanks!
Regards,
Martin
_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5

2014-06-12 20:43 GMT+01:00 Martin Sarajærvi <balony@gmail.com>:
--
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher
Music band: http://liliejay.com/