Hi Gerd,
Thanks for the link to the document describing how chunked data are
read. It gives some good insights, but leaves me with a few questions.
1. I cannot imagine that the first step is reading a chunk from disk.
Doesn't it look in the chunk cache first? If not, what is the purpose of
the chunk cache?
2. I would like to know in more detail what reading the chunk means. I
assume it is doing a B-tree lookup to find out where the chunk is
located. What is involved in that step?
3. The diagram does not tell me why reading many small hyperslabs is so
much slower than reading a large hyperslab. Can it be that the B-tree
lookup is done over and over again, even if the chunk is in the cache?
Cheers,
Ger
"Gerd Heber" <gheber@hdfgroup.org> 5/15/2012 2:25 PM >>>
Mathieu, you should bear in mind that reading a dataset is logically a
mapping between dataspaces. The underlying physical layout in the file is
irrelevant for this mapping. Users may not appreciate getting different
answers when reading the nominally same dataset with different physical
layouts.
Of course, not all layouts may give you the same performance.
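To make the "mapping between dataspaces" point concrete, here is a small pure-Python sketch. It does not call HDF5 and is not how the library is implemented; value_at, read_contiguous and read_chunked are made-up helpers, and the element values are an assumption chosen so that each element encodes its own logical coordinates. It only illustrates that the same selection yields the same row-major result whether the elements are stored contiguously or in 10 x 10 x 1 chunks.

```python
from itertools import product

DIMS = (20, 20, 10)   # dataset extent from the original question
CHUNK = (10, 10, 1)   # chunk extent from the original question


def value_at(x, y, z):
    """Hypothetical element value: the element's row-major linear index."""
    return (x * DIMS[1] + y) * DIMS[2] + z


def build_contiguous():
    """Toy contiguous layout: one flat row-major list for the whole dataset."""
    return [value_at(x, y, z) for x, y, z in product(*map(range, DIMS))]


def build_chunked():
    """Toy chunked layout: dict of chunk coordinates -> flat row-major chunk."""
    chunks = {}
    n_chunks = [d // c for d, c in zip(DIMS, CHUNK)]
    for cx, cy, cz in product(*map(range, n_chunks)):
        origin = (cx * CHUNK[0], cy * CHUNK[1], cz * CHUNK[2])
        chunks[(cx, cy, cz)] = [
            value_at(origin[0] + i, origin[1] + j, origin[2] + k)
            for i, j, k in product(*map(range, CHUNK))
        ]
    return chunks


def read_contiguous(flat, start, count):
    """Gather the selection in row-major order from the flat layout."""
    out = []
    for dx, dy, dz in product(*map(range, count)):
        x, y, z = start[0] + dx, start[1] + dy, start[2] + dz
        out.append(flat[(x * DIMS[1] + y) * DIMS[2] + z])
    return out


def read_chunked(chunks, start, count):
    """Gather the same selection, locating each element inside its chunk."""
    out = []
    for dx, dy, dz in product(*map(range, count)):
        coord = (start[0] + dx, start[1] + dy, start[2] + dz)
        key = tuple(c // s for c, s in zip(coord, CHUNK))
        i, j, k = (c % s for c, s in zip(coord, CHUNK))
        out.append(chunks[key][(i * CHUNK[1] + j) * CHUNK[2] + k])
    return out


start, count = (0, 0, 0), (4, 4, 4)
a = read_contiguous(build_contiguous(), start, count)
b = read_chunked(build_chunked(), start, count)
print(a == b, a[:5])   # True [0, 1, 2, 3, 10]
```

The two toy readers differ only in how they locate an element, not in the order in which they deliver it, which is the sense in which the layout is irrelevant to the result.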
If i understand well, a chunked dataset is read chunk by chunk
That's a misunderstanding. Sometimes that's the case, but not always. Have a
look at
http://www.hdfgroup.org/HDF5/doc/Advanced/DataFlow_H5Dread/DataFlow_H5Dread.pdf
Best, G.
-----Original Message-----
From: hdf-forum-bounces@hdfgroup.org [mailto:hdf-forum-bounces@hdfgroup.org] On Behalf Of mathieu.westphal@obs.ujf-grenoble.fr
Sent: Tuesday, May 15, 2012 5:23 AM
To: hdf-forum@hdfgroup.org
Subject: [Hdf-forum] H5Dread and array organization
Hello
I have a chunked dataset of size 20 20 10.
The chunk size is 10 10 1.
Let's say I read a hyperslab of data defined by:
start={0 0 0}
count={4 4 4}
I read it into a 1-D array.
and I get
X Y Z
0 0 0
0 0 1
0 0 2
0 0 3
0 0 4
0 1 0
0 1 1
0 1 2
0 1 3
0 1 4
0 2 0
0 2 1
...
0 4 3
0 4 4
1 0 0
1 0 1
...
If I understand correctly, a chunked dataset is read chunk by chunk, so I
cannot understand how I can obtain this kind of order without completely
reordering the data. A single chunk cannot contain two different Z values.
So:
Is this normal? Does HDF5 reorder the data (wasting time and resources)?
Is there any way to control this order (not row-major but, let's say,
Z-major)?
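The arithmetic behind this question can be checked with a short pure-Python sketch. It makes no HDF5 calls, and chunk_of is a made-up helper. It uses the chunk size, start and count exactly as stated above, and only shows which chunks the selection falls into and how often consecutive elements of a row-major result come from different chunks.

```python
from itertools import product

CHUNK = (10, 10, 1)   # chunk extent from the question
START = (0, 0, 0)     # hyperslab start from the question
COUNT = (4, 4, 4)     # hyperslab count from the question


def chunk_of(coord):
    """Chunk coordinates of the chunk that holds a logical element."""
    return tuple(c // s for c, s in zip(coord, CHUNK))


# Selected elements in row-major order (last index, Z, varies fastest).
coords = [
    tuple(s + d for s, d in zip(START, delta))
    for delta in product(*map(range, COUNT))
]

# Distinct chunks the selection touches.
touched = sorted({chunk_of(c) for c in coords})

# How often two neighbouring output elements live in different chunks.
switches = sum(chunk_of(p) != chunk_of(q) for p, q in zip(coords, coords[1:]))

print(touched)    # [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3)]
print(switches)   # 63
```

With chunks that are one element thick in Z, every step along the fastest-varying index crosses a chunk boundary, so a row-major result necessarily interleaves elements from all four chunks.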
Thanks for helping.
Mathieu
_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org