second pass of read is much faster?

hello;

I am writing a relatively large (~6G) hdf5 file using the 1.6.5 library, and
noticing some odd behaviour during read. The odd behaviour is that the *second*
time (and all successive times) I perform an identical read operation, it
proceeds much faster (a factor of 30x). Some details:

* data is a cube, approximately 4000 x 18000 x 10, floats;
* b/c I cannot keep the cube in memory, I write the file in 5 passes, each
of size 4000 x 18000 x 2; I am not using extendible datasets, as I know the
size of the cube in advance.
* the file is chunked, but this odd behaviour appears for most of the chunk
sizes I have tried (experimenting with this is slow b/c write times are slow)
* I am reading a hyperslab which is essentially 4000 x 1000 x 10, with the 1000
slices in the 2nd dimension randomly selected from the 18000 available; the
function I wrote to read in hyperslabs does a single hyperslab select per slice
in the 2nd dimension, then an H5Dread (a simplified sketch follows this list).
* If I read a given slab of size 4000 x 1000 x 10, then permute the requested
order of the 1000 2d slices, and perform another read, the 2nd read operation
is also faster than the first, but not as fast as if I make the same read
request twice in a row.
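
For concreteness, the read loop is roughly the following. This is a simplified sketch rather than my exact code: the dataset name ("/cube"), the handle names, and the slice index array are illustrative.

  #include <stdlib.h>
  #include "hdf5.h"

  /* Simplified sketch of the per-slice read: one hyperslab select per
   * requested 2nd-dimension slice, then an H5Dread of that slice. */
  int read_slices(hid_t file_id, const hsize_t *slice_idx, int nslices)
  {
      hid_t dset   = H5Dopen(file_id, "/cube");   /* 1.6 API: two-argument H5Dopen */
      hid_t fspace = H5Dget_space(dset);

      hsize_t mdims[3] = {4000, 1, 10};           /* one slice in memory */
      hid_t   mspace   = H5Screate_simple(3, mdims, NULL);
      float  *buf      = malloc(4000 * 10 * sizeof *buf);

      hsize_t start[3] = {0, 0, 0};
      hsize_t count[3] = {4000, 1, 10};

      for (int i = 0; i < nslices; i++) {
          start[1] = slice_idx[i];                /* i-th requested slice, in the caller's order */
          H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
          H5Dread(dset, H5T_NATIVE_FLOAT, mspace, fspace, H5P_DEFAULT, buf);
          /* ... copy buf into the output cube here ... */
      }

      free(buf);
      H5Sclose(mspace);
      H5Sclose(fspace);
      H5Dclose(dset);
      return 0;
  }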

I assume there is something obvious and silly I am doing wrt chunking, chunk
size, caching, cache size, etc, but I am stumped as to why successive reads of
a file affect performance. Does the operation of reading alter the file? I am
closing the file after each read (or I think I am). Does this have anything to
do with the chunk bug? ( http://www.hdfgroup.org/newsletters/bulletin20090302.html )
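
For reference, the chunk cache can be resized via H5Pset_cache on the file access property list; below is a sketch with made-up numbers (not values I have tuned), in case the small default cache is part of the problem.

  #include "hdf5.h"

  /* Sketch: open the file read-only with a larger raw-data chunk cache.
   * The numbers are guesses, not tuned values. */
  hid_t open_with_big_chunk_cache(const char *fname)
  {
      hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

      H5Pset_cache(fapl,
                   1024,               /* metadata cache elements (a typical default) */
                   10007,              /* raw-data chunk cache slots */
                   256 * 1024 * 1024,  /* raw-data chunk cache size in bytes */
                   0.75);              /* preemption policy */

      hid_t file = H5Fopen(fname, H5F_ACC_RDONLY, fapl);
      H5Pclose(fapl);
      return file;
  }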

any hints are appreciated....

other possibly irrelevant clues:

* I am setting nan as the fill value, and filling at alloc time (probably not
relevant; a creation sketch follows this list)
* the target application will usually read i x j x k hyperslabs with contiguous
slabs in the first dimension, but somewhat randomly jumbled slabs in the 2nd
and 3rd dimensions.
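
For reference, the creation side looks roughly like this. It is a sketch: the chunk dimensions shown are placeholders, not the ones I am actually using.

  #include <math.h>    /* NAN */
  #include "hdf5.h"

  /* Sketch of the dataset creation: fixed size, chunked, NaN fill value,
   * space allocated and filled at creation time. */
  hid_t create_cube(hid_t file_id)
  {
      hsize_t dims[3]  = {4000, 18000, 10};
      hsize_t chunk[3] = {4000, 10, 10};          /* placeholder chunk shape */
      float   fill     = NAN;

      hid_t space = H5Screate_simple(3, dims, NULL);
      hid_t dcpl  = H5Pcreate(H5P_DATASET_CREATE);
      H5Pset_chunk(dcpl, 3, chunk);
      H5Pset_fill_value(dcpl, H5T_NATIVE_FLOAT, &fill);
      H5Pset_alloc_time(dcpl, H5D_ALLOC_TIME_EARLY);   /* allocate at create time */
      H5Pset_fill_time(dcpl, H5D_FILL_TIME_ALLOC);     /* write the fill at allocation */

      /* 1.6 API: five-argument H5Dcreate */
      hid_t dset = H5Dcreate(file_id, "/cube", H5T_NATIVE_FLOAT, space, dcpl);

      H5Pclose(dcpl);
      H5Sclose(space);
      return dset;
  }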

--sep

[ Steven E. Pav {bikes/bitters/linux} nerd ]
[ a palindrome: stabler rosy sorrel bats ]

Based on your signature, I’m assuming you’re using Linux. Most likely you’re just seeing the effect of RAM being used as a disk cache. That is usually a good thing. If you want a “cold” start, you need to drop the caches:

  sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'

If you don’t have root access, you can do something like this:

  dd if=/dev/zero of=/tmp/dummy.dat count=25000000
  cat /tmp/dummy.dat > /dev/null

Change the "count=25000000” to be large enough that it fill your RAM.

···


--
Mark

Based on your signature, I'm assuming you're using Linux.

this is correct;

Most likely you're just seeing the effect of RAM being used as a disk cache. That is usually a good thing. If you want a "cold" start, you need to drop the caches:

  sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'

I am not sure if this is intended to make the first pass faster or successive
passes slower; however, after a
sync;echo 3 > /proc/sys/vm/drop_caches
as root, I reran my code and still have the same problem: the first pass is
~600x slower than the second pass.

recap/more clues:

* when performing a read of 4000 x 1000 x 10, the second read of the same data
is much faster. (alternatively, the first read is much slower than I would
expect).
* successive reads of the same hyperslab are as fast as the second; if the
order of the 1000 2d slabs is permuted, the reads are still fast.
* if I attempt another read of a different set of 1000 2d slabs (among the ~18K
available), again the first read is terribly slow, while successive reads of
the same slabs are faster.
* I wrote the hyperslab writer and the hyperslab reader; both are suspect in
this case.

···


--sep

[ Steven E. Pav {bikes/bitters/linux} nerd ]
[ a palindrome: parameter ruts turret em a rap ]


It’s intended to make the second run slower. It forces the OS to read data from disk rather than use the cached data in RAM. You’d need to drop the caches before each run, but usually you only want to do this if you want to run some benchmark from a “cold” start.

···



--
Mark

steven e. pav wrote:

Based on your signature, I'm assuming you're using Linux.

this is correct;

Most likely you're just seeing the effect of RAM being used as a disk cache. That is usually a good thing. If you want a "cold" start, you need to drop the caches:

  sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'

I think the hypothesis being tested is that the first "read" forces the
data from the disk into the file system cache and the later "read"s
simply grab it from the cache. The intention of the echo 3 ...
is to force a flush of the file system cache each time and make each
read as slow as the first. Thus, I think you need to execute this
command after each read; a system() call, or the equivalent for your
programming language, would be a quick and dirty way to do this. Is
this what you have done?
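
For example, something along these lines (a rough C sketch: do_read() stands in for your hyperslab reader, and dropping the caches this way requires running as root):

  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/time.h>

  /* Hypothetical benchmark wrapper: drop the page cache, then time one
   * read pass.  do_read() stands in for the actual hyperslab reader. */
  extern void do_read(void);

  static double now_sec(void)
  {
      struct timeval tv;
      gettimeofday(&tv, NULL);
      return tv.tv_sec + tv.tv_usec / 1e6;
  }

  int main(void)
  {
      for (int pass = 0; pass < 3; pass++) {
          /* force a cold file system cache before every pass (needs root) */
          if (system("sync; echo 3 > /proc/sys/vm/drop_caches") != 0)
              fprintf(stderr, "could not drop caches (not root?)\n");

          double t0 = now_sec();
          do_read();
          printf("pass %d took %.2f s\n", pass, now_sec() - t0);
      }
      return 0;
  }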

The file system cache should hold onto the data between executions of a
process for a little while (until the memory is needed for something
else), so another test would be to execute the read, terminate
the process, and rerun the read as a new process. If the second
run is fast and the first slow, you are benefiting from the file system
cache. Note that this should be done on a relatively “quiet” system (with
no other disk-intensive jobs running), and the two processes should run
one right after the other, as soon as possible. Also, you’ll need to
eliminate the “write” for this test. Just write the file as a separate
process beforehand.

I am not sure if this is intended to make the first pass faster or successive passes slower; however, after a sync;echo 3 > /proc/sys/vm/drop_caches as root, I reran my code and still have the same problem: the first pass is ~600x slower than the second pass.

Wasn’t it only 30x before?

Cheers,

–dan

···
-- Daniel Kahn
Science Systems and Applications Inc.
301-867-2162

It's intended to make the second run slower. It forces the OS to read data from disk rather than use the cached data in RAM. You'd need to drop the caches before each run, but usually you only want to do this if you want to run some benchmark from a "cold" start.

I don't think this is b/c of cached data; more clues:

* if I read a hyperslab that is 4000 x 15 x 10, the slowdown for the first read
is still apparent, but the total time is a bit less than reading a 4000 x 1000 x
10 slab (though not linearly, it seems); given the small size of the output in
this case, it seems odd that disk caching would be to blame.
* I am using H5Soffset_simple when possible (it appears one can only move
forward using offsetting); the slowdown for the first read is not apparent when
reading a 4000 x 15 x 10 hyperslab where the 15 2d slabs are *in sorted order*,
i.e. when the hyperslabs are all positioned via H5Soffset_simple (a sketch of
this variant follows this list).
* I added simple timers using gettimeofday; the bulk of the time spent in the
code is around the H5Dread call, as I suspected.
* oh, by the way, this code is in a Matlab mex, using the hdf5-1.6.5 shared
object library provided by Matlab. if this libhdf5.so was somehow boogered by
Mathworks, or if Matlab is somehow doing something weird in the background, I
would not be terribly surprised.
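
for reference, the offsetting variant looks roughly like this (a simplified sketch; the handles, buffer, and slice index array are illustrative names, not my exact mex code):

  #include "hdf5.h"

  /* Sketch of the offset-based variant: select one 4000 x 1 x 10 slice once,
   * then shift the whole selection along the 2nd dimension between reads. */
  int read_slices_by_offset(hid_t dset, hid_t fspace, hid_t mspace,
                            const hsize_t *slice_idx, int nslices, float *buf)
  {
      hsize_t  start[3]  = {0, 0, 0};
      hsize_t  count[3]  = {4000, 1, 10};
      hssize_t offset[3] = {0, 0, 0};

      H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
      for (int i = 0; i < nslices; i++) {
          offset[1] = (hssize_t)slice_idx[i];   /* absolute offset of the i-th slice */
          H5Soffset_simple(fspace, offset);     /* shift the selection; no reselect */
          if (H5Dread(dset, H5T_NATIVE_FLOAT, mspace, fspace, H5P_DEFAULT, buf) < 0)
              return -1;
          /* ... copy buf into the output cube here ... */
      }
      return 0;
  }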

any hints? I cannot always rely on the slabs being read in sorted order. it
seems that H5Dread calls on hyperslab-selected dataspaces that move
'backwards' in the 2d slab order cause the terrible slowdowns. does this sound
familiar to anyone?


--sep

[ Steven E. Pav {bikes/bitters/linux} nerd ]
[ a palindrome: oily tubs butyl Io ]

···


I'm sorry, this part indicates a misunderstanding of H5Soffset_simple on my
part; I had added the offsetting as an experiment, but am now retreating.
going back to just using H5Sselect_hyperslab confirms the problems I am having
with the first read. I am digging into the documentation further...

--sep

[ Steven E. Pav {bikes/bitters/linux} nerd ]
[ a palindrome: warily tubs butyl I raw ]

···

On Wed, 17 Feb 2010, steven e. pav wrote:

* I am using H5Soffset_simple when possible (it appears one can only move
forward using offsetting); the slowdown for the first read is not apparent when
reading a 4000 x 15 x 10 hyperslab where the 15 2d slabs are *in sorted order*,
i.e. when the hyperslabs are all positioned via H5Soffset_simple.
* I added simple timers using gettimeofday; the bulk of the time spent in the
code is around the H5Dread call, as I suspected.