chunk cache

I'm using HDF5 to hold 3- or 4-dim data arrays (which can be several GBytes). The access patterns to the data can vary (even within one application), so the data are stored in a chunked way. Unfortunately control over the chunk cache size seems to be very limited.
I would like to be able to set the cache size each time the access pattern changes. However, as far as I know I can only set the cache size before opening (or creating) the file. It is not even possible to set it per dataset.
For the time being I define the cache size as 16 MBytes, but it is bound to mismatch some access patterns resulting in a great performance loss.

What I would like is the ability to set the cache size per dataset in a dynamic way. Thus not statically before opening the file or dataset, but at any time.
Is it possible to add that to HDF5? Haven't other people felt that need? To me this seems quite fundamental, otherwise you cannot take full advantage of chunking.
What I would like most is to be able to tell HDF5 the access pattern (i.e. the cursor shape and the order in which it iterates over the axes) and let it sort out the optimal cache size.
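
For reference, the only knob available today is the file-level one, which has to be set before the file is opened; a minimal sketch (the function name, file name, and slot count are just illustrative):

  #include <hdf5.h>

  /* Open a file with a 16 MB raw-data chunk cache; the setting is shared by
     all datasets in the file and cannot be changed while the file is open. */
  hid_t open_with_16mb_chunk_cache(const char *name)
  {
      hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
      /* args: metadata elements, nr of chunk-cache hash slots,
         chunk-cache bytes, preemption policy */
      H5Pset_cache(fapl, 0, 521, 16 * 1024 * 1024, 0.75);
      hid_t file = H5Fopen(name, H5F_ACC_RDONLY, fapl);
      H5Pclose(fapl);
      return file;
  }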

Cheers,
Ger


Hi Ger,

I'm using HDF5 to hold 3- or 4-dim data arrays (which can be several GBytes). The access patterns to the data can vary (even within one application), so the data are stored in a chunked way. Unfortunately control over the chunk cache size seems to be very limited.
I would like to be able to set the cache size each time the access pattern changes. However, as far as I know I can only set the cache size before opening (or creating) the file. It is not even possible to set it per dataset.

  Yes, that's currently true. We're in the process of revising the chunk cache itself and I'm guessing that we'll want to revise the API that controls it also.

For the time being I define the cache size as 16 MBytes, but it is bound to mismatch some access patterns resulting in a great performance loss.

What I would like is the ability to set the cache size per dataset in a dynamic way. Thus not statically before opening the file or dataset, but at any time.
Is it possible to add that to HDF5? Haven't other people felt that need? To me this seems quite fundamental, otherwise you cannot take full advantage of chunking.
What I would like most is to be able to tell HDF5 the access pattern (i.e. the cursor shape and the order in which it iterates over the axes) and let it sort out the optimal cache size.

  Hmm, I hadn't thought about that sort of replacement algorithm for evicting the chunks in the cache - I was planning on implementing an LRU algorithm with a fixed-size cache. I would be worried about allowing an application to potentially pick a "bad" order over the axes and end up with a very large number of chunks cached.

  I'm very interested in hearing any ideas you (or others) might have for how you think the eviction algorithm could work and the API needed to control it.

  Quincey

···

On Mar 18, 2008, at 3:08 AM, Ger van Diepen wrote:


Hi Quincey,

It should be clear that this discussion is for very large arrays not fitting in memory. Otherwise you may as well read the entire array and do operations in memory.

IMHO HDF5 cannot prescribe access patterns; it can advise them though. Furthermore it could define or let the user set a maximum cache size (which could be really large, potentially several GB on machines with a lot of memory).

The best cursor shape is the chunk shape. So if an application does not care about order (e.g. to determine the min/max of an array), it should use that one.
We have applications where we need to determine the median for each vector or plane in a cube. This can be a vector or plane in any direction, so my cursor shape can be (nx,1,1) or (1,ny,1), etc. It would be nice if I could write such a loop as (in C++ terms):
   DataSetIterator iter(cursorshape);
   while (iter.next()) {
      iter.data();      // gives pointer to data
      iter.position();  // gives blc of current cursor position
   }
In this way HDF5 can execute the iteration in the optimal order (and set the cache size for me) without the user having to worry about it.
It would also be nice if the chunk cache keeps statistics which I can display on demand (showing the number of reads, writes, and cache hits) to see if the cache behaves as expected.
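
Such an iterator could be layered on top of the existing hyperslab calls; what it cannot do today is adapt the chunk cache underneath it. A rough sketch of the read it would issue per cursor position, for a [1,1,nx] cursor (the function name and the float element type are placeholders):

  #include <hdf5.h>

  /* Read one x-vector (cursor shape [1,1,nx]) at position [iz,iy,0].  A
     DataSetIterator-style wrapper would call this once per step, choosing the
     order of iz/iy so that cached chunks are reused as much as possible. */
  herr_t read_x_vector(hid_t dset, hsize_t iz, hsize_t iy, hsize_t nx, float *buf)
  {
      hsize_t start[3] = { iz, iy, 0 };
      hsize_t count[3] = { 1, 1, nx };
      hid_t   fspace   = H5Dget_space(dset);
      hid_t   mspace   = H5Screate_simple(1, &nx, NULL);
      herr_t  status;

      H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
      status = H5Dread(dset, H5T_NATIVE_FLOAT, mspace, fspace, H5P_DEFAULT, buf);

      H5Sclose(mspace);
      H5Sclose(fspace);
      return status;
  }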

Another option would be to mmap the dataset, so the OS will do the caching for you. Of course, only on systems where it is possible. Probably it makes life much easier for you.

Cheers,
Ger


On Tuesday 18 March 2008, Quincey Koziol wrote:

  Hmm, I hadn't thought about that sort of replacement algorithm for
evicting the chunks in the cache - I was planning on implementing an
LRU algorithm with a fixed-size cache. I would be worried about
allowing an application to potentially pick a "bad" order over the
axes and end up with a very large number of chunks cached.

  I'm very interested in hearing any ideas you (or others) might have
for how you think the eviction algorithm could work and the API
needed to control it.

Well, I thought that the replacement algorithm for the chunks in the HDF5
cache was already an LRU one. I'd like to share my experiences with
LRU implementations, but it would be nice to know first what you are
after. Could you please send a pointer to where the current eviction
algorithm is explained? I've had a quick look at a recent manual for
HDF5 1.8.0, but only found things about the metadata cache, not the
chunk cache.

Thanks,

--
Francesc Altet      http://www.carabos.com/
Cárabos Coop. V.    Enjoy Data


Hi Ger,

Hi Quincey,

It should be clear that this discussion is for very large arrays not fitting in memory. Otherwise you may as well read the entire array and do operations in memory.

IMHO HDF5 cannot prescribe access patterns; it can advise them though. Furthermore it could define or let the user set a maximum cache size (which could be really large, potentially several GB on machines with a lot of memory).

  Yes, I agree with both of these things (not prescribing access patterns, and allowing the user to adjust the cache size more dynamically).

The best cursor shape is the chunk shape. So if an application does not care about order (e.g. to determine the min/max of an array), it should use that one.
We have applications where we need to determine the median for each vector or plane in a cube. This can be a vector or plane in any direction, so my cursor shape can be (nx,1,1) or (1,ny,1), etc. It would be nice if I could write such a loop as (in C++ terms):
  DataSetIterator iter(cursorshape);
  while (iter.next()) {
     iter.data();      // gives pointer to data
     iter.position();  // gives blc of current cursor position
  }
In this way HDF5 can execute the iteration in the optimal order (and set the cache size for me) without the user having to worry about it.

  Yes, I've heard this a few times from different users and it's on my "wish list". :-)

It would also be nice if the chunk cache keeps statistics which I can display on demand (showing the nr of reads, writes, cache hits) to see if the cache behaves as expected.

  That's not difficult and sounds really useful; I'll try to incorporate it into the current round of changes I'm making.
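
The counters being asked for could be as small as this (a sketch only; the names are illustrative) - the cache would bump them on every access and a small routine would print them on request:

  #include <stdio.h>

  typedef struct {
      unsigned long nreads;    /* chunks read from disk        */
      unsigned long nwrites;   /* chunks written back to disk  */
      unsigned long nhits;     /* requests served from cache   */
      unsigned long nmisses;   /* requests that went to disk   */
  } ChunkCacheStats;

  void print_chunk_cache_stats(const ChunkCacheStats *s)
  {
      double total = (double)(s->nhits + s->nmisses);
      printf("reads=%lu writes=%lu hits=%lu misses=%lu hit-ratio=%.2f\n",
             s->nreads, s->nwrites, s->nhits, s->nmisses,
             total > 0 ? s->nhits / total : 0.0);
  }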

Another option would be to mmap the dataset, so the OS will do the caching for you. Of course, only on systems where it is possible. Probably it makes life much easier for you.

  We've done a small amount of testing on using mmap() and (at least for the use cases we were testing), it wasn't really a win. Our internal caching "knows" more about the access pattern and seems to do a better job than the OS can.

  Quincey

···

On Mar 20, 2008, at 3:17 AM, Ger van Diepen wrote:


Hi Ger,

this is a very interesting issue indeed and I have similar datasets as well.
However, there are some things I am not sure I understand completely:

1) As far as I can tell, chunk size relates more to the optimal read/write
strategy for the dataset. This means that if the dataset is resizable, I
vary my chunk size according to use cases (or access patterns) like: (a)
write once a fixed dataset and read frequently, (b) write once a variable
dataset and read frequently, (c) update/resize a lot and read seldom or
rarely, and further refinements depending on whether each action takes place
on a local or remote machine, whether resizing takes place and whether the
new size can be foreseen, whether there is a different leading dimension
when reading and writing, and so on. But chunking relates to the allocated
blocks on disk, AFAIK. For a resizable dataset it is clear that this is set
at creation and cannot be changed, as chunks may be scattered within the
file. If you change the access pattern at that point and want to change the
chunk size, wouldn't you have to rewrite the whole dataset?

2) H5Pset_buffer can be used with the iterator you described, in order to
accommodate the slab selection (or any multiple of it); a sketch follows
after this list. Of course this is an application-dependent strategy and
IMHO it is better that it is not predefined by the library. Therefore I
write my selection iterators at a higher level, and can adjust my strategy
depending on the urgency of the task.

3) If I recall correctly, sec2 driver does use memory mapping anyway.

Am I missing something?
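
A minimal sketch of the H5Pset_buffer route from point 2 (names and sizes are illustrative; note that this sizes the type-conversion buffer on a transfer property list, it is not the chunk cache):

  #include <hdf5.h>

  /* Build a transfer property list whose conversion buffer can hold one full
     slab, then pass it to H5Dread/H5Dwrite together with the slab selection. */
  hid_t make_xfer_plist(size_t slab_bytes)
  {
      hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
      H5Pset_buffer(dxpl, slab_bytes, NULL, NULL);   /* let HDF5 allocate it */
      return dxpl;
  }
  /* later:  H5Dread(dset, memtype, mspace, fspace, dxpl, buf); */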

BR

-- dimitris

···

2008/3/20, Ger van Diepen <diepen@astron.nl>:

--
- What is the difference between mechanical engineers and civil engineers?
Mechanical engineers build weapons civil engineers build targets.
- Good health is merely the slowest possible rate at which one can die.
- Life is like a jar of jalapeño peppers. What you do today, might Burn Your
Butt Tomorrow

Hi Francesc,

Well, I thought that the replacement algorithm for the chunks in the HDF5
cache was already an LRU one. I'd like to share my experiences with
LRU implementations, but it would be nice to know first what you are
after. Could you please send a pointer to where the current eviction
algorithm is explained? I've had a quick look at a recent manual for
HDF5 1.8.0, but only found things about the metadata cache, not the
chunk cache.

  Hmm, I don't think I have a good document describing the chunk cache eviction algorithm. :-/ In short, it's basically an LRU scheme, but it also tries to hold "partially accessed" chunks longer than chunks which have been "fully accessed" (all their elements read/written). I'm not certain the additional complexity for this partial access idea has a lot of value, so I'm planning to shift to a simpler LRU scheme.

  Quincey

···

On Mar 20, 2008, at 3:07 PM, Francesc Altet wrote:


On Thursday 20 March 2008, you wrote:

  Hmm, I don't think I have a good document describing the chunk cache
eviction algorithm. :-/ In short, it's basically an LRU scheme, but
it also tries to hold "partially accessed" chunks longer than chunks
which have been "fully accessed" (all their elements read/written).
I'm not certain the additional complexity for this partial access
idea has a lot of value, so I'm planning to shift to a simpler LRU
scheme.

OK I see. Well, my experience is that, with an appropriate
implementation (see below), the additional complexity for evicting
entries is not very important for global performance. I'm going to
describe the way we have implemented the LRU cache for PyTables Pro
purposes so that you can judge if you want to implement some of its
features (also, you can see some benchmark figures in my previous
message to Ger).

Our LRU implementation has two principal features:

1. It uses a hash in order to search slots in the cache. This allows
for very fast slot lookups (and also insertions).

2. It provides machinery for autosensing its efficiency (hits/queries
ratio). If after a certain number of queries the efficiency is lower
than a certain threshold (0.6 currently), the cache refuses to enter
more slots. After some time has passed (50*cache_slots queries), the
cache is re-activated again and a new cycle starts.

Feature 1 allows very fast lookups, which is especially important when
the efficiency of the cache is low. Feature 2 allows you to implement
relatively complex LRU algorithms or to perform somewhat expensive data
serialization before data is filed in the cache, without worrying too
much about the loss of performance caused by a low hits/queries ratio.
There are other small optimizations in the PyTables Pro implementation,
but the two mentioned above are the most important ones.
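
In compressed form, and purely as an illustration (this is not the PyTables Pro code; the capacity, thresholds and names are made up), the two features could look like this:

  #include <stdlib.h>

  #define NBUCKETS  64
  #define CAPACITY  128
  #define MIN_RATIO 0.6

  typedef struct Entry {
      long          key;      /* e.g. chunk number  */
      void         *data;     /* cached chunk       */
      unsigned long stamp;    /* last access time   */
      struct Entry *next;     /* hash-chain link    */
  } Entry;

  typedef struct {            /* zero-initialize before first use */
      Entry        *bucket[NBUCKETS];
      int           count, disabled;
      unsigned long clock, queries, hits, reenable_at;
  } LRUCache;

  /* Feature 1: a hash table keeps slot lookup cheap even when the cache
     is not helping. */
  void *lru_get(LRUCache *c, long key)
  {
      c->queries++;
      for (Entry *e = c->bucket[(unsigned long)key % NBUCKETS]; e; e = e->next)
          if (e->key == key) {
              c->hits++;
              e->stamp = ++c->clock;
              return e->data;
          }
      return NULL;            /* miss: the caller reads the chunk itself */
  }

  static void evict_lru(LRUCache *c)   /* plain LRU eviction by timestamp */
  {
      Entry **victim = NULL;
      for (int b = 0; b < NBUCKETS; b++)
          for (Entry **p = &c->bucket[b]; *p; p = &(*p)->next)
              if (!victim || (*p)->stamp < (*victim)->stamp)
                  victim = p;
      Entry *e = *victim;
      *victim = e->next;
      free(e->data);
      free(e);
      c->count--;
  }

  /* Called after a miss; assumes the key is not in the cache yet. */
  void lru_put(LRUCache *c, long key, void *data)
  {
      /* Feature 2: if the hit ratio is poor, stop filling the cache for
         50*CAPACITY queries, then start a fresh measurement cycle. */
      if (c->disabled) {
          if (c->queries < c->reenable_at)
              return;
          c->disabled = 0;
          c->hits = c->queries = 0;
      }
      if (c->queries > 10UL * CAPACITY &&
          (double)c->hits / (double)c->queries < MIN_RATIO) {
          c->disabled = 1;
          c->reenable_at = c->queries + 50UL * CAPACITY;
          return;
      }
      if (c->count >= CAPACITY)
          evict_lru(c);
      Entry *e = malloc(sizeof *e);
      e->key   = key;
      e->data  = data;
      e->stamp = ++c->clock;
      e->next  = c->bucket[(unsigned long)key % NBUCKETS];
      c->bucket[(unsigned long)key % NBUCKETS] = e;
      c->count++;
  }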

BTW, I've tried to measure the performance of our LRU against the one
which is supposed to be in HDF5, but during my experiments, I was
unable to detect any measurable speed-up in HDF5. Moreover, I've tried
to change and/or deactivate the chunk cache in HDF5 with the following
calls:

  dcpl = H5Fget_access_plist(file_id);
  H5Pset_cache(dcpl, 0, 128, 1024*1024, 0.0); /* 1 MB cache */
/* H5Pset_cache(dcpl, 0, 512, 4*1024*1024, 0.0); */ /* 4 MB cache */
/* H5Pset_cache(dcpl, 0, 0, 0, 0.0); */ /* deactivate cache */
  H5Pclose(dcpl);

after file opening, with no visible effect at all (while our LRU
implementation was able to accelerate access by up to 10x).
What am I doing wrong?

Thanks,

--
Francesc Altet      http://www.carabos.com/
Cárabos Coop. V.    Enjoy Data

Hi Dimitris,

3) If I recall correctly, sec2 driver does use memory mapping anyway.

  The sec2 driver does not use memory mapping currently.

    Quincey

···

On Mar 20, 2008, at 6:10 AM, Dimitris Servis wrote:


Hi Dimitris,

I do not want to change the chunk size; that is fixed once you've
created a dataset.
What I do want to be able to define dynamically is the chunk CACHE
size.

Suppose you have a dataset array z,y,x with shape [1000,1000,1000] and
chunk size [10,10,100].
- when stepping through the array with a cursor shape of [10,10,100],
the cache needs to contain one chunk only, as it matches the chunk shape.
- when stepping through the array with cursor [1,1,1000], the cache
should contain 10 chunks. This is because you need 10 chunks to get the
full vector and you want those chunks to be kept in memory. However,
this is only true if you step through the array in optimal order (thus
10 steps in y, thereafter 10 in z, etc.). When stepping naively like:
   for (iz=0; iz<1000; ++iz)
     for (iy=0; iy<1000; ++iy)
       read hyperslab of shape [1,1,1000] at position [iz,iy,0]
the cache should be 100 times bigger, or you have to accept that the
same chunk gets read 10 times.
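
To put numbers on this example, assuming 4-byte elements (the element size is not stated here, so that part is an assumption):

  #include <stdio.h>

  int main(void)
  {
      long chunk_elems  = 10L * 10 * 100;             /* chunk [10,10,100]   */
      long chunk_bytes  = chunk_elems * 4;            /* ~40 KB per chunk    */
      long chunks_per_x = 1000 / 100;                 /* a [1,1,1000] cursor
                                                         crosses 10 chunks   */
      long optimal = chunks_per_x * chunk_bytes;               /* ~400 KB    */
      long naive   = (1000 / 10) * chunks_per_x * chunk_bytes; /* ~40 MB     */

      printf("optimal order: %ld chunks, %ld bytes\n", chunks_per_x, optimal);
      printf("naive order  : %ld chunks, %ld bytes\n",
             (1000L / 10) * chunks_per_x, naive);
      return 0;
  }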

I wonder how current HDF5 users do these kinds of things? Are their
arrays so small that they can always be held in memory? Or do they
accept that the cache is too small and is thrashing?

Probably this kind of iteration functionality should be put in a layer
on top of HDF5. But HDF5 should provide the means to set the chunk cache
size dynamically.

I don't know if sec2 uses mmap; I thought it uses unbuffered I/O only.

Cheers,
Ger

Hi Quincey,

sorry, wrong impression :-)

-- dimitris

···

2008/3/20, Quincey Koziol <koziol@hdfgroup.org>:


Hi Francesc,

···

On Mar 22, 2008, at 12:14 PM, Francesc Altet wrote:

BTW, I've tried to measure the performance of our LRU against the one
which is supposed to be in HDF5, but during my experiments, I was
unable to detect any measurable speed-up in HDF5. Moreover, I've tried
to change and/or deactivate the chunk cache in HDF5 with the following
calls:

dcpl = H5Fget_access_plist(file_id);
H5Pset_cache(dcpl, 0, 128, 1024*1024, 0.0); /* 1 MB cache */
/* H5Pset_cache(dcpl, 0, 512, 4*1024*1024, 0.0); */ /* 4 MB cache */
/* H5Pset_cache(dcpl, 0, 0, 0, 0.0); */ /* deactivate cache */
H5Pclose(dcpl);

after file opening, with no visible effect at all (while our LRU
implementation was able to accelerate access by up to 10x).
What am I doing wrong?

  The H5Pset_cache() routine works on a file access property list (not a dataset creation property list, as you seem to be indicating through your variable naming). It's also not dynamically adjustable while the file is open, so you'll need to close all the IDs for the file and reopen it with different parameters for this sort of experiment.

  Quincey
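
In sketch form, what Quincey describes amounts to closing every ID on the file and then reopening it with the cache set on a file access property list (the function name and slot count are illustrative):

  #include <hdf5.h>

  /* Reopen a file with a different raw-data chunk cache size.  All dataset,
     group and file IDs must have been closed before calling this. */
  hid_t reopen_with_cache(const char *name, size_t cache_bytes)
  {
      hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
      H5Pset_cache(fapl, 0, 521, cache_bytes, 0.75);
      hid_t file = H5Fopen(name, H5F_ACC_RDWR, fapl);
      H5Pclose(fapl);
      return file;
  }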

----------------------------------------------------------------------
This mailing list is for HDF software users discussion.
To subscribe to this list, send a message to hdf-forum-subscribe@hdfgroup.org.
To unsubscribe, send a message to hdf-forum-unsubscribe@hdfgroup.org.

Hi Ger,

Hi Dimitris,

I do not want to change the chunk size; that is fixed once you've
created a dataset.
What I do want to be able to define dynamically is the chunk CACHE
size.

Suppose you have a dataset array z,y,x with shape [1000,1000,1000] and
chunk size [10,10,100]
- when stepping through the array with a cursor shape of [10,10,100]
the cache needs to contain one chunk only, as it matches the chunk shape.
- when stepping through the array with cursor [1,1,1000], the cache
should contain 10 chunks. This is because you need 10 chunks to get the
full vector and you want those chunks to be kept in memory. However,
this is only true if you step through the array in optimal order (thus
10 steps in y, thereafter 10 in z, etc.). When stepping naively like:
  for (iz=0; iz<1000; ++iz)
    for (iy=0; iy<1000; ++iy)
      read hyperslab of shape [1,1,1000] at position [iz,iy,0]
the cache should be 100 times bigger or you have to accept that the
same chunk gets read 10 times.

  Hmm, that's an interesting idea - to track sequential I/O accesses to a dataset and adjust the chunk cache size according to any pattern detected. I'll add that idea to the wish list also... :-)

I wonder how current HDF5 users do these kinds of things? Are their
arrays so small that they can always be held in memory? Or do they
accept that the cache is too small and is thrashing?

  Most people increase the chunk cache size enough to avoid thrashing, but it's not terribly optimal currently, since it's at the file level - I'll be addressing this in my current changes.

  Quincey

···

On Mar 20, 2008, at 6:52 AM, Ger van Diepen wrote:

Probably this kind of iteration functionality should be put in a layer
on top of HDF5. But HDF5 should provide the means to set the chunk cache
size dynamically.

I don't know if sec2 using mmap. I thought it uses unbuffered IO only.

Cheers,
Ger

"Dimitris Servis" <servisster@gmail.com> 03/20/08 12:10 PM >>>

Hi Ger,

this is a very interesting issue indeed and I have similar datasets as
well.
However, there are some things I am not sure I understand completely:

1) As far as I can tell, chunk size relates more to optimal read/write
strategy for the dataset. This means that if the dataset is resizable,
I
vary my chunk size according to use cases (or access patterns) like:
(a)
write once a fixed dataset and read frequently (b) write once a
variable
dataset and read frequently (c) update/resize a lot and read seldom or
rarely and further refinements, depending on whether each action takes
place
in a local or remote machine, whether resizing takes place and
whether new
size can be foreseen, if there is a different leading dimension when
reading
and writing and so on. But chunking relates to the allocated blocks in
the
disk AFAIK. For a resizable dataset it is clear that this is set at
creation
and cannot be changed, as chunks may be scattered within the file. If
you
change the access pattern at that point and want to change the chunk
size,
wouldn't you have to rewrite the whole dataset?

2) H5Pset_buffer can be used with the iterator you described, in order
to
accommodate the slab selection (or any multitude). Of course this is
an
application dependent strategy and IMHO it is better that it is not
predefined by the library. Therefore I write my selection iterators at
a
higher level, and can adjust my strategy depending on the urgency of
the
task.

3) If I recall correctly, sec2 driver does use memory mapping anyway.

Am I missing something?

BR

-- dimitris

2008/3/20, Ger van Diepen <diepen@astron.nl>:

Hi Quincey,

It should be clear that this discussion is for very large arrays not
fitting in memory. Otherwise you may as well read the entire array and
do operations in memory.

IMHO HDF5 cannot prescribe access patterns; it can advise them though.
Furthermore it could define, or let the user set, a maximum cache size
(which could be really large, potentially several GB on machines with a
lot of memory).

The best cursor shape is the chunk shape. So if an application does not
care about order (e.g. to determine the min/max of an array), it should
use that one.
We have applications where we need to determine the median for each
vector or plane in a cube. This can be a vector or plane in any
direction, so my cursor shape can be (nx,1,1) or (1,ny,1), etc. It would
be nice if I could write such a loop as (in C++ terms):

  DataSetIterator iter(cursorshape);
  while (iter.next()) {
     // iter.data() gives a pointer to the data
     // iter.position() gives the blc (bottom-left corner) of the
     // current cursor position
  }

In this way HDF5 can execute the iteration in the optimal order (and set
the cache size for me) without the user having to worry about it.
It would also be nice if the chunk cache keeps statistics which I can
display on demand (showing the nr of reads, writes and cache hits) to
see if the cache behaves as expected.

Another option would be to mmap the dataset, so the OS will do the
caching for you. Of course, only on systems where it is possible.
Probably it makes life much easier for you.

Cheers,
Ger
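
For reference, this is roughly what such a cursor loop has to look like
today when the application writes it itself on top of the existing API.
This is only a sketch: the names (read_with_cursor, visit) are made up,
error checking is omitted, the dataset extents are assumed to be exact
multiples of the cursor shape, and the loop order is fixed and naive -
picking the optimal order and cache size is exactly the part the
proposed DataSetIterator would take over.

  #include <hdf5.h>
  #include <stdlib.h>

  /* Walk a 3-D dataset of doubles with a fixed cursor shape, reading one
   * cursor-sized hyperslab per step in plain row-major order. */
  static void read_with_cursor(hid_t dset, const hsize_t cursor[3],
                               void (*visit)(const double *data,
                                             const hsize_t blc[3]))
  {
      hid_t   fspace = H5Dget_space(dset);
      hsize_t dims[3];
      H5Sget_simple_extent_dims(fspace, dims, NULL);

      hid_t   mspace = H5Screate_simple(3, cursor, NULL);
      double *buf    = malloc(cursor[0] * cursor[1] * cursor[2] *
                              sizeof(double));

      hsize_t blc[3];
      for (blc[0] = 0; blc[0] < dims[0]; blc[0] += cursor[0])
          for (blc[1] = 0; blc[1] < dims[1]; blc[1] += cursor[1])
              for (blc[2] = 0; blc[2] < dims[2]; blc[2] += cursor[2]) {
                  H5Sselect_hyperslab(fspace, H5S_SELECT_SET, blc, NULL,
                                      cursor, NULL);
                  H5Dread(dset, H5T_NATIVE_DOUBLE, mspace, fspace,
                          H5P_DEFAULT, buf);
                  visit(buf, blc);   /* iter.data() / iter.position() */
              }

      free(buf);
      H5Sclose(mspace);
      H5Sclose(fspace);
  }

With a loop of this shape the application also has to decide, via
H5Pset_cache at file-open time, how large the chunk cache should be for
the chosen cursor shape.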


Hi Ger, Quincey,

OK, I see Ger's point. I usually know the access patterns beforehand,
and for my really big datasets (>1GB) I even know the final size and
usually have to read the whole thing anyway. I think people try to
restrict arbitrary hyperslab selections to particular use cases,
depending on the information they have about the dataset. But if you
need to apply any arbitrary hyperslab selection, then Ger is absolutely
right.

Ger's POV is really interesting, but wouldn't it boil down to switching
on a special policy for iterations, like 'keep as many chunks in memory
as possible'? Or something based on MRU at best?

BR

-- dimitris

···

2008/3/20, Quincey Koziol <koziol@hdfgroup.org>:

Hi Ger,

On Mar 20, 2008, at 6:52 AM, Ger van Diepen wrote:

> Hi Dimitris,
>
> I do not want to change the chunk size; that is fixed once you've
> created a dataset.
> What I do want to be able to define dynamically is the chunk CACHE
> size.
>
> Suppose you have a dataset array z,y,x with shape [1000,1000,1000] and
> chunk size [10,10,100]
> - when stepping through the array with a cursor shape of [10,10,100],
> the cache needs to contain one chunk only, as it matches the chunk
> shape.
> - when stepping through the array with cursor [1,1,1000], the cache
> should contain 10 chunks. This is because you need 10 chunks to get the
> full vector and you want those chunks to be kept in memory. However,
> this is only true if you step through the array in optimal order (thus
> 10 steps in y, thereafter 10 in z, etc.). When stepping naively like:
>   for (iz=0; iz<1000; ++iz)
>     for (iy=0; iy<1000; ++iy)
>       read hyperslab of shape [1,1,1000] at position [iz,iy,0]
> the cache should be 100 times bigger or you have to accept that the
> same chunk gets read 10 times.

        Hmm, that's an interesting idea - to track sequential I/O accesses to
a dataset and adjust the chunk cache size according to any pattern
detected. I'll add that idea to the wish list also... :-)

> I wonder how current HDF5 users do these kind of things? Are their
> arrays so small that they can always be held in memory? Or do they
> accept that the cache is too small and is thrashing?

        Most people increase the chunk cache size enough to avoid thrashing,
but it's not terribly optimal currently, since it's set at the file level -
I'll be addressing this in my current changes.

        Quincey
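
As a quick check on the arithmetic in the quoted example, here is a
minimal sketch (not an HDF5 API) that counts how many chunks a single,
chunk-aligned cursor position touches - a lower bound on the number of
chunk-cache slots needed for that cursor shape:

  #include <stdio.h>

  /* Number of chunks covered by one cursor-shaped hyperslab, assuming
   * the cursor is aligned to chunk boundaries (an unaligned cursor can
   * touch more chunks). */
  static unsigned long chunks_per_cursor(int rank,
                                         const unsigned long chunk[],
                                         const unsigned long cursor[])
  {
      unsigned long n = 1;
      for (int d = 0; d < rank; d++)
          n *= (cursor[d] + chunk[d] - 1) / chunk[d]; /* ceiling division */
      return n;
  }

  int main(void)
  {
      const unsigned long chunk[3]  = {10, 10, 100}; /* [z,y,x] chunk shape  */
      const unsigned long vector[3] = {1, 1, 1000};  /* full x-vector cursor */

      printf("%lu chunks\n", chunks_per_cursor(3, chunk, vector));
      return 0;
  }

For the [10,10,100] cursor this gives 1 chunk and for the [1,1,1000]
cursor it gives 10. The factor of 100 in the example comes from the
naive loop order: those 10 chunks would have to stay cached across all
100 y-blocks before the z index leaves a chunk's z-range, otherwise each
chunk is re-read once for every one of its 10 z values.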


Hi Ger,

On Thursday 20 March 2008, Ger van Diepen wrote:

Hi Dimitris,

I do not want to change the chunk size; that is fixed once you've
created a dataset.
What I do want to be able to define dynamically is the chunk CACHE
size.

Suppose you have a dataset array z,y,x with shape [1000,1000,1000]
and chunk size [10,10,100]:
- when stepping through the array with a cursor shape of [10,10,100],
the cache needs to contain one chunk only, as it matches the chunk
shape.
- when stepping through the array with cursor [1,1,1000], the cache
should contain 10 chunks. This is because you need 10 chunks to get the
full vector and you want those chunks to be kept in memory. However,
this is only true if you step through the array in optimal order (thus
10 steps in y, thereafter 10 in z, etc.). When stepping naively like:
  for (iz=0; iz<1000; ++iz)
    for (iy=0; iy<1000; ++iy)
      read hyperslab of shape [1,1,1000] at position [iz,iy,0]
the cache should be 100 times bigger or you have to accept that the
same chunk gets read 10 times.

Well, it is read 10 times, yes, but only the first time from disk. The
remaining 9 will most probably be read from the OS filesystem cache,
which makes a huge difference (probably 100x faster if you are not
using filters and 10x if using a compression filter with a decently
fast de-compressor).

Mmm, letting HDF5 adjust the amount of cache memory depending on the
application's needs (read access patterns) is a pretty hairy thing. In
the end, you always need to put a limit on the amount of memory used by
the application, because if not, you risk asking the OS for too much
memory, which may end up thrashing because of the demand for a large
amount of virtual memory, and this will be rather counter-productive.
I'd much prefer to establish a limit for the HDF5 chunk cache (as it is
now), and let the OS use as much memory as it can for the filesystem
cache (it can surely do a much better job in that regard than HDF5
itself, because it has access to the internals of the memory usage of
the hosting machine, and knows the safe amount of RAM that can be used
for filesystem caching without starting to thrash).

I wonder how current HDF5 users do these kind of things? Are their
arrays so small that they can always be held in memory? Or do they
accept that the cache is too small and is thrashing?

What do you mean here by thrashing? Caches do not work like virtual
memory, where all the info has to be kept on disk; rather, they simply
drop evicted data without having to keep track of it at all. So, when a
cache is too small, the only thing it does is load and evict data, and
those operations can be made relatively fast (provided that the
loading/eviction algorithm is not too complex) in comparison with a
read operation. So, in most occasions, when the cache is small compared
with the cursor size, it should not degrade the read performance too
much (typically by a few percent).

Having said this, I agree that allowing a separate cache size for
different datasets could be a good idea, but it should be the
responsibility of the application developer to ask for large caches.
I'd be sceptical about HDF5 automatically choosing cache sizes, though.

My 2 (quickly appreciating :() euro-cents,

···

--

0,0< Francesc Altet http://www.carabos.com/

V V Cárabos Coop. V. Enjoy Data
"-"


On Monday 24 March 2008, Quincey Koziol wrote:

> I've tried to change and/or deactivate the chunk cache in HDF5 with
> the following calls:
>
> dcpl = H5Fget_access_plist(file_id);
> H5Pset_cache(dcpl, 0, 128, 1024*1024, 0.0);          /* 1 MB cache */
> /* H5Pset_cache(dcpl, 0, 512, 4*1024*1024, 0.0); */  /* 4 MB cache */
> /* H5Pset_cache(dcpl, 0, 0, 0, 0.0); */              /* deactivate cache */
> H5Pclose(dcpl);
>
> after file opening, with no visible effect at all (while our LRU
> implementation was able to accelerate access up to 10x faster).
> What am I doing wrong?

  The H5Pset_cache() routine works on a file access property list (not
a dataset creation property list, as you seem to be indicating
through your variable naming). It's also not dynamically adjustable
while the file is open, so you'll need to close all the IDs for the
file and reopen it with different parameters for this sort of
experiment.

Yes. My problem was that I was trying to change the cache size after
the file opening. I'm now doing this during opening time, and it seems
to work well.
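
In other words, something like the following (just a sketch; "data.h5"
and the cache parameters are placeholders):

  #include <hdf5.h>

  int main(void)
  {
      /* The chunk cache settings live on a *file access* property list
       * and must be given before the file is opened; changing them again
       * means closing and reopening the file. */
      hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
      H5Pset_cache(fapl, 0, 512, 4 * 1024 * 1024, 0.75); /* 512 slots, 4 MB */

      hid_t file = H5Fopen("data.h5", H5F_ACC_RDONLY, fapl);
      H5Pclose(fapl);

      /* ... read datasets here ... */

      H5Fclose(file);
      return 0;
  }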

After some experiments, I've noticed some things that I'd like to bring
to your attention:

1. The chunk cache in HDF5 seems to keep the chunks compressed. Why is
this? One of the nice things about a cache in HDF5 itself should be to
keep chunks uncompressed so that retrieval is much faster (around 3x
when using zlib). Besides, the OS already caches compressed chunks, so
having this info replicated, and not obtaining a measurable speed-up,
seems a waste of space.

2. When retrieving entire chunks from the cache, the HDF5 chunk cache
does not seem very efficient and, in fact, it is slower than if the
cache is not active, even for access patterns with high spatial
locality. My guess is that the chunk cache in HDF5 is oriented more
toward accelerating the reading of small data buckets than complete
chunks. I don't know, but perhaps adding a parameter to H5Pset_cache to
indicate the typical size of the data to be retrieved could be useful
for optimizing the size of the cache and its internal structures.

3. When retrieving large data from caches (one or several chunks), much
time is spent copying the data to user buffers, but in many cases the
data doesn't need a fresh data container (for example, when it will
only be used for doing calculations, which is a very common situation).
So, maybe adding the concept of read-only data when delivering it to
the user could be useful in order to accelerate access to data in
caches. I've tested this with our implementation, and the access can be
up to 40% faster than if you have to provide a copy.

Just some thoughts,

···

--

0,0< Francesc Altet http://www.carabos.com/

V V Cárabos Coop. V. Enjoy Data
"-"

On Friday 21 March 2008, you wrote:

Hi Francesc,

Thanks for your very valuable comments.
You are right that the OS (at least Unix variants) will keep all or
most of the recently used disk pages in memory, so indeed the next
access is much faster. But you have no control over that and you
start using more system time.

That's true, but the point is that you are not using as much time as
you might initially think, and in many cases this is a minor loss.

If your cache is so big that it needs virtual memory and starts
paging to disk, you have a similar problem with a small cache and
hoping the disk pages are in memory. The problem is simply too big
for the machine.

Well, not exactly. When you have a small cache, you are losing cache
efficiency, and perhaps paying a small penalty for having it, but that's
all. However, when your application starts demanding too much memory
from the OS, you risk it asking for more memory than the RAM available,
and this could cause the OS to use virtual memory. If the memory demands
of the applications are large enough, your OS can eventually end up
thrashing, in the usual (and scary) sense of thrashing in an OS; see:

I would like to see some numbers for tests using a cache that is big
enough and a cache that is just too small.
Maybe you are right and the difference is only a few percent. If that
is true, does it mean you may as well discard the chunk cache?

Well, I'm curious as well about figures, so, as it happens that we've
implemented an LRU cache for general purposes in our product PyTables
Pro (mainly for nodes and metadata of indexes), I've set up a small
benchmark for caching chunks in HDF5.

The benchmark consists of reading records from a table in HDF5 (using
the H5TB API), which has a size of 2.5 MB and a chunksize of 8 KB. I've
chosen such a small dataset because I mainly wanted to compare the
efficiency of our LRU cache implementation versus the OS filesystem
cache (a memory cache should always perform better than, or at least
equal to, disk reads), and 2.5 MB fits perfectly in it. The LRU cache
has 128 slots and 1 MB as a maximum, but in the experiments it has been
shrunk accordingly so as to get different efficiencies.

Here are my figures:
                                     No compression   Zlib compression
No LRU cache (reads from OS cache):  17.0 Krec/s       6.5 Krec/s
LRU cache speedup (0.999 eff.):      +500%             +1300%
LRU cache speedup (0.975 eff.):      +340%             +1000%
LRU cache speedup (0.479 eff.):      +14%              +47%
LRU cache speedup (0.000 eff.):      -47%              -14%

So, for moderate to high efficiencies, and if the dataset is compressed,
our LRU cache implementation can be more than 10x faster than OS
caching. With no efficiency at all, the presence of the LRU cache can
represent a loss of up to 15% in speed if you are using compression,
which is not too bad. The worst case is when your cache has no
efficiency at all and the data is not compressed, reaching almost a 50%
slowdown.

While a 50% slowdown is more than I expected, my guess is that most of
the datasets out there are compressed. Most importantly, if the datasets
are compressed, the efficiency of the LRU cache is unaffected (because
the data is kept uncompressed), while the efficiency of the OS cache is
better because more compressed chunks can be placed in the same amount
of memory. All in all, an LRU cache and dataset compression make a good
couple, and it is a must in a compressed database engine, IMHO.

I agree the cache cannot be unlimited; that's why I said that HDF5 or
the user should be able to set a maximum cache size.

OK, we agree on this then. In this case, we should speak about
dynamically reducing the cache size and not about increasing it. The
main point is to avoid the cache growing without limits.

Sometimes an LRU scheme can be very bad and MRU is much better. That
is the case if a cache is (just) too small and you access chunks in a
repetitive way (as in my example). LRU will thrash, while MRU keeps
as much data as possible in the cache.

Maybe. However, LRU has the virtue of always keeping the last chunk
accessed, and when doing sequential partial reads this is critical for
performance. Perhaps an MRU + last-accessed-chunk scheme would be a
better one. In addition, it seems to me that the latter would complement
the OS filesystem cache best (I don't know for sure, but I think it
follows a kind of LRU scheme).
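
Just to make the policy discussion concrete, here is a minimal sketch of
a fixed-size LRU chunk cache (illustrative only - not the PyTables Pro
or HDF5 code - and it glosses over hashing, locking, error checking and
the case of chunks with different uncompressed sizes):

  #include <stdlib.h>

  #define NSLOTS 16                  /* fixed number of cache slots */

  typedef struct {
      long  chunk_idx;               /* linear index of the cached chunk  */
      long  last_used;               /* logical clock of the last access  */
      void *data;                    /* uncompressed chunk bytes, or NULL */
  } slot_t;

  static slot_t cache[NSLOTS];       /* zero-initialised: all slots empty */
  static long   now;

  /* Return the chunk's data, loading it (and evicting the least recently
   * used slot) on a miss; load() stands in for the real chunk read. */
  static void *cache_get(long chunk_idx, size_t chunk_bytes,
                         void (*load)(long idx, void *dst))
  {
      int victim = 0;

      for (int i = 0; i < NSLOTS; i++) {
          if (cache[i].data != NULL && cache[i].chunk_idx == chunk_idx) {
              cache[i].last_used = ++now;            /* hit */
              return cache[i].data;
          }
          if (cache[i].last_used < cache[victim].last_used)
              victim = i;                            /* oldest (or empty) slot */
      }

      if (cache[victim].data == NULL)                /* miss: reuse the slot */
          cache[victim].data = malloc(chunk_bytes);
      load(chunk_idx, cache[victim].data);
      cache[victim].chunk_idx = chunk_idx;
      cache[victim].last_used = ++now;
      return cache[victim].data;
  }

An MRU policy, or the MRU + last-accessed-chunk idea above, would only
change how the victim slot is chosen.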

All in all, you brought up many interesting ideas to think about :-)

Cheers,

···

--

0,0< Francesc Altet http://www.carabos.com/

V V Cárabos Coop. V. Enjoy Data
"-"


Hi Francesc,

On Monday 24 March 2008, Quincey Koziol wrote:

I've tried to change and/or deactivate the chunk cache in HDF5 with
the following calls:

dcpl = H5Fget_access_plist(file_id);
H5Pset_cache(dcpl, 0, 128, 1024*1024, 0.0);          /* 1 MB cache */
/* H5Pset_cache(dcpl, 0, 512, 4*1024*1024, 0.0); */  /* 4 MB cache */
/* H5Pset_cache(dcpl, 0, 0, 0, 0.0); */              /* deactivate cache */
H5Pclose(dcpl);

after file opening, with no visible effect at all (while our LRU
implementation was able to accelerate access up to 10x faster).
What am I doing wrong?

  The H5Pset_cache() routine works on a file access property list (not
a dataset creation property list, as you seem to be indicating
through your variable naming). It's also not dynamically adjustable
while the file is open, so you'll need to close all the IDs for the
file and reopen it with different parameters for this sort of
experiment.

Yes. My problem was that I was trying to change the cache size after
the file opening. I'm now doing this during opening time, and it seems
to work well.

After some experiments, I've noticed some things that I'd like to bring
to your attention:

1. The chunk cache in HDF5 seems to keep the chunks compressed. Why is
this? One of the nice things about a cache in HDF5 itself should be to
keep chunks uncompressed so that retrieval is much faster (around 3x
when using zlib). Besides, the OS already caches compressed chunks, so
having this info replicated, and not obtaining a measurable speed-up,
seems a waste of space.

  No, the HDF5 library caches uncompressed chunks.

2. When retrieving entire chunks from the cache, the HDF5 chunk cache
does not seem very efficient and, in fact, it is slower than if the
cache is not active, even for access patterns with high spatial
locality. My guess is that the chunk cache in HDF5 is oriented more
toward accelerating the reading of small data buckets than complete
chunks. I don't know, but perhaps adding a parameter to H5Pset_cache to
indicate the typical size of the data to be retrieved could be useful
for optimizing the size of the cache and its internal structures.

  If the chunk cache is not large enough to hold at least one chunk, this sometimes happens. It's one of the effects that I'm going to try to mitigate with my forthcoming chunk caching improvements.

3. When retrieving large data from caches (one or several chunks), much
time is spent copying the data to user buffers, but in many cases the
data doesn't need a fresh data container (for example, when it will
only be used for doing calculations, which is a very common situation).
So, maybe adding the concept of read-only data when delivering it to
the user could be useful in order to accelerate access to data in
caches. I've tested this with our implementation, and the access can be
up to 40% faster than if you have to provide a copy.

  This is an interesting idea, but it would require the library to have a supply of "read only" buffers to loan to the application, which would then be responsible for checking them back in with the HDF5 library. I'm not certain most users would like this model...

  Quincey

···

On Mar 25, 2008, at 2:09 PM, Francesc Altet wrote:


Doesn't the nature of chunking force you to make a copy? I.e. the user
will expect an array (section) in natural order, while the data are
chunked in the HDF5 dataset, so you have to copy the various bits and
pieces to rearrange the data.

BTW, compression is not an option for us. The data are floating point
and noisy, but may contain signal. Each bit is important.

Cheers,
Ger


Hi Quincey,

On Tuesday 25 March 2008, you wrote:

> After some experiments, I've noticed some things that I'd like to
> bring
> to your attention:
>
> 1. The chunk cache in HDF5 seems to keep the chunks compressed.
> Why is
> this? One of the nice things about a cache in HDF5 itself should
> be to
> keep chunks uncompressed so that the retrieval is much faster
> (around to 3x when using zlib). Besides, the OS already caches
> compressed chunks; so having this info replicated, and not
> obtaining a measurable speed-up, seems a waste of space.

  No, the HDF5 library caches uncompressed chunks.

There is something that I don't understand in my experiments, then.
I've repeated them today, and I can confirm them. My setup is reading
entire chunks (8184 bytes) of a chunked table (with 100,000 24-byte
records, for a total of 2.4 MB or 341 chunks) out of an HDF5 file.

Here are my results:

Using no compression
~~~~~~~~~~~~~~~~~~~~
- Disabling the internal caches (just using OS filesystem cache):
  21.0 Kchunks/sec

- Using an LRU cache for chunks:
  Efficiency   HDF5 1.8.0   PyTables Pro   PyTables Pro (read-only)
  0.000        18.0         16.0           16.0
  0.479        18.5         23.5           28.0
  0.975        20.5         57.0           93.0
  0.999        21.0         82.0           117.

Using compression (zlib level 1 + shuffle)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Disabling the internal caches (just using OS filesystem cache):
  7.60 Kchunks/sec

- Using an LRU cache for chunks:
  Efficiency   HDF5 1.8.0   PyTables Pro   PyTables Pro (read-only)
  0.000        7.60         6.70           6.90
  0.479        7.40         11.3           12.2
  0.975        6.90         50.3           77.0
  0.999        7.00         79.0           112.

All the figures in the above tables are in Kchunks per second. PyTables
Pro means the LRU cache used in it (I've disabled the HDF5 cache in this
case). The read-only qualifier means that the data delivered is meant
for read-only purposes (i.e. it is not memcpy-ed into user buffers). The
sizes of the caches in both HDF5 and PyTables have been set to 1 MB and
128 slots (except for the 0.000 efficiency entry, where the number of
slots has been reduced to 12), and I've chosen the chunks to be selected
following a normal distribution with different sigmas in order to
achieve different efficiencies.

As you can see, the performance of the chunk cache in HDF5 (1.8.0, but
1.6.7 seems to behave similarly) when using compression is much worse
than when not using it (contrary to the PyTables Pro cache, which is
only affected to a small extent, as expected). That's why I thought that
the HDF5 chunk cache was keeping the data compressed :-/

> 2. When retrieving entire chunks from the cache, the HDF5 chunk
> cache does not seem very efficient and, in fact, it is slower than
> if cache is not active, even for access patterns with high spatial
> locality. My
> guess is that the chunk cache in HDF5 is more oriented for
> accelerating
> the reading of small data buckets, instead on complete sets of
> chunks. I don't know, but perhaps adding a parameter to the
> H5Pset_cache to indicate the typical size of data to be retrieved
> could be useful for optimizing the size of cache and its internal
> structures.

  If the chunk cache is not large enough to hold at least one chunk,
this sometimes happens. It's one of the effects that I'm going to
try to mitigate with my forthcoming chunk caching improvements.

Well, what I meant is that when retrieving complete chunks (in the
benchmark above, the chunksize was around 8 KB and the cache size 1 MB,
so it has capacity for up to 128 chunks) the performance of the HDF5
chunk cache is quite poor (and effectively slower than if the cache is
disabled). You can see this effect in the tables above.

I'm rather mystified about this, but I'm quite sure that I'm
enabling/disabling the HDF5 chunk cache correctly (I can see better
speed-ups with retrievals smaller than a chunk). I was using the
following code to enable/disable the cache in HDF5:

access_plist = H5Pcreate(H5P_FILE_ACCESS);
/* H5Pset_cache(access_plist, 0, 0, 0, 0.0); */       /* disable cache */
H5Pset_cache(access_plist, 0, 128, 128*8*1024, 0.0);  /* 1 MB, 128 slots */
file_id = H5Fopen(name, H5F_ACC_RDONLY, access_plist);

So, I'm wondering if perhaps something is not working as it should in
the chunk cache when asking for relatively large data buckets (but,
still, much smaller than cache size).

> 3. When retrieving large data from caches (one or several chunks),
> many
> time is spent copying the data to user buffers, but in many cases,
> the data doesn't need a fresh data container (for example, when it
> will be used to doing calculations, which is a very common
> situation). So, maybe adding the concept of read-only data when
> delivering it to the user can be useful in order to accelerate the
> access to data in caches.
> I've tested this with our implementation, and the access can be up
> to a
> 40% faster than if you had to provide a copy.

  This is an interesting idea, but it would require the library to
have a supply of "read only" buffer to loan to the application, which
would then be responsible for checking them back in with the HDF5
library. I'm not certain most users would like this model...

Yeah. I thought this after sending the message. However, it is
fortunate that PyTables users can benefit from NumPy containers, which
support the concept of read-only buffers right out of the box. Given
the rather large improvement in speed that one can expect, I'll look at
integrating the read-only caches in the next version of PyTables Pro.

Cheers,

--

0,0< Francesc Altet http://www.carabos.com/

V V Cárabos Coop. V. Enjoy Data
"-"
