Some hidden cache - is this a bug or a feature?

Hi there,
  we (a colleague and I) stumbled upon an interesting behavior of the
HDF5 library (we are using 1.8.11 on Debian Wheezy).
We write a single 2D array of data to an HDF5 dataset with an
absolutely stupid chunking scheme (see the test() function in the
attached source file; a rough sketch follows below). As a result, the
library allocates quite a lot of memory (around 3 GByte). What
surprised us is that this memory is not freed even after closing the
file. Moreover, it does not grow when test() is called several times,
as can be seen in the output of the attached program:

./test
Startup ...
RSS - 6.880000e+02 kB
Shared Memory - 5.200000e+02 kB
Private Memory - 1.680000e+02 kB

After first write ...
RSS - 2.916884e+06 kB
Shared Memory - 2.160000e+03 kB
Private Memory - 2.914724e+06 kB

After second write ...
RSS - 2.921896e+06 kB
Shared Memory - 2.160000e+03 kB
Private Memory - 2.919736e+06 kB

Obviously this is not a resource leak in the classical sense. My
suspicion is that the memory is occupied by some persistent cache,
which leads me to my question: is there a way to free this
memory?
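
For illustration, here is a rough sketch of the scenario. All sizes,
the file and dataset names, and the chunk shape are assumptions for
readability, not the values from the attached test.c:

    #include <hdf5.h>
    #include <stdio.h>

    /* One big H5Dwrite against a dataset with absurdly small chunks
     * forces the library to manage millions of chunks at once. */
    static double buffer[4096 * 4096];  /* ~128 MB of doubles, zeroed */

    static void test(void)
    {
        hsize_t dims[2]  = {4096, 4096};
        hsize_t chunk[2] = {1, 8};      /* 64-byte chunks: deliberately bad */

        hid_t file  = H5Fcreate("test.h5", H5F_ACC_TRUNC,
                                H5P_DEFAULT, H5P_DEFAULT);
        hid_t space = H5Screate_simple(2, dims, NULL);
        hid_t dcpl  = H5Pcreate(H5P_DATASET_CREATE);
        H5Pset_chunk(dcpl, 2, chunk);

        hid_t dset = H5Dcreate2(file, "data", H5T_NATIVE_DOUBLE, space,
                                H5P_DEFAULT, dcpl, H5P_DEFAULT);
        H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL,
                 H5P_DEFAULT, buffer);

        H5Dclose(dset);
        H5Pclose(dcpl);
        H5Sclose(space);
        H5Fclose(file);  /* the ~3 GB is still resident after this */
    }

    int main(void)
    {
        puts("Startup ...");           /* the real program reads the  */
        test();                        /* memory figures from         */
        puts("After first write ..."); /* /proc/self/status           */
        test();
        puts("After second write ...");
        return 0;
    }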

regards
  Eugen

test.c (2.15 KB)


--
---------------------------------------
DI. Dr. Eugen Wintersberger
                             
FS-EC
DESY
Notkestr. 85
D-22607 Hamburg
Germany

E-Mail: eugen.wintersberger@desy.de
Telefon: +49-40-8998-1917
---------------------------------------

Hi Eugen,

It may be a feature (or a bug ;-)

In preparation for I/O, HDF5 allocates internal structures for handling chunks. The overhead for each chunk is a couple of KBs (compare that with the size of the chunk!). After I/O is done, the memory for these structures is put on a "free" list for reuse. 3 GB seems a little too much, though, and we will need to investigate.

Meanwhile, could you please try calling H5garbage_collect (http://www.hdfgroup.org/HDF5/doc/RM/RM_H5.html#Library-GarbageCollect) after H5Dwrite to see if the memory is released?
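
In the sketch above, that would amount to the following (a hypothetical
fragment; dset and buffer are the variables from the sketch, not from
the attached test.c):

    /* Ask the library to sweep its internal free lists and release
     * blocks that are not currently in use. */
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, buffer);
    H5garbage_collect();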

You may also try to write just one chunk after the "big" write. That should also release memory.
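
A single-chunk write might look like this (again only a sketch,
assuming the 1x8 chunk shape from the earlier sketch):

    /* Select one chunk-sized hyperslab in the file and write it,
     * hoping this recycles the per-chunk bookkeeping structures. */
    hsize_t start[2] = {0, 0};
    hsize_t count[2] = {1, 8};
    double  one_chunk[8] = {0};

    hid_t fspace = H5Dget_space(dset);
    H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
    hid_t mspace = H5Screate_simple(2, count, NULL);

    H5Dwrite(dset, H5T_NATIVE_DOUBLE, mspace, fspace, H5P_DEFAULT, one_chunk);

    H5Sclose(mspace);
    H5Sclose(fspace);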

But as I said, we will need to look more closely at memory consumption and see if any improvements/tuning/fixes could be done. I'll add the issue to our database.

Thank you!

Elena


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Elena Pourmal The HDF Group http://hdfgroup.org
1800 So. Oak St., Suite 203, Champaign IL 61820
217.531.6112
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Hi Elena

On Mon, 2013-12-16 at 22:56 -0600, Elena Pourmal wrote:

> In preparation for I/O, HDF5 allocates internal structures for handling chunks. The overhead for each chunk is a couple of KBs (compare that with the size of the chunk!). After I/O is done, the memory for these structures is put on a "free" list for reuse. 3 GB seems a little too much, though, and we will need to investigate.

OK, that is what I expected.

> Meanwhile, could you please try calling H5garbage_collect (http://www.hdfgroup.org/HDF5/doc/RM/RM_H5.html#Library-GarbageCollect) after H5Dwrite to see if the memory is released?

It did not have any effect.

> You may also try to write just one chunk after the "big" write. That should also release memory.

No, it does not ;-) - at least not in my case.

> But as I said, we will need to look more closely at memory consumption and see if any improvements/tuning/fixes could be done. I'll add the issue to our database.

Attached is a new version of the program I have used for testing.

regards
  Eugen

test.c (3.54 KB)


Eugen,

Thank you for checking! I really appreciate your help. The JIRA issue is HDFFV-8645.

Elena


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Elena Pourmal The HDF Group http://hdfgroup.org
1800 So. Oak St., Suite 203, Champaign IL 61820
217.531.6112
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Hi all

I did not check the code, but when writing a large dataset with very
small chunks it is not uncommon for the chunk-handling structures to
allocate more memory than the chunks themselves...
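
As a back-of-the-envelope illustration (the numbers are the assumed
values from the sketch earlier in the thread, not measurements):

    dataset: 4096 x 4096 doubles   = 128 MB of data
    chunks:  1 x 8 doubles         = 64 bytes of payload per chunk
    count:   4096 * (4096 / 8)     = 2,097,152 chunks
    at ~1.5 KB of bookkeeping per chunk:
    2,097,152 * 1.5 KB             ~ 3,145,728 kB ~ 3 GB

A couple of KBs of overhead against 64 bytes of payload dwarfs the data
itself, and the total is consistent with the ~2.9e+06 kB RSS reported
above.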

HTH

Dimitris
