Investigating Memory Growth with HDF5/NetCDF-4

I have a code that periodically writes output results data to a NetCDF-4 file (with HDF5 underneath). If I run the program using NetCDF-3 output, I get no memory growth, but if I use NetCDF-4, I get continual memory growth proportional to the number of output steps.

At each output step, I write the same amount of data. There are no memory leaks, so it looks like there is a buffer or cache that is growing over time. I do “flush” the data after each output step. I ran the program under valgrind --tool=massif, and it does show the memory growth over time. Here is a snapshot from near the beginning of the program showing the largest memory block:

98.62% (20,747,915B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->82.85% (17,431,077B) 0x7E48714: H5MM_malloc (H5MM.c:292)
| ->80.84% (17,007,799B) 0x7DB51ED: H5FL_malloc (H5FL.c:243)
| | ->68.41% (14,392,479B) 0x7DB6024: H5FL_blk_malloc (H5FL.c:921)
| | | ->59.92% (12,605,504B) 0x7CFE7AF: H5D__chunk_mem_alloc (H5Dchunk.c:1363)
| | | | ->59.90% (12,601,344B) 0x7D03AC9: H5D__chunk_lock (H5Dchunk.c:3587)
| | | | | ->59.90% (12,601,344B) 0x7D013D9: H5D__chunk_write (H5Dchunk.c:2392)
| | | | |   ->59.90% (12,601,344B) 0x7D2E640: H5D__write (H5Dio.c:817)
| | | | |     ->59.90% (12,601,344B) 0x7D2C4FE: H5Dwrite (H5Dio.c:335)
| | | | |       ->59.90% (12,601,344B) 0x777CCAC: NC4_put_vars (hdf5var.c:1739)
| | | | |         ->59.90% (12,601,344B) 0x777C047: NC4_put_vara (hdf5var.c:1325)
| | | | |           ->59.90% (12,601,344B) 0x7716AC0: NC_put_vara (dvarput.c:97)
| | | | |             ->59.90% (12,601,344B) 0x7717F42: nc_put_vara_double (dvarput.c:711)

And here is a similar dump near the end of the program showing the same traceback and much higher memory usage:

99.50% (102,556,387B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->96.28% (99,243,541B) 0x7E48714: H5MM_malloc (H5MM.c:292)
| ->95.04% (97,965,223B) 0x7DB51ED: H5FL_malloc (H5FL.c:243)
| | ->89.29% (92,029,991B) 0x7DB6024: H5FL_blk_malloc (H5FL.c:921)
| | | ->81.86% (84,380,616B) 0x7CFE7AF: H5D__chunk_mem_alloc (H5Dchunk.c:1363)
| | | | ->81.86% (84,372,296B) 0x7D03AC9: H5D__chunk_lock (H5Dchunk.c:3587)
| | | | | ->81.86% (84,372,296B) 0x7D013D9: H5D__chunk_write (H5Dchunk.c:2392)
| | | | |   ->81.86% (84,372,296B) 0x7D2E640: H5D__write (H5Dio.c:817)
| | | | |     ->81.86% (84,372,296B) 0x7D2C4FE: H5Dwrite (H5Dio.c:335)
| | | | |       ->81.86% (84,372,296B) 0x777CCAC: NC4_put_vars (hdf5var.c:1739)
| | | | |         ->81.86% (84,372,296B) 0x777C047: NC4_put_vara (hdf5var.c:1325)
| | | | |           ->81.86% (84,372,296B) 0x7716AC0: NC_put_vara (dvarput.c:97)
| | | | |             ->81.86% (84,372,296B) 0x7717F42: nc_put_vara_double (dvarput.c:711)

Each output step should write about 186,480 bytes of “bulk” data (10 calls with 10,648 bytes + 10 calls with 8,000 bytes), and there are around 1,000 steps in this run. The first trace above is from near the first few output steps and the second is from near the end of the run.

Is this an expected result due to buffering of the data? If so, is there a way to “flush” the buffer or reduce its size? It does look like the memory growth “tops out” partway through the run; here is massif’s representation of memory use over time:

    MB
98.30^                                                                      # 
     |                                   @::::::::@@::@:::::@::@:::::@:::::@#:
     |                               :::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |                           @::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |                          :@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |                       ::::@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |                      @:: :@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |                    ::@:: :@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |                  @:::@:: :@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |                 @@:::@:: :@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |                @@@:::@:: :@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |             @@@@@@:::@:: :@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |            :@@ @@@:::@:: :@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |          :@:@@ @@@:::@:: :@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |         ::@:@@ @@@:::@:: :@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |       @@::@:@@ @@@:::@:: :@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |     @:@@::@:@@ @@@:::@:: :@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |    :@:@@::@:@@ @@@:::@:: :@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     |   @:@:@@::@:@@ @@@:::@:: :@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
     | :@@:@:@@::@:@@ @@@:::@:: :@::@:::@@::::::::@ ::@:::::@::@:::::@:::::@#:
   0 +----------------------------------------------------------------------->Gi
     0                                                                   7.061

Any insights would be welcome.
Thanks,
…Greg

Further investigation does seem to point toward buffer growth as the cause. I guess the main questions are how to control the size of the buffering (if possible) and whether there is a way to “flush” periodically to make sure the data is on disk in case of an application crash. We are already calling the NetCDF nc_sync function.
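
For reference, here is a minimal sketch of the kind of write loop in question, with the per-variable HDF5 chunk cache explicitly capped via nc_set_var_chunk_cache. The file layout, names, and sizes are illustrative only, and the chunk-cache call is an assumption about what might bound the growth (the massif trace points at H5D__chunk_mem_alloc, i.e. the chunk cache), not something from the run above:

    #include <netcdf.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define CHECK(err) do { if ((err) != NC_NOERR) { \
        fprintf(stderr, "NetCDF error: %s\n", nc_strerror(err)); exit(1); } } while (0)

    int main(void)
    {
        int ncid, dimids[2], varid;
        double buf[1000];

        /* Hypothetical NetCDF-4 file: one unlimited "time" dimension. */
        CHECK(nc_create("results.nc", NC_NETCDF4 | NC_CLOBBER, &ncid));
        CHECK(nc_def_dim(ncid, "time", NC_UNLIMITED, &dimids[0]));
        CHECK(nc_def_dim(ncid, "node", 1000, &dimids[1]));
        CHECK(nc_def_var(ncid, "field", NC_DOUBLE, 2, dimids, &varid));

        /* Cap this variable's HDF5 chunk cache at ~1 MiB / 521 slots. */
        CHECK(nc_set_var_chunk_cache(ncid, varid, 1 << 20, 521, 0.75f));
        CHECK(nc_enddef(ncid));

        for (size_t step = 0; step < 1000; step++) {
            size_t start[2] = { step, 0 };
            size_t count[2] = { 1, 1000 };
            for (size_t i = 0; i < 1000; i++) buf[i] = (double)step;

            CHECK(nc_put_vara_double(ncid, varid, start, count, buf));
            CHECK(nc_sync(ncid));   /* flush after each output step */
        }

        CHECK(nc_close(ncid));
        return 0;
    }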

…Greg

Hi Greg,

I assume you are using a pretty recent version of HDF5? Is there an easy way to try closing and re-opening the file between iterations? I know that isn’t ideal, but if it improves memory behavior it would further suggest that the problem is in what HDF5 is holding in memory for the file. Do you have any way to get access to the underlying HDF5 file handle, and if so, can you adjust the metadata cache configuration? I have seen tuned metadata cache configuration parameters behave better for restart/plot-dump scenarios.

It may be that netCDF’s use of HDF5 should be updated to take advantage of some of the newer metadata cache configuration features available in HDF5, such as the “evict on close” feature and/or a better set of mdc_config parameters. This is something we should make sure is part of the dialog between THG and Unidata.
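
To make that concrete, here is a rough sketch (arbitrary values, not a tested recommendation) of capping the metadata cache on an already-open HDF5 file and enabling evict-on-close on the file access property list; the helper names are made up:

    #include <hdf5.h>

    /* Sketch: shrink and cap the metadata cache of an open file.
     * The sizes below are arbitrary examples, not tuned values. */
    void tighten_mdc(hid_t file_id)
    {
        H5AC_cache_config_t cfg;
        cfg.version = H5AC__CURR_CACHE_CONFIG_VERSION;

        /* Start from the file's current configuration, then override. */
        H5Fget_mdc_config(file_id, &cfg);

        cfg.set_initial_size = 1;               /* use an explicit initial size */
        cfg.initial_size     = 1 * 1024 * 1024;
        cfg.min_size         = 1 * 1024 * 1024;
        cfg.max_size         = 4 * 1024 * 1024;
        cfg.incr_mode        = H5C_incr__off;   /* do not let the cache grow */

        H5Fset_mdc_config(file_id, &cfg);
    }

    /* Sketch: open a file with the 1.10 "evict on close" feature enabled. */
    hid_t open_with_evict_on_close(const char *path)
    {
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_evict_on_close(fapl, 1);         /* requires HDF5 1.10.1+ */
        hid_t file_id = H5Fopen(path, H5F_ACC_RDWR, fapl);
        H5Pclose(fapl);
        return file_id;
    }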

Hello!

I entered bug HDFFV-10915 for this issue. We discussed it and have some questions:

  1. Which version of HDF5 are you using?
  2. Are you accessing the same dataset over and over, or different datasets?
  3. Can you try closing and re-opening the dataset(s) between iterations?
  4. If there is more than one file, can you close them between iterations?

If possible, can you try calling H5garbage_collect to see what happens? See:
https://portal.hdfgroup.org/display/HDF5/H5_GARBAGE_COLLECT
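
A minimal sketch of where such a call could go, in case it helps; the loop, handles, and the explicit flush are hypothetical:

    #include <hdf5.h>

    /* Hypothetical output loop: after each step's writes, ask HDF5 to
     * return unused free-list blocks to the system allocator. */
    void write_all_steps(hid_t dset_id, hid_t memtype,
                         const void *step_data[], int nsteps)
    {
        for (int step = 0; step < nsteps; step++) {
            H5Dwrite(dset_id, memtype, H5S_ALL, H5S_ALL, H5P_DEFAULT,
                     step_data[step]);
            H5Fflush(dset_id, H5F_SCOPE_GLOBAL); /* flush buffered data */
            H5garbage_collect();                 /* free unused free-list blocks */
        }
    }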

Thanks!
-Barbara

Hi Greg,
You may want to call H5garbage_collect() and see if that helps push the memory usage down.

	Quincey

Sorry for the delay in responding. Here are answers to a few of the questions; I will hopefully try some of the suggestions this week:

  1. Version – Using HDF5-1.10.5

  2. Multiple datasets are accessed at each “timestep” and the datasets are kept open. The code dumps some data to each of the datasets at each timestep and then repeats for each subsequent timestep.

  3. Closing the file between iterations: This does reduce the overall memory growth. Keeping everything open results in a peak memory use of 114 MiB; if I close and reopen the file each iteration, the peak memory use is 83 MiB, a reduction of 31 MiB. There is only a single file. Note that this increases the run time by 5.5x.

In the above, I continue to append to the end of the datasets when the file is reopened. There is another option in which I create a new file for each iteration step. So write iteration; close file; open new file; repeat… This option gives peak memory of 67 MiB.
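
For clarity, the close-and-reopen experiment is essentially the following pattern (the path, variable name, and dimensions are placeholders, and error checking is omitted):

    #include <netcdf.h>

    /* Sketch of the "close and reopen the file each iteration" variant. */
    void append_step(const char *path, size_t step, const double *buf, size_t n)
    {
        int ncid, varid;
        size_t start[2] = { step, 0 };
        size_t count[2] = { 1, n };

        nc_open(path, NC_WRITE, &ncid);        /* reopen the existing file   */
        nc_inq_varid(ncid, "field", &varid);   /* look the variable up again */
        nc_put_vara_double(ncid, varid, start, count, buf);
        nc_close(ncid);                        /* releases HDF5 caches       */
    }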

Will look at other suggestions later this week…

Had some extra time to try some of the suggestions…

  1. H5garbage_collect – Minimal effect – dropped the peak from 114 MiB to 113 MiB.

  2. Close and reopen datasets instead of keeping them all open – This had a large effect. The peak for the “smaller” model dropped from 114 MiB down to 31 MiB (an 83 MiB decrease). In this case, the execution time increased by about 1.5x. However, on a larger model, the peak dropped from 172 MiB down to 99 MiB (a 73 MiB decrease) and the execution time decreased by about 10%.

This seems to be a good option to investigate with the NetCDF-4 developers. My current hack is just to close and immediately reopen the dataset right after the H5Dwrite call, which probably isn’t the best implementation, but it does show the difference in behavior.
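
Schematically, the hack amounts to the following (handles and the dataset name are placeholders; error checking is omitted):

    #include <hdf5.h>

    /* Write, then immediately close and reopen the dataset so HDF5
     * releases the per-dataset chunk cache; returns the fresh handle. */
    hid_t write_and_cycle(hid_t file_id, hid_t dset_id, const char *dset_name,
                          hid_t memtype, hid_t memspace, hid_t filespace,
                          const void *buf)
    {
        H5Dwrite(dset_id, memtype, memspace, filespace, H5P_DEFAULT, buf);
        H5Dclose(dset_id);                                 /* drop chunk cache */
        return H5Dopen2(file_id, dset_name, H5P_DEFAULT);  /* fresh handle     */
    }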

I haven’t tried any of the metadata cache options yet.