Increases the use of the system RAM until it reaches RAM saturation


#1

I need to continuously acquire 512 analog signals and save them to disk for an indefinite time.
The application is built with MSVC15 and LabWindows/CVI 2019. Because of how it connects to the data acquisition system, it is built as a 32-bit application.
I downloaded hdf5-1.12.0 and compiled it with MSVC15 + CMake as indicated.
The application creates an H5 file with 192 two-dimensional datasets (time, value), one dataset per channel.
The created file is correct and displays properly in HDFView-3.1.0.

The application generates approximately 500 GB of data per day.
The problem is that its RAM usage grows steadily until memory is exhausted after 2-3 days of operation.
Once the RAM is full, the application fails because it can no longer allocate memory.

The PC runs 64-bit Windows 10 with 8 GB of RAM and 16 GB of virtual memory.
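
For a sense of scale, the quoted volume can be turned into an average write rate. This is a minimal sketch, assuming decimal gigabytes, 512 channels, and two doubles (time, value) per row; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Average bytes per second for a daily volume given in decimal GB. */
double bytes_per_second(double gb_per_day)
{
    return gb_per_day * 1e9 / 86400.0;
}

/* Average rows per second per channel, assuming each row is `cols` doubles. */
double rows_per_second_per_channel(double gb_per_day, int channels, int cols)
{
    assert(channels > 0 && cols > 0);
    return bytes_per_second(gb_per_day) / channels / (cols * sizeof(double));
}

/* Print the estimate for the figures quoted in the post (assumed inputs). */
void print_rate_estimate(void)
{
    printf("%.2f MB/s total, %.0f rows/s per channel\n",
           bytes_per_second(500.0) / 1e6,
           rows_per_second_per_channel(500.0, 512, 2));
}
```

Under those assumptions the average is a little under 6 MB/s in total, spread over a very large number of small writes, so even a tiny amount of memory retained per write call would add up over several days.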

File creation --------------------------------------------------------------------------------------------------------------------------------------

if ( (LowLevelAcq->file_id = H5Fcreate( "file_name.h5" , H5F_ACC_TRUNC , H5P_DEFAULT , H5P_DEFAULT )) < 0 ) H5_ERROR_GOTO
if ( (LowLevelAcq->datatype = H5Tcopy( H5T_NATIVE_DOUBLE )) < 0 ) H5_ERROR_GOTO
if ( H5Tset_order( LowLevelAcq->datatype, H5T_ORDER_LE ) < 0 ) H5_ERROR_GOTO

if ( (pnt_ch->gid_a = H5Gcreate2( LowLevelAcq->file_id , "Generic" , H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT)) < 0 ) H5_ERROR_GOTO
if ( (pnt_ch->gid_b = H5Gcreate2( pnt_ch->gid_a , "channel_name(1…192)" , H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT)) < 0 ) H5_ERROR_GOTO

if ( (pnt_ch->gid_c = H5Gcreate2( pnt_ch->gid_b , "RAW" , H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT)) < 0 ) H5_ERROR_GOTO
if ( (pnt_ch->gid_d = H5Gcreate2( pnt_ch->gid_b , "PreProcess" , H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT)) < 0 ) H5_ERROR_GOTO

Dataset creation --------------------------------------------------------------------------------------------------------------------------------------

if ( (pnt_ch->plist = H5Pcreate ( H5P_DATASET_CREATE )) < 0 ) H5_ERROR_GOTO
if ( H5Pset_chunk ( pnt_ch->plist, RANK, chunk_dims ) < 0 ) H5_ERROR_GOTO
if ( (pnt_ch->dataspace_id = H5Screate_simple ( RANK, dims, maxdims )) < 0 ) H5_ERROR_GOTO

if ( (pnt_ch->dataset_id = H5Dcreate2( pnt_ch->gid_c , "date_time" , LowLevelAcq->datatype, pnt_ch->dataspace_id, H5P_DEFAULT , pnt_ch->plist , H5P_DEFAULT )) < 0 ) H5_ERROR_GOTO

dapl = H5Dget_access_plist(pnt_ch->dataset_id);
mdc_nelmts = 0;
status = H5Pget_cache( dapl, &mdc_nelmts, &rdcc_nslots, &rdcc_nbytes, &rdcc_w0 );
status = H5Pget_chunk_cache( dapl, &rdcc_nslots, &rdcc_nbytes, &rdcc_w0 );
rdcc_nslots = 0;
rdcc_nbytes = 0;
//rdcc_w0 = 0
status = H5Pset_chunk_cache( dapl, rdcc_nslots, rdcc_nbytes, rdcc_w0 );

Write loop --------------------------------------------------------------------------------------------------------------------------------------

Write section, repeated for each channel. Executed approximately every 1 ms.

if ( pnt_ch->dataset_id )
{
    offset[0] = pnt_ch->size; // row
    offset[1] = 0; //colum; //1; //0; // col

    count [0] = itms;
    count [1] = colum; //1;

    //stride [0] = 1;
    //stride [1] = 1;

    new_size[0] = pnt_ch->size + itms;
    new_size[1] = RANK;

    if ( H5Dset_extent( pnt_ch->dataset_id, new_size ) < 0 ) H5_ERROR_GOTO
    if ( (pnt_ch->filespace_id = H5Dget_space( pnt_ch->dataset_id )) < 0 ) H5_ERROR_GOTO
    if ( H5Sselect_hyperslab( pnt_ch->filespace_id, H5S_SELECT_SET, offset, NULL, count, NULL ) < 0 ) H5_ERROR_GOTO
    if ( H5Sset_extent_simple( pnt_ch->dataspace_id, RANK, count, count ) < 0 ) H5_ERROR_GOTO
    if ( H5Dwrite( pnt_ch->dataset_id, LowLevelAcq->datatype, pnt_ch->dataspace_id, pnt_ch->filespace_id, H5P_DEFAULT, Waveform ) < 0 ) H5_ERROR_GOTO

    pnt_ch->size += itms;
}

Thank you.


#2

The example is garbled, but it offers a few clues. Where are you closing the pnt_ch->filespace_id handle that you acquire in H5Dget_space? If you keep re-assigning it without closing the previous one, that's a memory leak right there.

Generally, the most common source of the kind of leakage you are seeing is a lack of handle discipline. "You acquire a handle, you own it (and the resources that go with it)." (Colin Powell) Make sure that there is a matching H5*close for each hid_t handle you acquire.

Use H5Fget_obj_count (https://portal.hdfgroup.org/display/HDF5/H5F_GET_OBJ_COUNT) with H5F_OBJ_ALL for the types argument to check for open handles. Check that before shutting down your application! If the count returned is greater than one (you have at least one open file handle), that typically signals trouble.

G.


#3

Thanks Gheber

"Where are you closing the pnt_ch->filespace_id handle that you acquire in H5Dget_space?"
--> The last lines were missing from the part I copied.

I close the pnt_ch->filespace_id handle after writing to the dataset.


#define H5_ERROR_GOTO { H5Eprint( H5E_DEFAULT, LowLevelAcq->fl_err ); status = 0 - (__LINE__); goto Error; }

#define RANK 2

int WriteHDF5( void *h5_acq_hndl, int32_t chnnl, int32_t itms, double Waveform[], int colum, int time_stamp )
{
    int status=-1;
    hsize_t offset[RANK];
    hsize_t count [RANK];
    hsize_t new_size[RANK];
    //hid_t filespace=NULL;
    time_t tt;

    tagLowLevelAcq *LowLevelAcq = h5_acq_hndl;
    tagLowLevelAcqOneCh *pnt_ch = &LowLevelAcq->sLowLevelAcqOneCh[chnnl];

    if ( pnt_ch->dataset_id )
    {
        offset[0] = pnt_ch->size; // row
        offset[1] = 0; //colum; //1; //0; // col

        count [0] = itms;
        count [1] = colum; //1;

        //stride [0] = 1;
        //stride [1] = 1;

        new_size[0] = pnt_ch->size + itms;
        new_size[1] = RANK;

        if ( H5Dset_extent( pnt_ch->dataset_id, new_size ) < 0 ) H5_ERROR_GOTO
        if ( (pnt_ch->filespace_id = H5Dget_space( pnt_ch->dataset_id )) < 0 ) H5_ERROR_GOTO
        if ( H5Sselect_hyperslab( pnt_ch->filespace_id, H5S_SELECT_SET, offset, NULL, count, NULL ) < 0 ) H5_ERROR_GOTO
        if ( H5Sset_extent_simple( pnt_ch->dataspace_id, RANK, count, count ) < 0 ) H5_ERROR_GOTO
        if ( H5Dwrite( pnt_ch->dataset_id, LowLevelAcq->datatype, pnt_ch->dataspace_id, pnt_ch->filespace_id, H5P_DEFAULT, Waveform ) < 0 ) H5_ERROR_GOTO

        pnt_ch->size += itms;
        hdf5_bytes_wri += itms;
        status = 0;
    }

Error:
    if ( pnt_ch->filespace_id ) { H5Sclose( pnt_ch->filespace_id ); pnt_ch->filespace_id = 0; }
    return status;
}

// Acquisition loop (simplified) ---------------------------------------------------------------------------------------------

while( !done )
{
    for( i = 0; i < 512; i++ ) // Process all channels
    {
        // read analog channel
        w = AcqurieAnalogStream( i, dBuffer );

        // Store
        WriteHDF5( acqData[0].h5_acq_hndl, w, 1, dBuffer, 2, 1 ); // --> Continuous increase of the memory used
    }
}

The system has to acquire for several days, but after 2-3 days it saturates the RAM (physical and virtual).

Thank you


#4

Can you send the example as an attachment? (or check the preview) It’s just too difficult to parse. G.


#5

https://1drv.ms/u/s!AgXBOutj0lx0zg3CQh-IFOi_waeA?e=StWWb7

Link to the code.
Thank you.


#6

Can you insert a snippet like this near the end of WriteHDF5 and report back if the count remains constant?

hid_t file;
/* Note: calls placed inside assert() are skipped entirely if NDEBUG is defined. */
assert((file = H5Iget_file_id(pnt_ch->dataset_id)) >= 0);
printf("Open handle count: %ld\n", (long)H5Fget_obj_count(file, H5F_OBJ_ALL));
assert(H5Fclose(file) >= 0);

G.


#7

H5Dwrite (…);
file = H5Iget_file_id( pnt_ch->dataset_id );
pnt_ch->objCount = H5Fget_obj_count( file, H5F_OBJ_ALL );
H5Fclose( file ); file = 0;

The objCount value stays constant at 769, even after several minutes of writing, by which point the generated file is larger than 1 GB.

Thank you


#8

A colleague of mine suggested taking a look at H5set_free_list_limits (https://portal.hdfgroup.org/display/HDF5/H5_SET_FREE_LIST_LIMITS) and maybe trim that down. (See also H5garbage_collect.)

Did you run your application with Valgrind for additional clues? Unfortunately, H5Fget_obj_count doesn't cover all flavors of HDF5 handles (dataspaces and property lists, for example, are not counted), and there are probably other non-HDF5 handles in your application.

G.


#9

I see that you mention H5garbage_collect. I have had symptoms similar to the OP's and reported them, but I never got an answer about how the garbage collection works. Could you elaborate on that, please? For example: when do we have to call it, and what is collected?

In the meantime, I’ll try to monitor the free list sizes with H5get_free_list_sizes().

Thanks,
Samuel


#10

The library manages free-lists for three types of memory uses (regular, array, block). You can control the limits via H5set_free_list_limits if the defaults seem low or excessive. These free-lists and defaults are library-global. Most (all?) allocations first check the corresponding free-list, and that’s why it is usually unnecessary to call H5garbage_collect manually.

Another knob to consider is the metadata cache configuration. I believe that, by default, it will grow incrementally to 32 MB starting at 2 MB.

At the end of the day, this is just doctoring around the edges. We need to establish if and what is leaking memory. Have you looked at config/sanitizer/README.md and built/run a sanitized library version?

G.