H5Fget_obj_count() only works with file objects, which include files, groups, datasets, and named datatypes. HDF5 currently does not offer a function that provides such information for other IDs. We have a feature request for this in our database and, if we have time, we'll try to add it in a future release. In the meantime, H5Iget_type() can be used to check a specific identifier's type.
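For an arbitrary identifier, something along these lines can be used to inspect it (sketch only; the variable name some_id is just a placeholder):

#include <cstdio>
#include "hdf5.h"

void inspect_id(hid_t some_id)
{
    if (H5Iis_valid(some_id) > 0)
    {
        H5I_type_t type = H5Iget_type(some_id); // e.g. H5I_DATASPACE, H5I_ATTR, H5I_DATATYPE, ...
        int refs = H5Iget_ref(some_id);         // current reference count on the id
        std::printf("id %lld: type %d, ref count %d\n",
                    (long long)some_id, (int)type, refs);
    }
}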
________________________________
From: Hdf-forum <hdf-forum-bounces@lists.hdfgroup.org> on behalf of Miller, Mark C. <miller86@llnl.gov>
Sent: Friday, August 14, 2015 1:51 PM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] Growing memory usage in small HDF program
Hmm. I just wanted to ask THG guys a quick follow-up question here.
I didn't follow this whole thread but was this growth due to the C++ interface failing to close or dec-ref some objects?
If so, why didn't H5Fget_obj_count help to deduce that? My understanding is that Jorj tried it, but it yielded no indication of an object handle leak. Is there a bug there?
Mark
From: Hdf-forum <hdf-forum-bounces@lists.hdfgroup.org> on behalf of Binh-Minh Ribler <bmribler@hdfgroup.org>
Reply-To: HDF Users Discussion List <hdf-forum@lists.hdfgroup.org>
Date: Friday, August 14, 2015 10:03 AM
To: HDF Users Discussion List <hdf-forum@lists.hdfgroup.org>
Subject: Re: [Hdf-forum] Growing memory usage in small HDF program
That's good. Thank you for applying the files and letting us know, George!
Binh-Minh
________________________________
From: Hdf-forum <hdf-forum-bounces@lists.hdfgroup.org> on behalf of Jorj Pimm <jorjpimm@gmail.com>
Sent: Friday, August 14, 2015 4:47 AM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] Growing memory usage in small HDF program
I've applied Binh-Minh's files to my local copy of HDF5, and this significantly reduces memory usage for my example.
It now uses under 100 MB of memory, which seems pretty reasonable; I'll continue testing against these changes.
Thanks all for the help,
- George
On Fri, 14 Aug 2015 at 09:10 Jason Newton <nevion@gmail.com> wrote:
Hmm, I did make a mistake about how p_setId works (it decrefs the old id but does not incref its new one) - and I figured setId wasn't defined, given the naming.
I've never seen shared_ptr, or OpenCL's C++ wrapper (which is almost a mirror image of this library in terms of the complexity it maps), foul up or leak references in any case involving the same fundamental operations. It's unclear to me why this library can't do the same. The amount of code dedicated to those purposes in those libraries is also much less than what's going on here...
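Something along these lines is the shape of wrapper being suggested (sketch only; the class name ScopedHid is hypothetical and not part of the HDF5 C++ API):

#include "hdf5.h"

// Shared_ptr-style owner for an hid_t: every copy bumps the id's reference count
// and every destruction drops it, so the underlying HDF5 object is closed exactly
// once, when the last owner goes away.
class ScopedHid
{
public:
    ScopedHid() : id_(-1) {}
    explicit ScopedHid(hid_t id) : id_(id) {}                        // adopt an existing id, no incref
    ScopedHid(const ScopedHid& other) : id_(other.id_) { incref(); }
    ScopedHid& operator=(const ScopedHid& other)
    {
        if (this != &other) { decref(); id_ = other.id_; incref(); }
        return *this;
    }
    ~ScopedHid() { decref(); }

    void reset(hid_t id) { decref(); id_ = id; }                     // shared_ptr-like reset()
    hid_t get() const { return id_; }

private:
    void incref() { if (id_ >= 0) H5Iinc_ref(id_); }
    void decref() { if (id_ >= 0) H5Idec_ref(id_); }

    hid_t id_;
};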
-Jason
On Fri, Aug 14, 2015 at 12:59 AM, Binh-Minh Ribler <bmribler@hdfgroup.org> wrote:
Hello Jason,
________________________________
From: Hdf-forum <hdf-forum-bounces@lists.hdfgroup.org> on behalf of Jason Newton <nevion@gmail.com>
Sent: Thursday, August 13, 2015 10:39 PM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] Growing memory usage in small HDF program
Bug found (in the C++ API, as usual).
Thank you for your efforts in tracking down the problem and your suggestions.
The C++ API *should* take care of inc/dec-ref'ing appropriately, although it does this in each object class (sometimes higher up in a class hierarchy, as with the datatypes, but otherwise in something of a leaf class) rather than through inheritance from IdComponent. That strategy, while it works, has left a few bugs I've found / encountered, both as leaks and as dec-refs of references that were never inc-ref'd. As of 1.8.15 those were all the ones I was aware of, but based on past burnings this concern is warranted all the time - this would be the third time I've noticed something that a shared_ptr-like class/wrapper around HDF5 resources (IdComponent...?) would completely eliminate.
dataset.getSpace() leaks a reference:
// create dataspace object using the existing id then return the object
DataSpace data_space;                          // <-- default constructor makes a valid HDF5 dataspace for H5S_SCALAR
f_DataSpace_setId(&data_space, dataspace_id);  // <-- evil line; why didn't we just use the ctor that takes the id parameter?
return(data_space);
In 1.8.14, this block of code looked like this, before it was changed to use the friend function in 1.8.15:
//create dataspace object using the existing id then return the object
DataSpace data_space(dataspace_id);
return(data_space);
As you can see in the comments you included below, the friend function was a work-around for a problem reported by some other users. In that problem, the id was prematurely closed due to the behind-the-scenes copy constructor/destructor invoked when an object was returned from a function. To fix that problem, the copy constructor and the constructor that takes an existing id need to increment the ref counter in every class that is associated with an HDF5 id.
However, incrementing the ref count left some objects open at the end of the program, perhaps due to some compilers' optimizations when returning an object to the caller. In these situations, the destructor for the temporary object didn't seem to be invoked, so the temporary object's id ref was never released. I could never figure out why. Hence, the work-around was to use p_setId instead, which required the friend function. If anyone has a different suggestion, please let us know.
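To make that scenario concrete, here is a stripped-down, hypothetical illustration (the class Space is made up; it is not the actual H5::DataSpace source):

#include "hdf5.h"

class Space
{
public:
    explicit Space(hid_t id) : id_(id) {}              // takes over an existing id
    Space(const Space& other) : id_(other.id_)
    {
        H5Iinc_ref(id_);                               // without this incref, the copy and the
    }                                                  // original would both try to close the same id
    ~Space() { if (id_ >= 0) H5Idec_ref(id_); }
    hid_t id() const { return id_; }
private:
    hid_t id_;
};

Space getSpace(hid_t dset_id)
{
    Space tmp(H5Dget_space(dset_id));
    return tmp;   // returning by value may copy-construct a temporary; if the copy
                  // constructor did not increment the ref count, the destructor of tmp
                  // would close the id and the caller would receive an already-closed
                  // handle - the HDFFV-7947 symptom described above.
}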
//--------------------------------------------------------------------------
// Function:    f_DataSpace_setId - friend
// Purpose:     This function is a friend of class H5::DataSpace so that it
//              can set DataSpace::id in order to work around a problem
//              described in the JIRA issue HDFFV-7947.
//              Applications shouldn't need to use it.
// param        dspace - IN/OUT: DataSpace object to be changed
// param        new_id - IN: New id to set
// Programmer   Binh-Minh Ribler - 2015
//--------------------------------------------------------------------------
void f_DataSpace_setId(DataSpace* dspace, hid_t new_id)  // <-- evil function that shouldn't exist (as a friend, no less!)
{
    dspace->id = new_id;  // <-- why not dspace->p_setId(new_id)? Just make it public already as "reset" and get rid
                          //     of the friend. Follow shared_ptr semantics and bring all this stuff inside IdComponent.
    ...
}
The difference between the public setId and the private p_setId is that setId also increments the ref count and is intended for applications to use on a C++ object's id. The private p_setId doesn't increment the id's ref count and is not intended for application use. The difference is explained in the functions' headers.
Thank you,
Binh-Minh
-Jason
On Thu, Aug 13, 2015 at 9:37 AM, Miller, Mark C. <miller86@llnl.gov> wrote:
Hmm. Well I have no experience with HDF5's C++ interface.
My first thought when reading your description was. . . I've seen that before. It happens when I've forgotten to H5Xclose() all the objects I H5Xopened (groups, datasets, types, dataspaces, etc.).
However, with C++, I presume the interface is designed to close objects when they fall out of scope (i.e. the destructor is called). So, looking at your code, even though I don't see any explicit calls to close the previously opened objects, I assume that *should* be happening when the objects fall out of scope. But are you *certain* that *is* happening? Just before exiting main, you might want to call H5Fget_obj_count() to get some idea of how many objects the HDF5 library thinks are still open in the file. If you get a large number, that would suggest the problem is that the C++ interface somehow isn't closing objects as they fall out of scope.
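For example, something like this just before main() returns gives a rough picture (sketch only; "file" stands for whatever H5::H5File object the program holds):

#include <cstdio>
#include "H5Cpp.h"

void report_open_objects(const H5::H5File& file)
{
    // Ask the library how many handles it still considers open in this file,
    // broken down by object type (the file handle itself counts toward the total).
    hid_t fid = file.getId();
    std::printf("total:    %lld\n", (long long)H5Fget_obj_count(fid, H5F_OBJ_ALL));
    std::printf("datasets: %lld\n", (long long)H5Fget_obj_count(fid, H5F_OBJ_DATASET));
    std::printf("groups:   %lld\n", (long long)H5Fget_obj_count(fid, H5F_OBJ_GROUP));
    std::printf("types:    %lld\n", (long long)H5Fget_obj_count(fid, H5F_OBJ_DATATYPE));
    std::printf("attrs:    %lld\n", (long long)H5Fget_obj_count(fid, H5F_OBJ_ATTR));
}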
That's all I can think of. Sorry if it's no help.
Mark
From: Hdf-forum <hdf-forum-bounces@lists.hdfgroup.org> on behalf of Jorj Pimm <jorjpimm@gmail.com>
Reply-To: HDF Users Discussion List <hdf-forum@lists.hdfgroup.org>
Date: Thursday, August 13, 2015 9:21 AM
To: "hdf-forum@lists.hdfgroup.org" <hdf-forum@lists.hdfgroup.org>
Subject: [Hdf-forum] Growing memory usage in small HDF program
Hello,
I am writing an application which writes large data sets to HDF5 files in fixed-size blocks, using the HDF5 C++ API (version 1.8.15, patch 1, built with MSVC 2013 x64).
My application seems to quickly consume all the available memory on my system (Win32 - around 5.9 GB) and then crashes whenever the system becomes stressed (Windows kills it because it has run out of memory).
I have also tested the application on a linux machine, where I saw similar results.
I was under the impression that, by using HDF5, the file would be brought in and out of memory in such a way that the library would only use a small working set - is this not true?
I have experimented with HDF5 features such as flushing to disk, regularly closing and re-opening the file, garbage collection, and tuning chunking and caching settings, and haven't managed to get a stable working set.
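For reference, the knobs referred to above look roughly like this (C API calls; the handle names and all values are illustrative only, not recommendations):

#include "hdf5.h"

void tune_and_flush(hid_t fapl, hid_t dapl, hid_t file_id)
{
    // Per-dataset raw-data chunk cache: number of hash slots, total bytes, eviction weight.
    H5Pset_chunk_cache(dapl, 521, 1024 * 1024, 0.75);

    // File-wide default chunk cache, set on the file access property list
    // (the first numeric argument is unused in 1.8 and kept for compatibility).
    H5Pset_cache(fapl, 0, 521, 1024 * 1024, 0.75);

    // Flush buffered data to disk and ask the library to free unused internal memory.
    H5Fflush(file_id, H5F_SCOPE_GLOBAL);
    H5garbage_collect();
}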
I've attached a minimal example; can anyone point out my mistake?
Thanks,
- Jorj