Slow Attributes

I am currently changing from 1.6.5 to 1.8.7 and my application is 500 times slower with 1.8.7. I have narrowed the cause down to the creation (and use) of attributes. The application can generate thousands of small attributes (references to a dataset) attached to a single object. I would be grateful for any help and advice.

Thanks

Hi Rod,

On Jun 23, 2011, at 6:09 AM, Rod Cook wrote:

I am currently changing from 1.6.5 to 1.8.7 and my application is 500 times slower with 1.8.7. I have narrowed the cause down to the creation (and use) of attributes. The application can generate thousands of small attributes (references to a dataset) attached to a single object. I would be grateful for any help and advice.

  Hmm, I'm surprised that things got slower in this area. Can you send a simple C program that works with both versions and demonstrates the slowdown?

  Quincey

Quincey

I was surprised as well, especially by how much slower it was. I tried changing to 1.8.3 when it came out but found that my application was a lot slower. I didn't have the time then to investigate further, so I went back to 1.6.5. I have now found the same problem with 1.8.7.

My application uses HDF via a library of wrapper routines. I wrote a timing program using this library that mimicked the application and found that it was about 500 times slower with 1.8.7. When I removed the code that created/used attributes, the times for 1.6.5 and 1.8.7 were similar. I have cut and pasted bits from my library into some code I can send you. 1.8.7 is about 40 times slower for this code, even though its use of datasets and attributes is much simpler than in my timing program or application. The code contains a call to the function write_attribute; if this call is removed, the times for 1.6.5 and 1.8.7 are similar.

I suspect that I've misused/misunderstood HDF in some way but got away with it for 1.6.5.

Thanks for your help

Rod

hdftiming.c (7.7 KB)

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

Rod,

Quincey asked me to look at the problem you reported. I ran your code and saw that the 1.8 library is much slower than the 1.6 library. I'll let you know once I find the reason.

Thanks.

Ray

Rod,

I discovered that the 1.8 library does some extra management work on the attribute storage in the dataset object header. This management work can be expensive. I'm talking to my project leader about a solution.

In the meantime, the 1.8 library has a new way to store a large number of attributes: dense storage with heap and B-tree indexing. I've tried it and got the same performance as the 1.6 library, or even better. The only thing you need to do is set a file access property with the function H5Pset_libver_bounds. You should set the library version to the latest:

  H5Pset_libver_bounds(fapl, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);

Then you open the file with this file access property list (fapl), and your attributes will be stored in dense storage. You can adjust the threshold at which the library switches to dense storage by setting a property through H5Pset_attr_phase_change. Have a look at the example.c that I attached to this message.

Please let us know if you can do it in this way and if the performance improves. Thanks.

Ray

example.c (1.99 KB)