Disk storage performance

Hi,

I have carried out compression tests using only the HDF5 library.
In these tests, the chunk size is varied in the following way (see the sketch after this list):

1) First, I varied the chunk size over even values: 10*10, 12*12, 14*14, 16*16, etc.;
2) Second, I varied the chunk size over odd values: 11*11, 13*13, 15*15, 17*17, etc.
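
For reference, here is a minimal sketch of one run of this kind of test, using the HDF5 C++ API. The dataset shape (1000 x 1000), the element type (double), the data values, and the filter/level shown below are only placeholders for illustration; the chunk edge length is the parameter I vary:

    #include "H5Cpp.h"
    #include <cstdio>
    #include <vector>

    int main()
    {
        // Placeholder dataset: shape, element type and values are illustrative.
        const hsize_t dims[2] = {1000, 1000};
        std::vector<double> data(dims[0] * dims[1], 1.0);

        const hsize_t edge = 16;                  // one run: 10, 12, 14, ... or 11, 13, 15, ...
        const hsize_t chunk[2] = {edge, edge};

        H5::H5File file("chunk_test.h5", H5F_ACC_TRUNC);
        H5::DataSpace space(2, dims);

        H5::DSetCreatPropList dcpl;
        dcpl.setChunk(2, chunk);
        dcpl.setDeflate(6);                       // placeholder compression filter/level

        H5::DataSet dset = file.createDataSet("data", H5::PredType::NATIVE_DOUBLE, space, dcpl);
        dset.write(data.data(), H5::PredType::NATIVE_DOUBLE);

        // File size on disk (the y value in the plot); alternatively close the
        // file and check its size with the operating system.
        std::printf("chunk %llu*%llu -> file size %llu bytes\n",
                    (unsigned long long)edge, (unsigned long long)edge,
                    (unsigned long long)file.getFileSize());
        return 0;
    }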

This gives two curves; please find attached the file "dataset_square_sdisk_pair_impair_1000.png".

The x axis is the chunk size in bytes.
The y axis is the file size on disk in bytes.

- The red curve represents chunks with an even edge size (10*10, 12*12, ...);
- The green curve represents chunks with an odd edge size (11*11, 13*13, etc.).

My question is: is there a reason that would explain why we observe better disk storage (smaller file sizes) with even-sized chunks than with odd-sized chunks?

Thank you,

Rolih

Hi,

  the performance is probably best if the chunk size is an integer multiple of the disk's block size, which is usually a power of two, unless you are limited by caching or by memory buffering and copying.
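
To put numbers on that (both the 8-byte element size and the 4096-byte block size below are assumptions on my side, since you did not mention the datatype or filesystem, and with compression the chunks stored on disk are smaller than these raw sizes), a small check of how the raw chunk sizes from your test relate to a block:

    #include <cstdio>

    int main()
    {
        const unsigned block = 4096;            // assumed filesystem block size
        const unsigned elem  = sizeof(double);  // assumed 8-byte element type
        for (unsigned n = 10; n <= 17; ++n) {
            const unsigned bytes = n * n * elem;   // raw (uncompressed) chunk size
            std::printf("%2u*%2u chunk = %4u bytes, remainder mod %u-byte block = %4u\n",
                        n, n, bytes, block, bytes % block);
        }
        return 0;
    }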

If you used compression during your disk tests, which compression filters did you use? The read/write performance of the various filters and their settings would be quite interesting to see. I've gotten the best performance with the LZ4 filter (which was recommended here on the mailing list some time ago); it achieves write speeds nearly as fast as writing uncompressed data. The built-in deflate (zip) filter achieves roughly twice the compression ratio, but at the cost of being significantly slower, in some cases as much as 200x slower.
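
For reference, roughly how the two filters are selected through the dataset creation property list; note that LZ4 is not built into HDF5 but is an external plugin (registered filter ID 32004, if I recall correctly), so this assumes the plugin is installed somewhere the library can find it (e.g. via HDF5_PLUGIN_PATH), and the chunk size here is just an example:

    #include "H5Cpp.h"

    // Build a dataset-creation property list with either the built-in deflate
    // filter or the external LZ4 plugin (registered filter ID 32004).
    H5::DSetCreatPropList make_dcpl(bool use_lz4)
    {
        const hsize_t chunk[2] = {16, 16};           // example chunk size
        H5::DSetCreatPropList dcpl;
        dcpl.setChunk(2, chunk);
        if (use_lz4)
            dcpl.setFilter(32004, H5Z_FLAG_OPTIONAL, 0, nullptr);  // LZ4 plugin
        else
            dcpl.setDeflate(6);                      // built-in zlib/deflate, level 6
        return dcpl;
    }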

       Werner

--
___________________________________________________________________________
Dr. Werner Benger, Visualization Research
Center for Computation & Technology at Louisiana State University (CCT/LSU)
2019 Digital Media Center, Baton Rouge, Louisiana 70803
Tel.: +1 225 578 4809, Fax: +1 225 578 5362

Thank you, Werner, for your answer.

Actually, I have only used the ZLIB (deflate) filter, with compression level 6 (setDeflate(6))...
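
If I extend the tests to other settings as you suggest, a possible next step would be to time the writes for the different deflate levels (and later other filters). A rough sketch of what I have in mind, with a placeholder dataset and chunk size:

    #include "H5Cpp.h"
    #include <chrono>
    #include <cstdio>
    #include <vector>

    int main()
    {
        const hsize_t dims[2]  = {1000, 1000};   // placeholder dataset shape
        const hsize_t chunk[2] = {16, 16};       // placeholder chunk size
        std::vector<double> data(dims[0] * dims[1], 1.0);

        for (int level = 1; level <= 9; ++level) {
            H5::H5File file("deflate_level_test.h5", H5F_ACC_TRUNC);
            H5::DataSpace space(2, dims);
            H5::DSetCreatPropList dcpl;
            dcpl.setChunk(2, chunk);
            dcpl.setDeflate(level);              // level 1 (fastest) .. 9 (smallest)

            auto t0 = std::chrono::steady_clock::now();
            H5::DataSet dset = file.createDataSet("data", H5::PredType::NATIVE_DOUBLE, space, dcpl);
            dset.write(data.data(), H5::PredType::NATIVE_DOUBLE);
            file.close();
            auto t1 = std::chrono::steady_clock::now();

            std::printf("deflate level %d: write took %.3f s\n", level,
                        std::chrono::duration<double>(t1 - t0).count());
        }
        return 0;
    }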
