Great to have some more feedback!
Did the attached spreadsheet make its way to the forum? Or was it
filtered out?
On Monday 12 April 2010 13:58:24, Stamminger, Johannes wrote:
> I did some detailed performance tests this morning (I'll try to attach the
> spreadsheet - I don't know whether this forum allows attachments).
>
> For my test data (~1,100,000 variable-length strings, totalling 488 MB in
> size) I found
>
> i) it *very* surprising to see the variance in the performance results: on
> my (otherwise idle) development machine the best and worst measured runs
> always differed by 25-40%! No explanation for this has come to my mind so
> far ...
> This means that differences below 5% may be caused by randomness alone (I
> ran each configuration 6-11 times - but given the variance this does not
> seem enough to me).
That could be a consequence of the disk cache subsystem of the OS you are
working on. If you want better reproducibility in your results, try flushing
the OS cache (sync on UNIX-like OSes) before taking time measurements. Of
course, you may be interested in measuring not your disk I/O but only the
throughput of the disk cache subsystem, but that is always tricky to do.
Maybe. Though I have never noticed such variance before (and never thought
of any explicit sync'ing) ...
Btw: I'm running a 64-bit Linux (latest Ubuntu) with a RAID0 filesystem,
but I use the 32-bit version of the HDF library.
And additionally, please note that I run the tests from a Java unit test!
>
> ii) one cannot really speak of compression: the difference from level -1/0
> to level 9 is just 1.39% in the resulting HDF file's size (~970 MB)
As far as I know, compression of variable-length types is not supported by
HDF5 yet. By forcing the use of a compression filter there, you are only
compressing the *pointers* to your variable-length values, not the values
themselves.
I had already read something like this.
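Just for reference, this is roughly the kind of setup being discussed (an
untested C sketch; file name, dataset name and chunk size are made up). The
deflate filter attached via the last argument of H5PTcreate_fl only ever sees
the small per-record descriptors, not the string bytes themselves:

/* Untested sketch: a packet table of variable-length strings with gzip
 * level 9.  The filter only compresses the small per-record descriptors;
 * the string characters themselves go to the global heap uncompressed,
 * which matches the ~1% "compression" reported above. */
#include "hdf5.h"
#include "hdf5_hl.h"

int main(void)
{
    hid_t fid  = H5Fcreate("strings.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t strt = H5Tcopy(H5T_C_S1);
    H5Tset_size(strt, H5T_VARIABLE);          /* variable-length string type */
    hid_t tbl  = H5PTcreate_fl(fid, "strings", strt, /*chunk*/ 4096, /*gzip*/ 9);

    const char *recs[2] = { "first string", "a somewhat longer second string" };
    H5PTappend(tbl, 2, recs);                 /* in memory: one char* per record */

    H5PTclose(tbl);
    H5Tclose(strt);
    H5Fclose(fid);
    return 0;
}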
> iii) "compression" level 5 seems best choice taking into account
> additionally performance (4% overhead compared to 10% using level 9).
> But regard i) reading this - maybe only a random ...
>
> iv) my strings are much shorter than yours. With mine I observe that it
> is best to write ~350 of them per block with a chunk size of 16K. The
> number of strings written per block makes the biggest difference: 1 =>
> 425s, 10 => 55s, 100 => 21s, 350 => 20s, 600 => 22s, 1000 => 23s.
>
> v) when always writing 100 strings per block, the chunk size makes a
> difference of at most 10% (tested from 128 bytes up to 64K). But with a
> chunk size of 128K, performance degraded by a factor of 10, to 682s for a
> single run.
Don't know about this one, but this dramatic loss in performance when going
from a 64 KB to a 128 KB chunk size is certainly strange. It would be nice if
you could build a small benchmark showing this performance problem and send
it to the HDF Group for further analysis.
I could extract this test with little effort. But it would be Java then,
wrapping the native shared libraries. And it is *not* hdf-java, as that does
not support H5PT - I use JNA for that purpose instead.
Still interested?
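Otherwise, a standalone C benchmark might look roughly like the sketch below
(untested; it simplifies to fixed-size 450-byte records instead of
variable-length strings, all sizes are placeholders, and it assumes that
H5PTcreate_fl counts its chunk_size in records rather than bytes):

/* Untested sketch of a standalone benchmark for the 64 KB vs 128 KB chunk
 * effect.  It uses fixed-size records (not variable-length strings) to keep
 * the code short; RECSIZE, BLOCK and NRECORDS are placeholders. */
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include "hdf5.h"
#include "hdf5_hl.h"

#define NRECORDS 1000000UL      /* placeholder: number of records to append */
#define BLOCK    100            /* records per H5PTappend call              */
#define RECSIZE  450            /* placeholder: bytes per fixed-size record */

static double now(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

static double run_once(hsize_t chunk_records)
{
    hid_t fid  = H5Fcreate("bench.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t rect = H5Tcopy(H5T_C_S1);
    H5Tset_size(rect, RECSIZE);                 /* fixed-size string record */
    hid_t tbl  = H5PTcreate_fl(fid, "data", rect, chunk_records, /*gzip*/ 5);

    static char block[BLOCK * RECSIZE];
    memset(block, 'x', sizeof block);           /* dummy payload */

    double t0 = now();
    for (unsigned long i = 0; i < NRECORDS / BLOCK; i++)
        H5PTappend(tbl, BLOCK, block);
    double elapsed = now() - t0;

    H5PTclose(tbl);
    H5Tclose(rect);
    H5Fclose(fid);
    return elapsed;
}

int main(void)
{
    /* assuming chunk_size counts records: scale 64 KB and 128 KB to records */
    hsize_t chunks[2] = { 65536 / RECSIZE, 131072 / RECSIZE };
    for (int i = 0; i < 2; i++)
        printf("chunk = %llu records: %.1f s\n",
               (unsigned long long)chunks[i], run_once(chunks[i]));
    return 0;
}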
> Next I will try to use an array type of fixed length to see some working
> compression.
IMO, this is your best bet if you are after compressing your data.
I'm still measuring - but I was surprised by the findings again. E.g. with
arrays of size 16384 it seems best to use chunk size 32, compression level 4,
and to write as many arrays as possible (maybe there is an upper limit that I
did not reach yet) with a single call to H5PTappend. With that I get the data
written in 217s to a file of size 160 MB.
The data is the same as I used for writing the strings, but now without the
conversion to hex strings: 468M bytes in total. With the overhead of the
fixed-length arrays, the total data written to the file amounts to 16.2G (the
overhead bytes are zeroed). With the latter in mind, the resulting file size
of 160M is quite respectable. But compared with writing the same data to a
zip with on-the-fly deflation it is not, as that leads to 50M in 65s (with no
performance tuning like writing data in blocks etc.) ...
With a big chunk size, both performance and file size degrade by a large
factor. The worst example was 819K of data (50 arrays of 16384 bytes each,
compression 0, chunk size 32K) leading to a file of 513M.
If my attachment made it to the list, I will provide a table again.
Maybe I should try something like using multiple packet tables in parallel,
each with a different array size ... ?
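For reference, the fixed-length array setup described above would look
roughly like this in plain C, stripped of the Java/JNA layer (an untested
sketch; file and dataset names are made up, and the 16384-byte records, chunk
size 32 and gzip level 4 simply mirror the values quoted above):

/* Untested sketch: packet table whose record type is a fixed-length byte
 * array (HDF5 1.8 API). */
#include <string.h>
#include "hdf5.h"
#include "hdf5_hl.h"

int main(void)
{
    hsize_t dims[1] = { 16384 };                /* bytes per record */
    hid_t fid  = H5Fcreate("arrays.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t arrt = H5Tarray_create2(H5T_NATIVE_UCHAR, 1, dims);
    hid_t tbl  = H5PTcreate_fl(fid, "blocks", arrt, /*chunk*/ 32, /*gzip*/ 4);

    unsigned char rec[16384];
    memset(rec, 0, sizeof rec);                 /* zeroed padding after payload */
    /* ... memcpy the real payload into the front of rec ... */
    H5PTappend(tbl, 1, rec);                    /* batch many records per call
                                                   in practice */
    H5PTclose(tbl);
    H5Tclose(arrt);
    H5Fclose(fid);
    return 0;
}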
BTW, when sending strings to HDF5 containers be sure to zero the memory
buffer area after the end of the string: this could improve the compression
ratio quite a lot.
I'm using H5PT - I do not see any method for doing such a thing there.
Which method did you have in mind?
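If I understand the suggestion correctly (my interpretation - there seems to
be no dedicated H5PT call for it), the zeroing would simply be done on the
user-side buffer before each append, roughly like this untested sketch:

/* Untested sketch: the zeroing happens on the user-side buffer before each
 * H5PTappend, so the padding after the string is all zeros and compresses
 * very well. */
#include <string.h>
#include "hdf5.h"
#include "hdf5_hl.h"

#define REC_BYTES 16384                  /* fixed record size, as above */

static void append_string(hid_t table, const char *s)
{
    static unsigned char rec[REC_BYTES]; /* `table` comes from a fixed-length
                                            setup like the one sketched above */
    size_t len = strlen(s);
    if (len > REC_BYTES)
        len = REC_BYTES;

    memset(rec, 0, REC_BYTES);           /* zero the tail after the payload */
    memcpy(rec, s, len);                 /* copy the string itself          */
    H5PTappend(table, 1, rec);
}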
Thanks for every hint!
Johannes Stamminger
On Mon, 2010-04-12 at 16:39 +0200, Francesc Alted wrote: