SWMR - extend dataset by 1000 records and write records one by one

Hi,

My question is about SWMR. I have a one-dimensional dataset with unlimited size. In the writer, I receive the records one by one (not in blocks of records).
My implementation is based on swmr_addrem_writer.c from hdf5 tests.
For each record, I call H5Dset_extent, H5Dget_space, H5Sselect_hyperslab, and H5Dwrite.
I’m trying to improve runtime performance by calling H5Dset_extent and H5Dget_space less often. I extended the dataset by 1000 records instead of by 1, and then called H5Sselect_hyperslab and H5Dwrite once per record, 1000 times. I got a ~50% runtime improvement. For example, in pseudo code:

hsize_t new_size[] = {current_size + 1000};   // extend by 1000 records
H5Dset_extent(dataset, new_size);
hid_t file_space = H5Dget_space(dataset);     // fetched once per 1000 records
for (int i = 0; i < 1000; i++)
{
  H5Sselect_hyperslab(file_space, ...);       // select the next record to write
  H5Dwrite(dataset, ..., file_space, ...);    // write one record
}
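
A slightly fuller sketch of the loop I have in mind is below (get_next_record(), record_t, and rec_type are placeholders for my real code; the periodic H5Dflush is what I understand SWMR writers are expected to do):

#include <hdf5.h>

struct record_t { double v[4]; };            // placeholder for my real record type
bool get_next_record(record_t *rec);         // placeholder data source

const hsize_t BATCH = 1000;

/* Write records one by one, but extend the dataset (and fetch a fresh file
 * dataspace) only once per BATCH records. `dataset` is an open 1-D chunked
 * dataset with unlimited max dims, `rec_type` is the matching compound type. */
hsize_t write_stream(hid_t dataset, hid_t rec_type)
{
    hsize_t nwritten = 0, allocated = 0, one = 1;
    hid_t mem_space  = H5Screate_simple(1, &one, NULL);
    hid_t file_space = H5I_INVALID_HID;

    record_t rec;
    while (get_next_record(&rec)) {
        if (nwritten == allocated) {             // grow in blocks of BATCH records
            allocated += BATCH;
            H5Dset_extent(dataset, &allocated);
            if (file_space >= 0)
                H5Sclose(file_space);
            file_space = H5Dget_space(dataset);  // refresh only after an extend
        }
        hsize_t start = nwritten, count = 1;
        H5Sselect_hyperslab(file_space, H5S_SELECT_SET, &start, NULL, &count, NULL);
        H5Dwrite(dataset, rec_type, mem_space, file_space, H5P_DEFAULT, &rec);
        nwritten++;
        // under SWMR I would also call H5Dflush(dataset) periodically here
    }
    if (file_space >= 0)
        H5Sclose(file_space);
    H5Sclose(mem_space);
    return nwritten;                             // number of records actually written
}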

The output *.h5 file is fine; it just has “zeroed” records at the end of the dataset for the unwritten records (1000 minus the number of H5Dwrite calls in the last chunk).

Is it safe to do this? Will the SWMR reader see “zeroed” records during the writing (in the middle/at the end)?
Is it safe, from the SWMR point of view, to reduce the dataset to its exact size with H5Dset_extent just before closing the dataset?
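
Concretely, the shrink-before-close step I am asking about would look roughly like this (records_written is my own counter of successful H5Dwrite calls; just a sketch):

// Trim the over-allocated dataset back to the number of records actually
// written, then close. Whether an SWMR reader can safely observe the extent
// shrinking is exactly what I am unsure about.
hsize_t final_size = records_written;
H5Dset_extent(dataset, &final_size);
H5Dflush(dataset);        // make the new, smaller extent visible to readers
H5Dclose(dataset);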

Thanks,
Maoz

May I ask what throughput you are getting with this approach?

That is, sizeof(object) * nelements per second, plus the type of computer / HDD.

I’m trying to improve writer run-time by calling HDF5 APIs less often. I got an improvement, but I don’t know if it will work when an SWMR reader is running in parallel.

Common for both cases:
Computer: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
HDD: NFS - I know SWMR is not supported on it; I'm currently developing the writer as a POC.
OS: Sles11
I/O from strace:
* write: 1055
* lseek: 48
* other: 10
The chunk size (H5Pset_chunk) is 1000 records, and the chunk cache (H5Pset_chunk_cache) is sized to hold all 1000 records of a chunk; see the sketch below.
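
The property-list setup I am describing looks roughly like this (a sketch; record_size and the nslots/w0 arguments are illustrative placeholders rather than my exact values):

// Chunked layout of 1000 records per chunk, plus a raw-data chunk cache large
// enough to hold one full chunk, so a partially written chunk can stay in memory.
size_t  record_size  = 32;                     // placeholder: size of one record
hsize_t chunk_dims[] = {1000};

hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
H5Pset_chunk(dcpl, 1, chunk_dims);             // 1000 records per chunk

hid_t dapl = H5Pcreate(H5P_DATASET_ACCESS);
H5Pset_chunk_cache(dapl, 521,                  // nslots (the library default)
                   1000 * record_size,         // cache sized for one full chunk
                   1.0);                       // w0: prefer evicting fully written chunks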

Without improvement (extend by 1):
| dataset | sizeof(object) (bytes) | number of elements | total size (bytes) |
|---------|------------------------|--------------------|--------------------|
| 1       | 32                     | 163,940            | 5,246,080          |
| 2       | 24                     | 180,324            | 4,327,776          |
| 3       | 32                     | 163,930            | 5,245,760          |
| 4       | 32                     | 163,930            | 5,245,760          |
| 5       | 24                     | 163,930            | 3,934,320          |
| 6       | 32                     | 163,930            | 5,245,760          |
|---------|------------------------|--------------------|--------------------|
| Total   |                        |                    | 29,245,456         |
file size (bytes)        : 29,306,340
user time (seconds)      : 12.58
system time (seconds)    : 0.21
total time (seconds)     : 12.79
write rate (bytes/second): 2,286,587

With improvement (extend by 1000):
| dataset | sizeof(object) (bytes) | number of elements | total size (bytes) |
|---------|------------------------|--------------------|--------------------|
| 1       | 32                     | 164,000            | 5,248,000          |
| 2       | 24                     | 181,000            | 4,344,000          |
| 3       | 32                     | 164,000            | 5,248,000          |
| 4       | 32                     | 164,000            | 5,248,000          |
| 5       | 24                     | 164,000            | 3,936,000          |
| 6       | 32                     | 164,000            | 5,248,000          |
|---------|------------------------|--------------------|--------------------|
| Total   |                        |                    | 29,272,000         |
file size (bytes)        : 29,306,340
user time (seconds)      : 7.28
system time (seconds)    : 0.18
total time (seconds)     : 7.46
write rate (bytes/second): 3,923,860

The write rate is up from 2,286,587 to 3,923,860 bytes/second.

The same test case, when dumping a binary file with C++ ofstream, gives roughly 4.5x the write rate compared to HDF5.
The total run-time that I used in the write-rate formula includes the “overhead” of generating the objects from a simulator.
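
For completeness, the ofstream baseline is essentially just the following (record_t stands in for my object type):

#include <fstream>
#include <vector>

struct record_t { double v[4]; };   // placeholder for my real object type

// Dump the same stream of records with plain std::ofstream: this is the
// ~4.5x baseline, with no chunking, no metadata updates, no SWMR bookkeeping.
void dump_raw(const std::vector<record_t>& records)
{
    std::ofstream out("baseline.bin", std::ios::binary);
    for (const record_t& rec : records)
        out.write(reinterpret_cast<const char*>(&rec), sizeof(rec));
}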

Thanks,
Maoz

Hi Maoz!

Extending by one record is currently not a viable way to go, but might it be that this bug gets fixed someday (or at least enters JIRA)?

Best wishes,
Andrey Paramonov

The reason I was asking for some metrics is to ballpark your effort. In the current version of H5CPP, the record/element-wise read/write rate is about on par with raw C I/O: on a Lenovo x230 laptop, 350 MB/sec for datasets larger than system memory, and 1.5-3.5 GB/sec for small burst I/O requests.

The comparison was against armadillo save/load: arma_binary, raw_binary.
Therefore, if this is about speed improvement, please consider evaluating the H5CPP h5::append operator performance and/or the techniques used there.
Although the h5::append filter chain is not plugged in yet, the operator is functional with good I/O properties and can take arbitrary T ::= POD types | fundamental types, as well as std::vector and major linear algebra objects up to rank 3.

Here is an example from examples/packet_table that appends a vector record by record:

...
h5::fd_t fd = h5::create("example.h5", H5F_ACC_TRUNC);

// the dataset descriptor is automatically converted to a packet table descriptor
h5::pt_t pt = h5::create<double>(fd, "stream of vectors",
        h5::max_dims{H5S_UNLIMITED, 5}, h5::chunk{1, 5});

arma::vec V(5);
for (int i = 0; i < 5; i++)
    V(i) = i;

// the actual append operation
for (int i = 0; i < 7; i++)
    h5::append(pt, V);