Reading dataset values with 2 dimensions.

I have an HDF5-based application that reads an HDF5 file containing a dataset
with 2 dimensions. I am using HDF5 1.10.1 on Windows 10 x64.
I allocate an array of pointers to read the data values in the same way as
shown in the HDF5 documentation. Here is the code that reads the dataset
values.
In this code, the dynamic 2D array of pointers is not initialized in the
standard way, using a loop of per-row allocations.

// Array of row pointers.
unique_ptr<T*[]> apbuffer = make_unique<T*[]>(size_of_dimensions[0]);
T** buffer = apbuffer.get();
// One contiguous block holding all the values.
unique_ptr<T[]> apbuffer1 = make_unique<T[]>(size_of_dimensions[0] * size_of_dimensions[1]);
buffer[0] = apbuffer1.get();
// Point each row at its offset within the contiguous block.
for (int i = 1; i < size_of_dimensions[0]; i++) {
    buffer[i] = buffer[0] + i * size_of_dimensions[1];
}
// Read the whole dataset into the start of the contiguous block.
H5Dread(dataset_id, dataset_type_id, H5S_ALL, H5S_ALL, H5P_DEFAULT, buffer[0]);

Can we allocate the buffer as in the code below?
T** buffer= new T*[dims[0]];
for(int i = 0; i < dims[0]; ++i)
    buffer[i] = new T[dims[1]];

I would like to know what other possible ways there are to allocate the
buffer for reading two-dimensional dataset values.
Any insight is greatly appreciated.

I think for a variable-size multidimensional array in C++, it's more common to just use a one-dimensional array and do the array-indexing math yourself (or with inline utility methods). Using an array of pointers to arrays just adds an extra level of indirection, more memory lookups, and slower code.

Your second code snippet will not work, because H5Dread is going to write the values into the buffer you pass it directly, not into the separate per-row arrays pointed to by that buffer. You'll want to do this instead:
T* buffer = new T[dims[1] * dims[0]];
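
For illustration, here is a minimal sketch of that flat-buffer approach as a helper (read_2d is a hypothetical name of mine, not an HDF5 call; dataset_id, dataset_type_id, and dims are assumed to be set up as in your code, and T must match the in-memory datatype). It needs <memory> and hdf5.h:

// Read the whole 2-D dataset into one contiguous buffer. Element (row, col)
// then lives at index row * dims[1] + col, because HDF5 fills the buffer in
// C order with the last dimension varying fastest.
template <typename T>
std::unique_ptr<T[]> read_2d(hid_t dataset_id, hid_t dataset_type_id,
                             const hsize_t dims[2])
{
    auto buffer = std::make_unique<T[]>(dims[0] * dims[1]);
    H5Dread(dataset_id, dataset_type_id, H5S_ALL, H5S_ALL, H5P_DEFAULT,
            buffer.get());
    return buffer;
}

// Usage (hypothetical names):
// auto values = read_2d<double>(dataset_id, dataset_type_id, dims);
// double v = values[row * dims[1] + col];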

Rather than trying to shoehorn your array into the built-in multi-dimensional array syntax by using an array of pointers, consider building a wrapper class that owns the single-dimensional array; inline accessor methods can take care of the lookup semantics. Internally it can use smart pointers, ensure aligned data, or whatever else you need. Then you can have something like:

ArrayWrapper<double> v(128,256);
v.value(123,234) = 1234.5678;
cout << v.value(1,2) << endl;

H5Dread(dataset_id, dataset_type_id, H5S_ALL, H5S_ALL, H5P_DEFAULT, v.data());

Your compiler should be able to inline the calls to .value() so that it is equivalent to v.data()[v.xsize()*y + x]. Even if it can't, this will still be much faster than loading the address of a sub-array from memory and then reading from that address (as long as your access patterns match the memory layout of the array).
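
For what it's worth, here is one minimal sketch of such a wrapper matching the usage above (ArrayWrapper is a hypothetical class, not part of HDF5; I am assuming the indexing convention data()[xsize()*y + x], i.e. x is the fastest-varying dimension, which lines up with HDF5's C-order storage where the last dataset dimension varies fastest). It needs <cstddef> and <memory>:

template <typename T>
class ArrayWrapper {
public:
    ArrayWrapper(std::size_t xsize, std::size_t ysize)
        : xsize_(xsize), ysize_(ysize),
          data_(std::make_unique<T[]>(xsize * ysize)) {}

    // Element access: x is the fast (contiguous) dimension.
    T&       value(std::size_t x, std::size_t y)       { return data_[y * xsize_ + x]; }
    const T& value(std::size_t x, std::size_t y) const { return data_[y * xsize_ + x]; }

    // Raw contiguous buffer, suitable for passing to H5Dread / H5Dwrite.
    T*       data()       { return data_.get(); }
    const T* data() const { return data_.get(); }

    std::size_t xsize() const { return xsize_; }
    std::size_t ysize() const { return ysize_; }

private:
    std::size_t xsize_, ysize_;
    std::unique_ptr<T[]> data_;
};

With that layout, for a dataset whose dataspace reports extents size_of_dimensions[0] x size_of_dimensions[1], you would construct ArrayWrapper<T> v(size_of_dimensions[1], size_of_dimensions[0]) so that x indexes the fastest-varying (last) dataset dimension.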

Jarom

···

If you are interested, the HighFive HDF5 C++ bindings let you read / write any 2-dimensional dataset using boost::ublas::matrix with only a few lines of code:
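
Something along these lines, as a sketch (the file and dataset names are placeholders, and the exact calls and the H5_USE_BOOST switch follow my reading of the HighFive documentation rather than code from this thread):

#define H5_USE_BOOST            // enable HighFive's boost container support
#include <boost/numeric/ublas/matrix.hpp>
#include <highfive/H5File.hpp>
#include <highfive/H5DataSet.hpp>

int main() {
    // Placeholder file and dataset names.
    HighFive::File file("data.h5", HighFive::File::ReadOnly);
    HighFive::DataSet dataset = file.getDataSet("my_2d_dataset");

    // Read the whole 2-D dataset; HighFive resizes the matrix to the dataset extents.
    boost::numeric::ublas::matrix<double> values;
    dataset.read(values);

    return 0;
}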

PS: I am one of the authors of HighFive.

Regards,
Adrien Devresse
