There is a third and maybe a fourth way to handle this…
3. Do the dynamic multi-dim array as you normally would, but when you turn around and write the beast to HDF5, unravel it into a temporary contiguous buffer just before H5Dwrite. Do the opposite just after H5Dread. That involves a data copy, but it can work just fine if the arrays are small. It's just a bit more work to write and read. This is similar to the previous respondent's suggestion to "do the indexing yourself", except you don't change anything in *your* client code except the places where you interface to HDF5. (There is a sketch of this below, after item 4.)
4. You may be able to do something more elegant using either HDF5 datatypes and custom type conversion routines, or HDF5 filters. My first thought is a "filter", but it would be a bit of a kluge too. You define a custom filter (see https://www.hdfgroup.org/HDF5/doc/RM/RM_H5Z.html#Compression-Register) and you *ensure* that the chunk size you specify for the filter is large enough to at least cover the top-level array of pointers in your arrays. That might be a somewhat large chunk size, but so what. Then, *assuming* HDF5 always sends chunks to the filter moving through memory starting with the pointer it was handed in the H5Dwrite call, upon the first entry to your filter you would "see" the top-level set of pointers. You would have to cache those away for safe keeping inside the filter somehow. Then, with each successive chunk request that comes through the filter, you would use the cached pointer structure to go find the actual chunk being processed in memory, and then turn around and pass that chunk at the output of the filter. This is kinda sorta like a "streaming copy". You never have more than a single chunk's worth of your array copied at any moment, so it's better than #3 (which is a full copy of the array), but it's also a bit klugey. And I haven't given any thought to how you would do the read back either; I'm just assuming it's possible. (A bare registration skeleton follows below.)

If you go the datatype route, then you would define a custom datatype (probably for each instance of such an object) and then also register your own data conversion routine (see https://www.hdfgroup.org/HDF5/doc/RM/RM_H5T.html#Datatype-Register) for it. It would work somewhat similarly, I think, and might even be a better way to go than a filter. However, I've never worked with that aspect of HDF5.
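For #3, here is a minimal sketch of the pack/unpack idea, written against the C++ API from the original post below (the `packed` buffer and the NATIVE_DOUBLE memory type are my choices, not anything mandated by HDF5):

    #include <vector>
    // ... given the dims[] and the double** data from the program below ...
    std::vector<double> packed(dims[0] * dims[1]);   // temporary contiguous buffer
    for ( size_t i = 0; i < dims[0]; ++i )
        for ( size_t j = 0; j < dims[1]; ++j )
            packed[i * dims[1] + j] = data[i][j];    // unravel just before writing
    dataset.write(packed.data(), PredType::NATIVE_DOUBLE);
    // For reading, do the reverse: dataset.read(packed.data(), ...),
    // then copy packed[i * dims[1] + j] back into data[i][j].

For #4, the bare registration skeleton for a custom filter looks roughly like this (the filter id 306 is a placeholder from the 256-511 range that is open for testing, and the pointer-caching logic is left as a comment since, as I said, I haven't verified the chunk-ordering assumption it depends on):

    #include "hdf5.h"

    /* The filter callback; the cached-pointer bookkeeping would live here. */
    static size_t my_filter(unsigned int flags, size_t cd_nelmts,
                            const unsigned int cd_values[], size_t nbytes,
                            size_t *buf_size, void **buf)
    {
        /* ... on first entry, cache the top-level pointers seen in *buf;
           on later entries, swap in the real chunk data ... */
        return nbytes;  /* pass-through for the sketch */
    }

    const H5Z_class2_t MY_FILTER[1] = {{
        H5Z_CLASS_T_VERS,     /* H5Z_class_t version */
        (H5Z_filter_t)306,    /* filter id (placeholder) */
        1, 1,                 /* encoder present, decoder present */
        "pointer-chasing filter",
        NULL,                 /* can_apply callback */
        NULL,                 /* set_local callback */
        my_filter             /* the actual filter function */
    }};
    /* then: H5Zregister(MY_FILTER); and enable it on the dataset's
       creation property list with H5Pset_filter(dcpl, 306, ...). */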
Hope that helps.
Mark
From: Hdf-forum <hdf-forum-bounces@lists.hdfgroup.org> on behalf of huebbe <nathanael.huebbe@informatik.uni-hamburg.de>
Reply-To: HDF Users Discussion List <hdf-forum@lists.hdfgroup.org>
Date: Monday, May 9, 2016 6:13 AM
To: HDF Users Discussion List <hdf-forum@lists.hdfgroup.org>
Subject: Re: [Hdf-forum] Dynamically allocated multidimensional arrays C++
Of course, you get garbage output: You are storing the array of pointers instead of the data,
along with whatever garbage happens to be after those pointers in memory.
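Concretely: with dims = {4,6}, H5Dwrite copies 4*6*sizeof(double) = 192 bytes starting at the address of the pointer array, but that array itself occupies only 4*sizeof(double*) = 32 bytes (on a typical 64-bit machine). So the first four "values" written are your row pointers reinterpreted as doubles, and the remaining 160 bytes are whatever happens to sit after them on the heap.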
Trouble is, C++ simply can't do true multidimensional arrays of dynamic size.
It's not part of the language. So you basically have two options:
1. Do the indexing yourself. Declare your multidimensional array as a 1D array, and access its elements
via `data[i*dims[1] + j]`. This is a nuisance, but still feasible (there is a sketch of this right after option 2).
2. Use C. C99 allows true multidimensional arrays of dynamic size. So, in C, you can just write

    double (*data)[dims[1]] = malloc(dims[0] * sizeof(*data));
    for ( size_t i = 0; i < dims[0]; ++i )
        for ( size_t j = 0; j < dims[1]; ++j )
            data[i][j] = i + j;

This will lay out your data in memory the way HDF5 expects it, but it's not legal C++ code of any standard.
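For option 1, a minimal sketch against the C++ API from the program further down (std::vector is my choice here; a plain `new double[dims[0]*dims[1]]` works just as well):

    #include <vector>
    // ... given the dims[] and dataset from the program below ...
    std::vector<double> data(dims[0] * dims[1]);  // one contiguous block
    for ( size_t i = 0; i < dims[0]; ++i )
        for ( size_t j = 0; j < dims[1]; ++j )
            data[i * dims[1] + j] = i + j;        // manual 2D indexing
    dataset.write(data.data(), PredType::NATIVE_DOUBLE);  // layout matches the 2D dataspace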
Of course, you can also use your pointer array, and read/write the data line by line. Or you can allocate
your data as a 1D array and alias it with a pointer array to be able to access it via `data[i][j]`.
But either way, it gets dirty.
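For the last variant, a minimal sketch (all names are mine):

    double *block = new double[dims[0] * dims[1]];  // the real, contiguous storage
    double **data = new double*[dims[0]];           // row pointers aliasing the block
    for ( size_t i = 0; i < dims[0]; ++i )
        data[i] = block + i * dims[1];
    // data[i][j] now works for convenient access, but it's `block`
    // (not `data`!) that you hand to dataset.write().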
Cheers,
Nathanael Hübbe
On 05/06/2016 10:29 PM, Steven Walton wrote:
So I am noticing some interesting behavior and am wondering if there is a way around it.
I am able to assign a rank 1 array dynamically and write it to an HDF5 file, but I do not seem to be able to do this with higher-order arrays. I would like to be able to write a PPx array to h5 and retain the data integrity. More specifically, I am trying to create an easy-to-use vector-to-array library <https://github.com/stevenwalton/H5Easy> that can handle multidimensional data (it works with rank 1).
Let me give some examples. I will also show the typenames of the arrays.
Works:
double *a = new double[numPts];  // typename: Pd
double a[numPts];                // typename: A#pts_d
double a[num1][num2];            // typename: Anum1_Anum2_d
What doesn't work:
double **a = new double*[num1];
for ( size_t i = 0; i < num1; ++i )
    a[i] = new double[num2];
// typename: PPd
Testing the saved arrays with h5dump (and loading and reading them directly), I find that if I have typename PPx (not necessarily double) I get garbage stored. Here is example code, and the output from h5dump showing the behavior.
------------------------------------------------------------
compiled with h5c++ -std=c++11
------------------------------------------------------------
#include "H5Cpp.h"
using namespace H5;
#define FILE "multi.h5"
int main()
{
    hsize_t dims[2];
    herr_t status;
    H5File file(FILE, H5F_ACC_TRUNC);

    dims[0] = 4;
    dims[1] = 6;
    double **data = new double*[dims[0]];
    for ( size_t i = 0; i < dims[0]; ++i )
        data[i] = new double[dims[1]];
    for ( size_t i = 0; i < dims[0]; ++i )
        for ( size_t j = 0; j < dims[1]; ++j )
            data[i][j] = i + j;

    DataSpace dataspace = DataSpace(2, dims);
    DataSet dataset( file.createDataSet( "test", PredType::IEEE_F64LE, dataspace ) );
    dataset.write(data, PredType::IEEE_F64LE);

    dataset.close();
    dataspace.close();
    file.close();
    return 0;
}
------------------------------------------------------------
h5dump
------------------------------------------------------------
HDF5 "multi.h5" {
GROUP "/" {
DATASET "test" {
DATATYPE H5T_IEEE_F64LE
DATASPACE SIMPLE { ( 4, 6 ) / ( 4, 6 ) }
DATA {
(0,0): 1.86018e-316, 1.86018e-316, 1.86018e-316, 1.86019e-316, 0,
(0,5): 3.21143e-322,
(1,0): 0, 1, 2, 3, 4, 5,
(2,0): 0, 3.21143e-322, 1, 2, 3, 4,
(3,0): 5, 6, 0, 3.21143e-322, 2, 3
}
}
}
}
------------------------------------------------------------------
As can be seen, the (0,0) row is absolute garbage (except for one value, which is the first number of the actual array), and the (0,5) entry holds garbage as well. The (1,0) row has always contained real data (though it should be located at (0,0)). So this seems like some addressing problem.
Is this a bug in the h5 libraries that allows me to read and write Pd data as well as Ax0_...Axn_t data, but not P...Pt data? Or is this for some reason intentional? As using new is a fairly standard way to allocate arrays, making P...Pt type data common, I have a hard time seeing this as intentional. In the meantime, is anyone aware of a workaround? The data I am taking in will be dynamically allocated, so I do not see a way to get Ax_... type data.
Thank you,
Steven