Irregular 2D array write (C++)

So I'm wondering if there is a good way to write an irregularly shaped 2D
array into HDF5. An example of this would be storing VTK node
connections for an unstructured grid, with the first number in each row
giving the cell type and the remaining numbers giving the nodes.

9 23 41 54 12
9 46 29 19 60
5 93 18 58
5 29 58 17
9 50 38 58 95

So the array has some rows of length 5 and others of length 4 (or any
other length). I understand how to do this with C++ vectors and push_back,
but a vector of vectors is not one contiguous array. Is there another way
to create this in a form that HDF5 would accept?

-Steven

We actually do this exact operation for our project. What I decided to do was to first flatten from a "vector of vectors" into a single array, then write that array as a "normal" array to HDF5. Add an attribute to the HDF5 dataset to state what the array represents so that when you read it back from HDF5 you know that you will need to recreate your own data structure.

Another way that we tackled the issue was to write the data into a contiguous array and then write each of the "length" values into another contiguous array. Then write the data to an HDF5 dataset and the "Length" array as an attribute array of the value array. I can send a link to this implementation if you want. With your data you would end up with the following arrays:

Value:23 41 54 12 46 29 19 60 93 18 58 29 58 17 50 38 58 95
Length:9 9 5 5 9

There are pros and cons to doing it either way.
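
A minimal sketch of the flatten-and-record-lengths idea, using only the standard library. The names `FlatRows`, `flatten` and `unflatten` are hypothetical and are not taken from the implementation mentioned above. The sketch stores the actual number of values in each row; in the listing above the "Length" entries are the leading cell-type numbers (9 and 5), from which a reader would have to derive the row length. It is written as a template so the same code works for element types other than int.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Flattened form of a ragged 2D array (names are illustrative only).
template <typename T>
struct FlatRows {
    std::vector<T> values;             // all rows back to back, contiguous
    std::vector<std::size_t> lengths;  // lengths[i] = number of values in row i
};

// Copy every row into one contiguous buffer and remember each row's length.
template <typename T>
FlatRows<T> flatten(const std::vector<std::vector<T>>& rows) {
    FlatRows<T> flat;
    flat.lengths.reserve(rows.size());
    for (const auto& row : rows) {
        flat.lengths.push_back(row.size());
        flat.values.insert(flat.values.end(), row.begin(), row.end());
    }
    return flat;
}

// Rebuild the vector of vectors after reading both arrays back.
template <typename T>
std::vector<std::vector<T>> unflatten(const FlatRows<T>& flat) {
    std::vector<std::vector<T>> rows;
    rows.reserve(flat.lengths.size());
    std::size_t pos = 0;
    for (std::size_t n : flat.lengths) {
        assert(pos + n <= flat.values.size());  // lengths must match the data
        auto first = flat.values.begin() + static_cast<std::ptrdiff_t>(pos);
        rows.emplace_back(first, first + static_cast<std::ptrdiff_t>(n));
        pos += n;
    }
    return rows;
}
```

Both `values` and `lengths` are ordinary contiguous buffers, so `values.data()` and `lengths.data()` can each be handed to a regular one-dimensional HDF5 write. Whether the lengths go into a second dataset or an attribute is the design choice discussed above.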

--
Michael A. Jackson
BlueQuartz Software, LLC
[e]: mike.jackson@bluequartz.net

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5

Hi Steven,

Would variable-length datatype do what you need?

Binh-Minh

________________________________
From: Hdf-forum <hdf-forum-bounces@lists.hdfgroup.org> on behalf of Steven Walton <walton.stevenj@gmail.com>
Sent: Monday, October 17, 2016 12:34 PM
To: HDF Users Discussion List
Subject: [Hdf-forum] Irregular 2D array write (C++)

Hi Steven,

Hmm. I am thinking maybe a few approaches.

  1. Use "unlimited" dimensions for those dimensions that have variable size, https://support.hdfgroup.org/HDF5/doc/RM/RM_H5S.html#Dataspace-ExtentDims
  2. Define row-dim to be max of all your rows and then use row-oriented chunking, https://support.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetChunk. Longer rows use more chunks. Shorter rows use fewer.
  3. Maybe you could do this with 'virtual datasets' where each row is a separate HDF5 dataset and another, virtual dataset, knits them all together, https://support.hdfgroup.org/HDF5/docNewFeatures/NewFeaturesVirtualDatasetDocs.html
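
For suggestion 2, one simple way to get a fixed row dimension is to pad every row to the length of the longest one. The sketch below shows only that in-memory step. `PaddedRows`, `pad_rows` and the fill value are hypothetical names chosen for illustration, and the fill value is assumed to be one that never occurs as real data (here -1, since node ids and cell types are non-negative).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Rectangular, row-major copy of a ragged integer array (illustrative only).
struct PaddedRows {
    std::size_t n_rows = 0;
    std::size_t n_cols = 0;   // length of the longest input row
    std::vector<int> data;    // n_rows * n_cols values, row-major

    int at(std::size_t r, std::size_t c) const {
        assert(r < n_rows && c < n_cols);
        return data[r * n_cols + c];
    }
};

// Pad every row with `fill` up to the length of the longest row.
PaddedRows pad_rows(const std::vector<std::vector<int>>& rows, int fill) {
    PaddedRows out;
    out.n_rows = rows.size();
    for (const auto& row : rows)
        out.n_cols = std::max(out.n_cols, row.size());

    out.data.assign(out.n_rows * out.n_cols, fill);
    for (std::size_t r = 0; r < rows.size(); ++r) {
        auto dest = out.data.begin() + static_cast<std::ptrdiff_t>(r * out.n_cols);
        std::copy(rows[r].begin(), rows[r].end(), dest);
    }
    return out;
}
```

The resulting buffer has the shape `n_rows` by `n_cols` and can be written as an ordinary 2D dataset. Writing the whole padded buffer also writes the fill values, so this trades some file space for simplicity. A reader has to know the fill value to strip it again.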

Hope that helps.

Mark

--
Mark C. Miller, LLNL

Keep in mind that if you go with a variable-length type, the data cannot be compressed, so your file sizes will be larger.

Dana Robinson
Software Engineer
The HDF Group

Would your corresponding C++ type for this array be

std::vector<std::vector<int>>

such as for storing cells in mixed meshes, with quads and triangles mixed?

If so, you need a variable-length array data type of type integer, i.e., HDF5 will see it as

vector<hvl_t>

It's not overly efficient to use variable-length data types, so it would be better to sort the cells into groups of 3-element and 4-element cells and save those as two datasets of constant-length data type. But if the element sizes must be mixed in one dataset, then a variable-length data type is a direct match for that structure.
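
The sorting step can be sketched with the standard library alone. `CellGroups`, `group_by_size` and `cell_count` are hypothetical names, and the input is assumed to hold only the node ids of each cell (without the leading cell-type number). Each group comes out as a contiguous row-major array of shape (number of cells) by (nodes per cell), which could then be written as its own fixed-width dataset.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// groups[n] holds the node ids of every n-node cell, back to back.
using CellGroups = std::map<std::size_t, std::vector<int>>;

// Sort cells into one contiguous buffer per cell size.
CellGroups group_by_size(const std::vector<std::vector<int>>& cells) {
    CellGroups groups;
    for (const auto& cell : cells) {
        std::vector<int>& flat = groups[cell.size()];
        flat.insert(flat.end(), cell.begin(), cell.end());
    }
    return groups;
}

// Number of cells stored in the group with `nodes_per_cell` nodes each.
std::size_t cell_count(const CellGroups& groups, std::size_t nodes_per_cell) {
    if (nodes_per_cell == 0) return 0;
    auto it = groups.find(nodes_per_cell);
    if (it == groups.end()) return 0;
    assert(it->second.size() % nodes_per_cell == 0);
    return it->second.size() / nodes_per_cell;
}
```

Grouping keeps the order of cells within each group but loses the original interleaving of triangles and quads. If that order matters, an extra index array would be needed, which is an addition beyond what is described above.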

                 Werner

--
___________________________________________________________________________
Dr. Werner Benger Visualization Research
Center for Computation & Technology at Louisiana State University (CCT/LSU)
2019 Digital Media Center, Baton Rouge, Louisiana 70803
Tel.: +1 225 578 4809 Fax.: +1 225 578-5362

In this case I need integers, but I would prefer to know how to do it
for an arbitrary type. Is there going to be a difference?

On Mon, Oct 17, 2016 at 11:39 AM, Binh-Minh Ribler <bmribler@hdfgroup.org> wrote:

Hi Steven,

Would variable-length datatype do what you need?

Binh-Minh

Sorry, I gave you the wrong reference regarding unlimited dimensions. Here is the correct one: https://support.hdfgroup.org/HDF5/doc/RM/RM_H5S.html#Dataspace-CreateSimple

Also, in reading that, using unlimited dimensions *also* requires chunking.

But my suggestion #2 replaces the use of unlimited dimensions with a max and relies on the ability to set the chunk size such that wasted space (due to partially written chunks, if any) is small. Note that HDF5 will only ever allocate space in the file for chunks that are actually written. So you can have a very large, sparse 2D array (like a banded matrix or something) that really doesn't take up that much space on disk.
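
The trade-off between chunk size and wasted space can be estimated before touching HDF5 at all. The sketch below is plain arithmetic under stated assumptions: each chunk covers 1 row by `chunk_width` columns, only the real values of each row are written, and the space taken by the chunk index is ignored. `ChunkUsage` and `chunk_usage` are hypothetical names, not HDF5 API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Estimated storage for row-oriented chunks (illustrative only).
struct ChunkUsage {
    std::size_t chunks_written = 0;   // chunks touched by at least one real value
    std::size_t slots_allocated = 0;  // chunks_written * chunk_width
    std::size_t slots_used = 0;       // real values stored
};

// A row of n values touches ceil(n / chunk_width) chunks of shape 1 x chunk_width.
ChunkUsage chunk_usage(const std::vector<std::size_t>& row_lengths,
                       std::size_t chunk_width) {
    assert(chunk_width > 0);
    ChunkUsage usage;
    for (std::size_t n : row_lengths) {
        usage.chunks_written += (n + chunk_width - 1) / chunk_width;
        usage.slots_used += n;
    }
    usage.slots_allocated = usage.chunks_written * chunk_width;
    return usage;
}
```

For the five rows in this thread (lengths 5, 5, 4, 4, 5), a chunk width of 5 gives 5 chunks and 25 allocated slots for 23 values, while a chunk width of 2 gives 13 chunks and 26 slots. Smaller chunks do not automatically waste less, and they add more chunks to index.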

Mark

--
Mark C. Miller, LLNL

Hmm, I'm not sure... Could you use a combination of H5Tvlen_create and an array or compound datatype?

________________________________

From: Hdf-forum <hdf-forum-bounces@lists.hdfgroup.org> on behalf of Steven Walton <walton.stevenj@gmail.com>
Sent: Monday, October 17, 2016 12:42 PM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] Irregular 2D array write (C++)

In this case I need integers, but I would prefer to know how to do it for an arbitrary type. Is there going to be a difference?
