array in compound datatype

Hi Folks -
I've made progress on my "iteratively writing to compound datatype"
project, but have a new question. Let me first summarize my situation. I am
working on an application that reads messages in a variety of formats and
presents their data to users. I am experimenting with creating compound
datatypes to represent messages. As I'm reading messages, when I hit a new
type, I essentially read it twice: the first time to build the datatype,
the second time to write the data.

Some of the messages contain arrays of values, and I'm not getting more
than one element to show up in the output file. I'm hoping that if I
describe what I'm doing, someone can spot my mistake (or mistakes).

As I go through the message, when I hit an array (let's say that its name
is *arrayField*), I create a slot for it in my compound datatype:
*H5Tinsert(compoundType, "arrayField", currentOffset, dataType)*.
I then increase currentOffset by the total size of the array (e.g. 3x4
array of int means *currentOffset += 12*sizeof(int)*).
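
The offset bookkeeping described above can be sketched without any HDF5 calls. This is only a minimal C++ sketch, assuming a packed layout with no padding between members; the *Field* struct and the *fieldBytes* and *packedOffsets* helpers are hypothetical names introduced for illustration, not part of the HDF5 API or of the poster's code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical description of one message field; not an HDF5 type.
struct Field {
    std::string name;
    std::size_t elemSize;           // size of one element in bytes
    std::vector<std::size_t> dims;  // empty for a scalar field
};

// Bytes occupied by a field: element size times the product of its dimensions.
std::size_t fieldBytes(const Field& f) {
    assert(f.elemSize > 0);
    return std::accumulate(f.dims.begin(), f.dims.end(), f.elemSize,
                           std::multiplies<std::size_t>());
}

// Byte offset of every field in a packed layout (assumption: no padding).
// The final entry is the total size of the compound type.
std::vector<std::size_t> packedOffsets(const std::vector<Field>& fields) {
    std::vector<std::size_t> offsets;
    std::size_t current = 0;
    for (const Field& f : fields) {
        offsets.push_back(current);
        current += fieldBytes(f);
    }
    offsets.push_back(current);
    return offsets;
}
```

For a 3x4 array of int, *fieldBytes* yields 12 times sizeof(int), which matches the increment in the post. Each computed offset is the value that would be passed as the third argument of *H5Tinsert*.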

I then skip over the remaining values in the array as I continue reading
through the message.

Once I've finished building the datatype, I start writing the values. When
I hit *arrayField*, I set up an *offset* array, initialized to 0s. I then
go through the following sequence:

    valueDT = H5Tcreate(H5T_COMPOUND, dtSize); // creating in-memory datatype for value
    H5Tinsert(valueDT, "arrayField", 0, dataType);
    filespace = H5Dget_space(dataSet);
    hsize_t* count = new hsize_t[dimensions+1]; // +1 for datatype itself
    hsize_t* stride = new hsize_t[dimensions+1];
    hsize_t* block = new hsize_t[dimensions+1];
    // fill in values for count, stride and block.
    // count gets [1, dim1, dim2],
    // stride and block get [1, 1, 1]
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, count, stride, block);
    H5Dwrite(dataSet, valueDT, dataSpace, filespace, H5P_DEFAULT, &value);

The next value gets read, and I use "odometer" logic on *offset*. In this
case, the last value in the array goes from 0 to 1. The above steps then
get repeated.
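
The "odometer" step could look roughly like the following. This is a sketch under the assumption that the index runs in row-major order (last dimension fastest), as the 0-to-1 change of the last value suggests; *advanceOdometer* is a hypothetical helper name, not something from the post or from HDF5.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Advance a multi-dimensional index by one position in row-major order
// (the last dimension varies fastest). Returns false once the index has
// wrapped past the final element, at which point it is all zeros again.
bool advanceOdometer(std::vector<std::size_t>& index,
                     const std::vector<std::size_t>& dims) {
    assert(index.size() == dims.size());
    for (std::size_t d = dims.size(); d-- > 0;) {
        if (++index[d] < dims[d]) {
            return true;  // no carry needed
        }
        index[d] = 0;     // carry into the next slower dimension
    }
    return false;
}
```

Starting from {0, 0} with dimensions {3, 4}, the first call moves the index to {0, 1}, and the call that follows the last element {2, 3} returns false.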

I'm most suspicious of my settings for *count*, *stride* and *block*. That
said, I'm not overly confident about any other aspect of my approach :)

-Josiah

I've figured out one of my problems; I didn't use *H5Tarray_create2* to
make the *dataType* argument to *H5Tinsert*.

On Thu, Jul 25, 2013 at 1:33 PM, Josiah Slack <josiahnmi@gmail.com> wrote:

···

I've provisionally concluded that I was wasting my time with the hyperslab
bookkeeping I was attempting. I've got an approach that works, but I don't
like it very much.

Basically, I allocate a full-sized array (that is, it's big enough to hold
all the values for the array field in the compound type), do an *H5Dread*,
copying the current values of the field into the array. I then insert the
new value into the array, and after that do the *H5Dwrite*. I'd like to
move away from all the wasteful copying involved - is there some common
idiom for selecting part of an array to be written (bearing in mind that
the array is part of a compound datatype)?

On Fri, Jul 26, 2013 at 10:07 AM, Josiah Slack <josiahnmi@gmail.com> wrote:

···

Josiah, how are you? Unless I misunderstand your question, I'm afraid that's
not possible. Let me rephrase your question: You have a dataset D of a
compound type T with components A, B, C, and A happens to be an array
datatype. You'd like to write just parts of A.

The HDF5 library does not support that. You can read or write only entire
components of a data element of a compound type. In other words, there's no
partial I/O on data elements.

You could "outsource" the array component into another dataset (if the
sum of the ranks of the dataspace of D and of A does not exceed 32).

Best, G.
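
One way to picture the "outsourcing" suggestion: the array component gets its own dataset whose dataspace is the dataspace of D extended by the dimensions of A, so a single array element of a single record becomes an ordinary one-point selection. The sketch below is plain C++ with no HDF5 calls, and it is only one possible reading of the suggestion. *combinedDims*, *Selection* and *singleElement* are hypothetical helpers, and *kMaxRank* simply restates the limit of 32 from the reply.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

constexpr std::size_t kMaxRank = 32;  // rank limit quoted in the reply above

// Extent of the separate dataset: record dimensions followed by array dimensions.
std::vector<std::size_t> combinedDims(const std::vector<std::size_t>& recordDims,
                                      const std::vector<std::size_t>& arrayDims) {
    if (recordDims.size() + arrayDims.size() > kMaxRank) {
        throw std::length_error("combined rank exceeds the limit");
    }
    std::vector<std::size_t> dims(recordDims);
    dims.insert(dims.end(), arrayDims.begin(), arrayDims.end());
    return dims;
}

// Start and count vectors covering exactly one array element of one record.
struct Selection {
    std::vector<std::size_t> start;
    std::vector<std::size_t> count;
};

Selection singleElement(const std::vector<std::size_t>& recordIndex,
                        const std::vector<std::size_t>& arrayIndex) {
    assert(recordIndex.size() + arrayIndex.size() <= kMaxRank);
    Selection s;
    s.start = recordIndex;
    s.start.insert(s.start.end(), arrayIndex.begin(), arrayIndex.end());
    s.count.assign(s.start.size(), 1);  // one element along every dimension
    return s;
}
```

The start and count vectors would then be copied into hsize_t arrays and passed to *H5Sselect_hyperslab*, whose parameters are ordered start, stride, count, block. Passing NULL for stride and block selects single elements.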

From: Hdf-forum [mailto:hdf-forum-bounces@lists.hdfgroup.org] On Behalf Of Josiah Slack
Sent: Monday, July 29, 2013 1:14 PM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] array in compound datatype

···

Hi Gerd -
I'm making pretty good progress. I've got a pretty solid prototype at this
point, and I'm starting to refine it. Your paraphrase captures my intent,
so it looks like I need to rethink my design. But in the meantime, my
workaround is at least workable. Thanks again for all the help.

-Josiah

On Mon, Jul 29, 2013 at 2:37 PM, Gerd Heber <gheber@hdfgroup.org> wrote:

···

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org

http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org