Variable length compound data?

VERSION:
          HDF5 release 1.8

DESCRIPTION OF QUESTION:
              We have nested structures to write. The inner
structure can be an array of any length; neither its length nor its
dimensions are known at the time the datatype is created. In fact, the
length of the outer structure array is not known either, but I believe
that could be managed by chunking to the size of one struct and then
extending the dataset each time a new outer struct has to be written.
for example:
typedef struct sample1 {
    int val4;
} sample1;

typedef struct sample {
    int      val1;
    char     val2;
    sample1 *s1;    /* how many s1 records there are is not known upfront */
    size_t   n_s1;  /* number of s1 records in this sample */
    double   val3;
} sample;

sample spl[100];
spl[1] can have 20 s1's; spl[2] can have 50 s1's, ...

[Note: Although the data originates from the struct example above, the
module writing the HDF5 file has no knowledge of the C/C++ struct
definition, which may reside in a header file of some other program.
However, the HDF5 module does receive information like
STRUCT_SAMPLE1_START, INT VAL4, <data>, STRUCT_SAMPLE1_END, etc.]

Is there any way we can have a variable-length compound datatype? There
will be one dataset in which 100 records of struct sample will be stored.
The length of each sample record may vary depending on the number of s1
records it contains, and the maximum number of s1 entries is also not known.

Can we have one dataset with a defined datatype but a different size for
each sample record? If there is a more appropriate way of structuring and
writing this data, I would appreciate your input.

Thanks,
Anal Patel

Hi Anal,


  Yes, this is supported in HDF5. You can define a datatype that is a variable-length (VL) sequence of a compound datatype (like sample1 above) and then use that VL sequence datatype as a field in another compound datatype. The code in the test/vltypes.c source file in the HDF5 distribution works through lots of combinations of this sort. Perhaps Elena has a standalone example that she could point to also...

  Quincey


On Sep 2, 2008, at 9:44 AM, Anal K Patel wrote:

----------------------------------------------------------------------
This mailing list is for HDF software users discussion.
To subscribe to this list, send a message to hdf-forum-subscribe@hdfgroup.org.
To unsubscribe, send a message to hdf-forum-unsubscribe@hdfgroup.org.