Help requested: how to read a compound datatype

Hi,

I have to write a C program that can read a compound dataset from an
HDF5 file. Up to now, I have written such code knowing what to read,
so I could hard-code the structure definition in my program. But in
this case, my code has to be generic. I know that I can query any
compound dataset using H5Tget_nmembers, H5Tget_member_class and
friends. They help me to print out the tag names, tag classes, etc.
But how can I use this information to define a structure, and how can
I use this definition to dynamically allocate memory to read an array
of structures?

Any help is appreciated.

Best regards,

Richard van Hees

Hi,

  Where do you want to read the dataset to? I'd guess that for generic reading,
all you can do is read it into a char data[sizeof(struct)], where you get
sizeof(struct) from the HDF5 type definition in the file.

If you then know that there's (e.g.) a double contained within this char data[],
you can typecast at that address and do something with it...
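
A minimal sketch of this idea in C, under stated assumptions: the record size and the member offset are passed in as plain numbers (in a real reader they would presumably come from HDF5 calls such as H5Tget_size and H5Tget_member_offset), and the helper names alloc_records and read_double_at are hypothetical. It copies the bytes out with memcpy rather than casting the address, which avoids alignment trouble when a member does not sit on a natural boundary.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocate zeroed storage for nrec records of recsize bytes each.
 * recsize is assumed to come from the datatype (e.g. H5Tget_size).
 * Returns NULL on overflow, zero size, or allocation failure. */
unsigned char *alloc_records(size_t nrec, size_t recsize)
{
    if (nrec == 0 || recsize == 0 || nrec > SIZE_MAX / recsize)
        return NULL;
    return calloc(nrec, recsize);
}

/* Read a double stored at byte offset `offset` inside one record.
 * The caller must already know that a double lives at that offset. */
double read_double_at(const unsigned char *record, size_t offset)
{
    double value;

    assert(record != NULL);
    memcpy(&value, record + offset, sizeof value);
    return value;
}
```

Record i of the buffer starts at buf + i * recsize, so a member of record i is read with read_double_at(buf + i * recsize, offset). The caller releases the buffer with free().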

  Werner

···

On Wed, 21 Oct 2009 09:34:10 -0500, Richard van Hees <R.M.van.Hees@sron.nl> wrote:

--
___________________________________________________________________________
Dr. Werner Benger Visualization Research
Laboratory for Creative Arts and Technology (LCAT)
Center for Computation & Technology at Louisiana State University (CCT/LSU)
211 Johnston Hall, Baton Rouge, Louisiana 70803
Tel.: +1 225 578 4809 Fax.: +1 225 578-5362

Hi Richard,

In C you cannot define a structure at run time, so the only way to do something close is to have a lot of "if" statements, allocate memory, and read by atomic fields (i.e., by int, float, long, etc. fields).
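
As a rough illustration of the chain of "if" statements, here is a sketch that converts one atomic field to a double depending on a type tag. The enum field_kind and the function field_to_double are hypothetical names, not part of HDF5; in a real reader the tag would have to be derived from the member's datatype, for example by comparing it against H5T_NATIVE_INT and friends with H5Tequal.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical tags for the atomic types this sketch understands. */
typedef enum { KIND_INT, KIND_LONG, KIND_FLOAT, KIND_DOUBLE } field_kind;

/* Convert the atomic field stored at `field` to a double.
 * Returns 0 on success, -1 if the kind is not handled. */
int field_to_double(const void *field, field_kind kind, double *out)
{
    assert(field != NULL && out != NULL);

    if (kind == KIND_INT) {
        int v;
        memcpy(&v, field, sizeof v);
        *out = (double)v;
    } else if (kind == KIND_LONG) {
        long v;
        memcpy(&v, field, sizeof v);
        *out = (double)v;
    } else if (kind == KIND_FLOAT) {
        float v;
        memcpy(&v, field, sizeof v);
        *out = (double)v;
    } else if (kind == KIND_DOUBLE) {
        memcpy(out, field, sizeof *out);
    } else {
        return -1;   /* unknown kind: leave *out untouched */
    }
    return 0;
}
```

Every additional atomic type needs its own branch, which is why this approach grows into "a lot" of conditionals.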

Another approach is to use H5Tget_native_type to find the HDF5 memory datatype that corresponds to the structure; then you can use H5Tget_size to find the size of a data element in memory, and allocate an appropriate buffer to read the selected data into. But it still doesn't help much, because you will need to "unpack" the data from the buffer.
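
The "unpack" step could look roughly like the sketch below, which copies one member out of every record into a contiguous array. It assumes the buffer has already been filled (for instance by H5Dread using the native type), that recsize is the value reported by H5Tget_size, and that the member's offset and size are known, e.g. from H5Tget_member_offset and H5Tget_size on the member type. The function name unpack_column is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy one member out of nrec records into a contiguous array.
 *   buf      : records as they sit in memory, recsize bytes each
 *   offset   : byte offset of the member inside a record
 *   elemsize : size of the member in bytes
 *   out      : destination with room for nrec * elemsize bytes */
void unpack_column(const void *buf, size_t nrec, size_t recsize,
                   size_t offset, size_t elemsize, void *out)
{
    const unsigned char *src = buf;
    unsigned char *dst = out;
    size_t i;

    assert(buf != NULL && out != NULL);
    assert(elemsize <= recsize && offset <= recsize - elemsize);

    for (i = 0; i < nrec; i++)
        memcpy(dst + i * elemsize, src + i * recsize + offset, elemsize);
}
```

The function is type-agnostic: it only moves bytes, so the caller still has to know (or look up) which C type the destination array should have.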

There are high-level API functions H5LTtext_to_dtype and H5LTdtype_to_text which convert between text (a la h5dump output) and an HDF5 datatype. Maybe you will find them useful.

Elena

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

Dear Elena and Werner,

Thanks for your help and suggestions. I somehow hoped that I lacked some knowledge as a C programmer, but I am afraid that I cannot achieve in C what I would like to do. An alternative for my project could be to move to C++. Would that help me?

Best regards, Richard

···

--

-----
Dr. R. M. van Hees
SRON Netherlands Institute for Space Research
Sorbonnelaan 2
3584 CA Utrecht
The Netherlands

tel. +31 (0)30 253 8579 (SRON)
fax. +31 (0)30 254 0860

Richard,

   I'm afraid moving to C++ would make the situation worse in your case, since in C++ a class consists of more than just its data members; in particular, it also comes with a virtual function table (if the class has virtual functions), and there is no way this could be stored in an HDF5 file.

   There may be alternatives, such as a registry of objects that are mapped to HDF5 IDs and that in turn allows dynamic creation of objects based on HDF5 IDs (type IDs). There are means to create such registries automatically in C++ using some template metaprogramming techniques, but that's going to be somewhat complex...

Maybe you can describe in more detail what you would like to achieve? There might be simpler methods...

cheers,
  Werner

Hi Richard,

Could it be that you're looking for something more generic than you need?
Clearly you know how to read the dataset correctly into a memory buffer,
using the native datatypes and padding. At some point you have to know
what information you're reading. You could have a function that queries
whether the member Foo exists in the dataset and, if so, returns an
enumeration value for its type. Then use another function to step through
the memory buffer you read and return a void* that you can cast to the
correct type. If you want to go with C++, you will have to use
polymorphism features and implement a kind of reflection for your HDF5
datasets.
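
The two functions described above might be sketched in C as follows. Everything here is an assumption made for illustration: member_info is a hypothetical run-time table that a reader would fill once per dataset (for example from H5Tget_member_name, H5Tget_member_offset and the member's type), and find_member and member_ptr are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tags for the member types the reader knows about. */
typedef enum { KIND_INT, KIND_FLOAT, KIND_DOUBLE } member_kind;

/* One row of a run-time description of a compound type. */
typedef struct {
    const char *name;    /* member name as stored in the file */
    size_t      offset;  /* byte offset inside one record     */
    member_kind kind;    /* which C type to cast to           */
} member_info;

/* Return the index of the member called `name`, or -1 if absent. */
int find_member(const member_info *table, int nmembers, const char *name)
{
    int i;

    assert(table != NULL && name != NULL);
    for (i = 0; i < nmembers; i++)
        if (strcmp(table[i].name, name) == 0)
            return i;
    return -1;
}

/* Return the address of member `m` inside `record`.
 * The caller casts the result according to m->kind. */
void *member_ptr(void *record, const member_info *m)
{
    assert(record != NULL && m != NULL);
    return (unsigned char *)record + m->offset;
}
```

A caller would first ask find_member(table, n, "Foo"), check that the result is not -1, inspect table[idx].kind, and only then cast the pointer returned by member_ptr to the matching type.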

HTH

-- dimitris

Werner

you are correct in your conclusion; I just wanted to note that in this
particular case he wouldn't have to use inheritance but plain PODs. The
problem with the virtual function table is not only that it's there, but
also that you never know where it is. What you're describing is a
serialization framework for classes which, considering the hints you give,
is quite generic; you could also do it with inheritance if you know a priori
the classes you want to serialize. But you still have to know the classes,
and I think in Richard's case he needs reflection to create objects at
run-time and query their contents. This is somewhat easier with Java but
more complex with C++, especially in the handling of integral types.

regards

-- dimitris

As most people have already said, it can be done with a custom-built reflection-type mechanism in C++ and some template metaprogramming techniques. That's what we've done for our HDF wrapper library inside our larger application, Opticks (http://opticks.org/). The HDF wrapper library is tied pretty closely to the rest of the application, so you won't be able to just grab the code.

But you can browse it here:
https://opticks.ballforge.net/source/browse/opticks/trunk/4.3.X/Code/application/HdfPlugInLib/

It's licensed under the LGPL v2.1.

Specifically, look at Hdf5Dataset::readData(), Hdf5CustomReader, and Hdf5ReadersWriters.cpp.

Hope this helps,
Kip

Dear all,

Thank you. Now I understand that what I want is not trivial, but I also have an idea of how it can be done. Some of you have asked me which problem I am trying to tackle.

I am trying to improve HDF5 support in GDL (GNU Data Language), a free incremental compiler compatible with IDL (Interactive Data Language, see http://ittvis.com/idl/). For those who are unfamiliar with IDL (it is something like Matlab): it is a commercial tool for data analysis, data visualization, and software application development. IDL fully supports HDF5 (version 1.6.x), and within IDL you can easily read compound datasets. It returns the compound dataset as a structure with all of its elements represented by native datatypes (very convenient). I am not rich enough to buy a license for IDL on my computer at home, but for my work I use IDL very often.

Currently, I have implemented most of the HDF5 functions to read from any HDF5 file, except reading of compounds. In the meantime, however, I have discovered that GDL supports dynamic creation of structures. So in the underlying C++ code someone has implemented a method to build structures on the fly. Thanks to your help I have located the responsible routines, and now I am trying to get help from the GDL developers to understand how I can use their functions...

Again thank you very much.

Best regards, Richard

> I think in Richard's case he needs reflection to create objects at
> run-time and query their contents.

Richard, you might want to consider Python. Python is a dynamically
typed, object-oriented language which provides the sort of
functionality you are after, particularly reflection (known as
"introspection" in Python). I am currently using Python for an
HDF5-based application with great success. See http://www.python.org
and http://www.pytables.org (for the HDF5 interface).

yours,
David