Backward and Forward compatibility in HDF5

All,

Consider this struct, which needs to be written into HDF5:

struct product
{
    int nodel_number;
    double manuf_date;
};

I store this in HDF5 via a compound datatype consisting of an int and a
double.

For the next version of my project, I add another double variable to
the struct, as below:

struct product
{
    int nodel_number;
    double manuf_date;
    double exp_date;
};

Now I will have to change the compound datatype to accommodate the
additional double.

I want to read back the information written by the previous version of
my project, but the structure written by the old version lacks the new
double member.

Is there some way in HDF5 to achieve this? In other words, can HDF5
provide something that gives me backward compatibility (and even
forward compatibility)?

Thanks and Regards

Ram

Hi Ram,

  HDF5's datatype conversion handles this situation transparently to your application. Describe your new datatype in memory, including the 'exp_date' field, and then just call H5Dread() on the previous dataset (the one stored with the old datatype). The HDF5 library matches the fields of the old and new datatypes by name and reads the "nodel_number" (should that be "model_number"?) and "manuf_date" fields from the file into the same fields of the new datatype in memory, leaving the "exp_date" field in memory untouched.

  In the next major release of HDF5, 1.10.0, we'll also be including a feature for changing the datatype of existing datasets, which can avoid the performance penalty for this datatype conversion operation when the dataset is read multiple times.

  Quincey

On Mar 28, 2008, at 8:54 AM, Ramakrishnan Iyer wrote:

----------------------------------------------------------------------
This mailing list is for HDF software users discussion.
To subscribe to this list, send a message to hdf-forum-subscribe@hdfgroup.org.
To unsubscribe, send a message to hdf-forum-unsubscribe@hdfgroup.org.

Hi Quincey,
Thanks for your response.
This is my understanding.

I create a dataset with the given struct:
struct product
{
    int model_number;
    double manuf_date;
};

Let the dataset be named Prod1.
Now I create an HDF5 file and write data into the dataset.

I now modify the product struct and add exp_date.
I then rebuild the application and read the previous file I had written.
I only need to make sure that I pass the old datatype as the parameter
for the type of data to be read. HDF5 will make sure that I read the
correct values. Is this right?

Coming to the second part of the question: how can I achieve forward
compatibility? I mean, can the application built with my old version of
the "product" type read a file created by the new version of the
application?
What about providing default values for the new field when the newer
version reads an older file? Can I use H5Dfill or H5Pset_fill_value?

Regards
Ram

-----Original Message-----
From: Quincey Koziol [mailto:koziol@hdfgroup.org]
Sent: Sunday, March 30, 2008 12:43 AM
To: Ramakrishnan Iyer
Cc: hdf-forum@hdfgroup.org
Subject: Re: Backward and Forward compatibility in HDF5

[snip]

Hi Ram,

> Hi Quincey,
> Thanks for your response.
> This is my understanding.
>
> I create a dataset with the given struct:
> struct product
> {
>     int model_number;
>     double manuf_date;
> };
>
> Let the dataset be named Prod1.
> Now I create an HDF5 file and write data into the dataset.
>
> I now modify the product struct and add exp_date.
> I then rebuild the application and read the previous file I had written.
> I only need to make sure that I pass the old datatype as the parameter
> for the type of data to be read. HDF5 will make sure that I read the
> correct values. Is this right?

  Yes, with a slight tweak: the datatype you pass to H5Dread() is the memory datatype, and it should be the new one with the 'exp_date' field. The library already "knows" the datatype of the dataset on disk.

> Coming to the second part of the question: how can I achieve forward
> compatibility? I mean, can the application built with my old version of
> the "product" type read a file created by the new version of the
> application?

  Again, this should work fine: the HDF5 library matches the fields of a compound datatype by name. So, if the new application creates datasets with the 'exp_date' field in the compound datatype and the old application reads the new dataset with its "old" datatype, only the fields in the old datatype will be read from the dataset.
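The name matching described in the two answers above can be pictured with a small library-free model. This is only a sketch of the idea, not HDF5 code: the `member_desc` type, the two layout tables, and `copy_by_name` are hypothetical names introduced for illustration, and unlike the real library the model does no type conversion and only copies members whose sizes agree.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor for one member of a record layout.
 * It only models the idea of a compound datatype; it is not an HDF5 type. */
typedef struct {
    const char *name;   /* member name used for matching */
    size_t      offset; /* byte offset inside the record */
    size_t      size;   /* size of the member in bytes   */
} member_desc;

typedef struct { int model_number; double manuf_date; } product_v1;
typedef struct { int model_number; double manuf_date; double exp_date; } product_v2;

const member_desc V1_LAYOUT[] = {
    { "model_number", offsetof(product_v1, model_number), sizeof(int)    },
    { "manuf_date",   offsetof(product_v1, manuf_date),   sizeof(double) },
};

const member_desc V2_LAYOUT[] = {
    { "model_number", offsetof(product_v2, model_number), sizeof(int)    },
    { "manuf_date",   offsetof(product_v2, manuf_date),   sizeof(double) },
    { "exp_date",     offsetof(product_v2, exp_date),     sizeof(double) },
};

/* Copy every member of src that has a same-named, same-sized member in dst.
 * Members of dst without a match are left untouched; members of src
 * without a match are ignored. Returns the number of members copied. */
size_t copy_by_name(void *dst, const member_desc *dst_layout, size_t n_dst,
                    const void *src, const member_desc *src_layout, size_t n_src)
{
    size_t copied = 0;
    for (size_t i = 0; i < n_dst; i++) {
        for (size_t j = 0; j < n_src; j++) {
            if (strcmp(dst_layout[i].name, src_layout[j].name) == 0 &&
                dst_layout[i].size == src_layout[j].size) {
                memcpy((char *)dst + dst_layout[i].offset,
                       (const char *)src + src_layout[j].offset,
                       dst_layout[i].size);
                copied++;
                break;
            }
        }
    }
    return copied;
}
```

Copying a `product_v1` record into a `product_v2` destination models the backward-compatible read (two members copied, `exp_date` keeps whatever the destination already held). Copying a `product_v2` record into a `product_v1` destination models the forward-compatible read (the extra `exp_date` is simply dropped).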

> What about providing default values for the new field when the newer
> version reads an older file? Can I use H5Dfill or H5Pset_fill_value?

  Currently, the HDF5 library does not modify any fields in the compound datatype that don't match, so you'll have to do this filling on your own. I like the idea of being able to provide a fill value for fields in the destination that aren't present in the source; we should add it to our "wish list".
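Since unmatched fields are left untouched, one way to do the filling yourself is to initialise the read buffer with defaults before calling H5Dread(). The sketch below assumes that approach. `product_prefill` and the sentinel `EXP_DATE_UNKNOWN` are hypothetical names, and the HDF5 call appears only in a comment, with `dset_id` and `mem_type_id` standing in for handles the application would already hold.

```c
#include <stddef.h>

typedef struct {
    int    model_number;
    double manuf_date;
    double exp_date;     /* added in the new version of the struct */
} product;

/* Hypothetical sentinel meaning "no expiry date recorded". */
#define EXP_DATE_UNKNOWN (-1.0)

/* Give every record a known default before the read. Per the reply above,
 * the library leaves members it cannot match untouched, so the value
 * stored here survives in exp_date when the file holds the old datatype. */
void product_prefill(product *buf, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        buf[i].model_number = 0;
        buf[i].manuf_date   = 0.0;
        buf[i].exp_date     = EXP_DATE_UNKNOWN;
    }
}

/* Intended call order (HDF5 call shown as a comment only):
 *   product_prefill(buf, count);
 *   H5Dread(dset_id, mem_type_id, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);
 */
```

When the same code reads a file written by the new version, `exp_date` is matched by name and the default is overwritten with the stored value, so the prefill is harmless in that case.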

  Quincey

On Mar 31, 2008, at 5:31 PM, Ramakrishnan Iyer wrote:
