Handling incomplete columns for integer datatypes

Dear Forum,

I have run into a conceptual problem when trying to store incomplete columns of integer datatypes.

My issue relates to the problem explained here:

Basically the author wants to store this table:

time | x1  | y1  | x2  | y2  |
0    | 2.0 | 1.0 | 2.0 | 3.0 |
1    | 2.1 | 1.0 | 2.3 | 3.1 |
2    | 2.4 | 1.4 |     |     |
3    | 2.2 | 1.5 | 2.4 | 3.1 |
4    |     |     | 2.3 | 3.2 |

It contains incomplete columns of floating-point datatypes, and the gaps can be handled by filling in NaNs.

For other datatypes, e.g. integers, no such NaN is available. Is there a textbook approach that describes how to handle this problem?

Thanks,

Daniel

Daniel, unless "magic values" are accompanied by unambiguous metadata, the
NaN/fill-value approach is risky and limits portability.
A better approach might be to maintain a "shadowing" dataset of a suitable
bitfield type whose values would indicate the validity of a dataset element
or fields in a compound, etc. By using compression on the shadowing dataset
the storage overhead should be negligible. Of course, if the shadowed dataset
ever gets updated and elements or fields change their status between
'valid' and 'N/A', both datasets must be updated and kept in sync.
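To make the shadowing idea concrete, here is a minimal sketch in plain Python, independent of any HDF5 library. The sample column, the placeholder value 0, and the helper names pack_bits, unpack_bits and restore are assumptions made up for illustration. zlib only stands in for whatever compression filter the shadowing dataset would actually use.

```python
import zlib

# Hypothetical integer column with missing entries (None = N/A).
column = [2, 2, None, 2, None, 3, 1, None]

# Shadowed data: store a placeholder (0) where the value is missing.
# The placeholder carries no meaning; only the mask says what is valid.
values = [0 if v is None else v for v in column]

# Shadowing data: one validity flag per element (1 = valid, 0 = N/A).
mask = [0 if v is None else 1 for v in column]


def pack_bits(flags):
    """Pack 0/1 flags into bytes, eight flags per byte, MSB first."""
    out = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)


def unpack_bits(packed, count):
    """Inverse of pack_bits: recover the first `count` flags."""
    return [(packed[i // 8] >> (7 - i % 8)) & 1 for i in range(count)]


def restore(vals, flags):
    """Recombine values and mask into a column with None for N/A."""
    return [v if ok else None for v, ok in zip(vals, flags)]


packed = pack_bits(mask)
roundtrip = restore(values, unpack_bits(packed, len(values)))

# A long, fully valid mask shrinks to a small fraction of its packed size.
long_mask = pack_bits([1] * 100_000)
compressed = zlib.compress(long_mask)
print(len(long_mask), len(compressed))
```

Because a genuine 0 and a placeholder 0 look identical in the value array, the mask is the only authority on validity, which is why both arrays have to be updated together.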

Depending on how elaborate you want this to be, you could decorate the shadowed dataset
with a "MASK" attribute whose value is an object reference to the shadowing dataset.
Alternatively, if the shadowed dataset is linked to exactly one group and there is no potential for
name conflicts, you could have a convention that lets you derive the name of the shadowing dataset
from the link name of the shadowed dataset.
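Both ways of locating the mask could look roughly like this with h5py and NumPy. This is a sketch under assumptions: the dataset names "x2" and "x2_mask", the "_mask" suffix convention, the integer sample values, and the use of uint8 in place of a true HDF5 bitfield type are all made up for illustration.

```python
import h5py
import numpy as np

# In-memory file so the sketch leaves nothing on disk.
with h5py.File("sketch.h5", "w", driver="core", backing_store=False) as f:
    # Shadowed dataset: integer column; 0 is a placeholder where data is missing.
    x2 = f.create_dataset("x2", data=np.array([2, 2, 0, 2, 2], dtype="i4"))

    # Shadowing dataset: 1 = valid, 0 = N/A (uint8 stands in for a bitfield).
    mask = f.create_dataset(
        "x2_mask",
        data=np.array([1, 1, 0, 1, 1], dtype="u1"),
        compression="gzip",
    )

    # Option 1: decorate the shadowed dataset with a "MASK" attribute
    # holding an object reference to the shadowing dataset.
    x2.attrs["MASK"] = mask.ref
    via_attr = f[x2.attrs["MASK"]][...]  # dereference, then read

    # Option 2: derive the mask's name from the dataset's link name
    # (hypothetical "<name>_mask" convention).
    via_name = f[x2.name + "_mask"][...]

    # Keep only the elements the mask marks as valid.
    valid_values = x2[...][via_attr.astype(bool)]
```

The attribute route survives renaming or relinking of either dataset, while the naming convention needs no extra metadata but only works under the single-link, no-conflict condition described above.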

Best, G.

________________________________________
From: Hdf-forum <hdf-forum-bounces@lists.hdfgroup.org> on behalf of Daniel Rimmelspacher <danervt@hotmail.com>
Sent: Thursday, January 12, 2017 6:15:32 AM
To: HDF Users Discussion List
Subject: [Hdf-forum] Handling incomplete columns for integer datatypes
