Is there an idiomatic way to check if all members of a compound data-type are present in a data-set before reading?
A collaborator sent me a file where one member name of a compound data-type had been misspelled, which led to a junk value filling this field after the H5Dread call. I’m wondering what the best way is to detect this and fail gracefully. I have rigged something up with H5Tget_nmembers and then iterating through H5Tget_member_name. Is there a better way?
In this case, I think the fundamental problem is that what constitutes an error is in the eye of the beholder. We support partial I/O for compound datatypes, which means that you can read or write just a subset of fields. It’s based on matching field names, and non-matches are simply ignored. Many applications rely on this behavior. Doing what you are asking in the library would drastically complicate the error handling of H5D[read,write], because what counts as an error would suddenly depend on the datatypes involved and on user preference.
Assuming you are reading all fields, I think a reasonably reliable check would be to use H5Tequal to compare the in-memory datatype with the native version (H5Tget_native_type) of your in-file datatype (H5Dget_type). If they match, you are good. Otherwise you’ll need a plan B to work out the cause of the mismatch.
I think H5Tequal will not like permutations of fields or datatype conversions. The match has to be “perfect”: same field names, same datatypes, same order.
Okay, sounds like what I’ve got already will suffice then. H5Tequal sounds a bit strict.
It is indeed very tricky to please everyone here. For instance, I’m fine with re-orderings and basic datatype conversions, but I have some fields that are, for example, arrays of three floats, and having two or four would be an error.