Feature request for h5diff

I have just been bitten by an unexpected behavior of h5diff, which I think should be fixed, although it corresponds to what the documentation says.

I used a plain "h5diff file1 file2" to check if my test runs were producing the expected results. Getting no output made me believe everything was fine. In reality, file2 contained nothing at all!

The documented behavior of h5diff is in fact to compare only datasets that exist in both input files. However, h5diff does consider missing datasets to be a difference, since it returns the exit code 1 meaning "data differs".

It seems that even with additional options, there is no way to simply check whether two files have identical contents and get succinct information about the differences. The exit code (a bit messy to check) just says "equal or not". The only way to get information about missing datasets is the -v option, which also produces tons of output.

So here comes my feature request:
a) Ideal version: make h5diff report datasets that exist in only one of the two files by default.
b) If such a change of the documented behavior is not acceptable, provide a simple command line option to get the same result.
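To illustrate the kind of report requested above, here is a minimal sketch in Python. It is not part of h5diff; the function name `compare_object_names` and the dataset names are hypothetical, and it assumes the object names of each file have already been collected by some other means (for example with `h5ls -r` or h5py).

```python
def compare_object_names(names1, names2):
    """Split object names into those found in only one file and those in both.

    names1, names2: iterables of object paths from file1 and file2.
    (Hypothetical helper for illustration; h5diff does not provide it.)
    """
    set1, set2 = set(names1), set(names2)
    return {
        "only_in_file1": sorted(set1 - set2),
        "only_in_file2": sorted(set2 - set1),
        "common": sorted(set1 & set2),
    }


# Example resembling the case described above: file2 contains nothing at all.
report = compare_object_names(["/energy", "/positions", "/velocities"], [])
for name in report["only_in_file1"]:
    print(f"only in file1: {name}")
```

A succinct default report of this kind would have made the empty file2 obvious without the volume of output that -v produces.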

Konrad.


--
---------------------------------------------------------------------
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: research AT khinsen DOT fastmail DOT net
---------------------------------------------------------------------

Hi Konrad,

Thanks for your feedback on the h5diff tool!

Regarding your feature request, we plan to add an option (e.g., --common-obj-only).
With that option, h5diff will compare only objects that have the same name in both files. That sounds the same as the current behavior, but the exit code will then also be determined by the common objects alone.

As for the visual output, I understand that more information could be helpful in cases like this.
We will need to discuss how we can improve it.
Until then, we recommend that users check both the visual output (-r or -v option) and the exit code to determine the result.
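As a rough sketch of that recommendation, the following Python wrapper checks both the exit code and the report output. It assumes the exit status convention from the h5diff documentation (0: no differences, 1: differences found, anything else: error) and that an `h5diff` executable is on the PATH; the function names `interpret_exit_code` and `files_match` are made up for this example.

```python
import subprocess

# Exit status convention assumed from the h5diff documentation.
H5DIFF_STATUS = {0: "identical", 1: "different"}


def interpret_exit_code(code):
    """Map an h5diff exit code to 'identical', 'different', or 'error'."""
    return H5DIFF_STATUS.get(code, "error")


def files_match(file1, file2, h5diff="h5diff"):
    """Run h5diff in report mode (-r) and return (match, output).

    match is True only when the exit code says no differences were found,
    so an empty output alone is never taken as proof of equality.
    """
    result = subprocess.run(
        [h5diff, "-r", file1, file2],
        capture_output=True,
        text=True,
    )
    status = interpret_exit_code(result.returncode)
    if status == "error":
        raise RuntimeError(f"h5diff failed: {result.stderr.strip()}")
    return status == "identical", result.stdout
```

A test script could then call `files_match("file1.h5", "file2.h5")` (file names are placeholders) and fail whenever the first element of the result is False, regardless of whether any output was printed.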

Thank you!

- HDF5 tool team


On 4/21/2011 9:38 AM, Konrad Hinsen wrote:

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

Konrad,

Thank you for reporting the issue. We will take a look at it.

Thanks
--pc
