virtual dataset with hyperslab parameter greater than 32 bit

Hi,

Currently I am trying to create a virtual dataset for our measurement data. At the moment the highest sample rate of the measurement system is 100 Hz, and I would like to collect the daily files into one virtual dataset that starts on 1st January 2000 and ends on 1st January 2030. That gives a total dataset length of 86400 s/day * 100 samples/s * 10958 days = 94,677,120,000 samples, which does not fit into a 32-bit number.
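Just as a cross-check of that arithmetic, here is the calculation as a small C# snippet (the DateTime difference is only there to confirm the 10958 days):

           // Total number of samples from 2000-01-01 to 2030-01-01 at 100 Hz
           ulong days = (ulong)(new DateTime(2030, 1, 1) - new DateTime(2000, 1, 1)).TotalDays; // 10958
           ulong totalSamples = 86400UL * 100UL * days;                                         // 94,677,120,000
           Console.WriteLine(totalSamples > uint.MaxValue);                                     // True: needs more than 32 bits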

I use C# in combination with the HDF.PInvoke NuGet package. The parameters of H5S.select_hyperslab (needed to create a virtual dataset) are of type UInt64, so at this point there is no problem. But when I create such a virtual dataset and later analyze it with h5dump, the start, stride, count and block parameters of the hyperslab are truncated to 32-bit values.

In the C source code (hdf5-1.10.0-patch1) I identified three functions that are related to that problem:

  * H5S_hyper_serial_size (calculates the size needed by a hyperslab when it is serialized to a file)
  * H5S_hyper_serialize (serializes the hyperslab structure to a file)
  * H5S_hyper_deserialize (deserializes the hyperslab structure from a file)

Each function checks whether the hyperslab has any unlimited dimensions. If it does, the hyperslab parameters are serialized as 64-bit numbers; if not, a 32-bit macro is used. I see the same behavior when I inspect the file with a hex editor and locate the start/stride/count/block parameters (in the 32-bit version I could only find the truncated start parameter).

So my question is: Is this a bug or is there a way to enforce the use of the 64-bit macro without modifying the source code?
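One idea I have not tried or verified yet, based purely on the code path described above (the 64-bit encoding is only taken when the selection contains an unlimited dimension), would be to make the virtual selection itself unlimited, e.g. by passing H5S.UNLIMITED as the count (spaceId_VDS here is the same virtual dataspace as in the example below). Note that this changes the meaning of the mapping, so it is only a sketch, not a solution I can recommend:

           // UNVERIFIED idea: an unlimited selection so that, per the analysis above,
           // the serializer might take the 64-bit branch. H5S.UNLIMITED as count is
           // only valid for VDS mappings (HDF5 >= 1.10) and repeats the 10-element
           // block along the dimension, which is not the same mapping as a single
           // block at offset 90000000000.
           H5S.select_hyperslab(spaceId_VDS, H5S.seloper_t.SET,
                                new ulong[] { 90000000000 },     // start
                                new ulong[] { 10 },              // stride (>= block)
                                new ulong[] { H5S.UNLIMITED },   // count: unlimited
                                new ulong[] { 10 });             // block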

I made a short C# example, which should be easily adaptable:

           var fileId = H5F.create(@"C:\Users\Vincent\Desktop\HDF\VDS.h5", H5F.ACC_TRUNC);

           // Source dataspace: 10 elements, extendable
           var spaceId_source = H5S.create_simple(1, new ulong[] { 10 }, new ulong[] { H5S.UNLIMITED });

           // Virtual dataspace: 100,000,000,000 elements -> offsets need more than 32 bits
           var spaceId_VDS = H5S.create_simple(1, new ulong[] { 100000000000 }, new ulong[] { H5S.UNLIMITED });

           var dcpl = H5P.create(H5P.DATASET_CREATE);

           // Map the 10-element source block to offset 90,000,000,000 in the virtual dataset
           H5S.select_hyperslab(spaceId_VDS, H5S.seloper_t.SET, new ulong[] { 90000000000 }, new ulong[] { 1 }, new ulong[] { 1 }, new ulong[] { 10 });
           H5P.set_virtual(dcpl, spaceId_VDS, @"C:\Test.h5", "test_dataset", spaceId_source);

           var datasetId = H5D.create(fileId, "Test", H5T.NATIVE_INT, spaceId_VDS, H5P.DEFAULT, dcpl, H5P.DEFAULT);

           H5S.close(spaceId_VDS);
           H5S.close(spaceId_source);
           H5P.close(dcpl);
           H5D.close(datasetId);
           H5F.close(fileId);

If I then analyze that file with h5dump -p --header "C:\Users\Vincent\Desktop\HDF\VDS.h5", I get the following:

HDF5 "C:\Users\Vincent\Desktop\HDF\file.h5" {
...
VIRTUAL {
               SELECTION REGULAR_HYPERSLAB {
                  START (4100654080)
                  STRIDE (1)
                  COUNT (1)
                  BLOCK (10)
               }
            }
            SOURCE {
               FILE "C:\Test.h5"
               DATASET "test_dataset"
               SELECTION ALL
            }

...
}

so the "start" parameter is:

410065408010

···

=

0x00F46B0400

instead it should be:

9000000000010

=

0x14F46B0400
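In other words, the reported START is exactly the intended offset truncated to its lower 32 bits; a quick C# check (illustration only) confirms this:

           ulong intendedStart = 90000000000;         // 0x14F46B0400
           uint truncatedStart = (uint)intendedStart; // keeps only the low 32 bits
           Console.WriteLine($"{truncatedStart} = 0x{truncatedStart:X10}"); // 4100654080 = 0x00F46B0400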

Hopefully someone can point me in the right direction.

Thank you very much
Vincent

Vincent,

This is a known (to us :wink:) issue. We will address it in our next release.

Elena


Thanks Elena,

do you mean release 1.10.1? I just tested the release candidate HDF5 1.10.1-pre2, but I still have this issue.

Vincent


Vincent,


We plan to have the fix in 1.10.2. Currently the plan is to release it by the end of the year, but I cannot promise that. The fix is quite an effort; we should have a better estimate by the end of summer.

Elena
