Dataspace dimensions vs MATLAB R2006b

Hello all

I recently started using HDF5. I have a question regarding the order of dimensions in a dataspace.

Looking at an example like this one:

http://www.hdfgroup.org/HDF5/Tutor/examples/C/h5_hyperslab.c

Looking at the data being written, it seems the intention is to write a dataset with 5 rows (0th dimension) and 6 columns (1st dimension). This is also the order in which the dimensions are passed to the H5Screate_simple function.
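For reference, here is a minimal C sketch of the kind of write I mean, reconstructed from the dataset name, dimensions and values shown below (error checking omitted; 1.6-style H5Dcreate):

#include "hdf5.h"

int main(void)
{
    hsize_t dims[2] = {5, 6};   /* 5 rows (dim 0), 6 columns (dim 1) */
    int     data[5][6];
    hid_t   file, space, dset;
    int     i, j;

    /* data[j][i] = i + j, which matches the values in the MATLAB output below (transposed) */
    for (j = 0; j < 5; j++)
        for (i = 0; i < 6; i++)
            data[j][i] = i + j;

    file  = H5Fcreate("sds.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    space = H5Screate_simple(2, dims, NULL);
    dset  = H5Dcreate(file, "IntArray", H5T_NATIVE_INT, space, H5P_DEFAULT);

    H5Dwrite(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);

    H5Dclose(dset);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}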

However, when I read this file using MATLAB R2006b, I get:

info = hdf5info('sds.h5'); info.GroupHierarchy.Datasets(1)

ans =
      Filename: 'sds.h5'
          Name: '/IntArray'
          Rank: 2
      Datatype: [1x1 struct]
          Dims: [6 5]
       MaxDims: [6 5]
        Layout: 'contiguous'
    Attributes: []
         Links: []
     Chunksize: []
     FillValue: 0

Dims and MaxDims are the wrong way round. Also, if I do

x = hdf5read('sds.h5', '/IntArray')

x =
           0 1 2 3 4
           1 2 3 4 5
           2 3 4 5 6
           3 4 5 6 7
           4 5 6 7 8
           5 6 7 8 9

What's going on here?

Thanks for any replies.

Regards,

Albert Strasheim


Albert, MATLAB has its roots in column-major ordering, while C uses row-major
ordering. This means by default the matrices will be transposes of what you
might expect if you are thinking in C mode.

If you are using the HDF5READ function, then you should read down at the end
of the help for HDF5READ for the 'V71Dimensions' option.


Hello all

Firstly, thanks John. Some reshaping and reversing of directions allows me to get my data in the proper form in MATLAB.

However, I'm curious to know how HDF users handle this kind of situation in general.

In my case, I'm mostly interested in storing 2-d arrays of floats in HDF files at this point. At any given time my applications might be working with arrays that are in row- or column-major order. However, when I analyze the data written by my application, I'd prefer it if I didn't have to guess at the ordering of the datasets in my HDF files.

I could always write my data with a fixed ordering, but if I give my files to someone else, I would also have to tell them which ordering I chose. Potential for errors abounds!

Another solution that comes to mind is to attach an attribute to each dataset when I write it, indicating whether the data is row- or column-major. That way, when I or someone else reads it in MATLAB or Python or whatever, we can have some code to figure out whether to reverse the dimensions and/or transpose the actual data.
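For example, something along these lines, using the high-level H5LT attribute calls (the attribute name "storage_order" is just a convention I'm inventing here, not anything HDF5 defines):

#include <stdio.h>
#include "hdf5.h"
#include "hdf5_hl.h"   /* H5LT "Lite" convenience functions */

/* Writer side: tag a dataset with a purely conventional "storage_order"
 * attribute so readers know whether to transpose. */
static herr_t tag_storage_order(hid_t file_id, const char *dset_path,
                                const char *order)  /* "row-major" or "column-major" */
{
    return H5LTset_attribute_string(file_id, dset_path, "storage_order", order);
}

/* Reader side: look the attribute up and decide whether to flip. */
static void check_storage_order(hid_t file_id, const char *dset_path)
{
    char order[32];

    if (H5LTget_attribute_string(file_id, dset_path, "storage_order", order) >= 0)
        printf("%s was written %s\n", dset_path, order);
}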

Any thoughts would be appreciated. Thanks.

Regards,

Albert


----- Original Message ----- From: "John Evans" <john.evans@mathworks.com>
To: <hdf-forum@hdfgroup.org>
Sent: Thursday, August 30, 2007 12:06 AM
Subject: Re: Dataspace dimensions vs MATLAB R2006b

Albert, MATLAB has its roots in column-major ordering, while C uses row-major
ordering. This means by default the matrices will be transposes of what you
might expect if you are thinking in C mode.

If you are using the HDF5READ function, then you should read down at the end
of the help for HDF5READ for the 'V71Dimensions' option.


HDF5 stores data in C order (row-wise). You have to use

x = hdf5read(...)'

in Matlab to correct for it. IMO it is a flaw; storage order should be
transparent to the user, as endianness is.

- Dominik


--
Dominik Szczerba, Ph.D.
Computer Vision Lab CH-8092 Zurich
http://www.vision.ee.ethz.ch/~domi


Albert,

At 12:50 AM +0200 8/30/07, Albert Strasheim wrote:

Hello all

Firstly, thanks John. Some reshaping and reversing of directions allows me to get my data in the proper form in MATLAB.

However, I'm curious to know how HDF users handle this kind of situation in general.

In my case, I'm mostly interested in storing 2-d arrays of floats in HDF files at this point. At any given time my applications might be working with arrays that are in row- or column-major order. However, when I analyze the data written by my application, I'd prefer it if I didn't have to guess at the ordering of the datasets in my HDF files.

I could always write my data with a fixed ordering, but if I give my files to someone else, I would also have to tell them which ordering I chose. Potential for errors abounds!

HDF5 interprets the data according to the dataset dimensions that you declared with the H5Screate_simple call. For example, if you create a 2D dataset and declare its dimensions to be 4,5, then the 20 integers in the file will be interpreted as 4 rows of 5 elements.

If the same array is read from Fortran, the HDF5 Fortran library flips the dimensions of the dataset to 5,4, and you will read it into a 5x4 matrix, following the Fortran convention of ordering by columns.

HDF5 doesn't do anything with the user's data.

Does it make sense?


--

------------------------------------------------------------
Elena Pourmal
The HDF Group
1901 So First ST.
Suite C-2
Champaign, IL 61820

epourmal@hdfgroup.org
(217)333-0238 (office)
(217)333-9049 (fax)
------------------------------------------------------------


Hi all,

On Aug 30, 2007, at 1:50 AM, Dominik Szczerba wrote:

HDF5 stores data in C order (row-wise). You have to use

x = hdf5read(...)'

in Matlab to correct for it. IMO it is a flaw; storage order should be
transparent to the user, as endianness is.

  HDF5 does store the data in C order (row-major), so that's what applications should target and expect.

  We (the HDF Group) have talked about (and have provided for in the format specification) mechanisms for storing data in other dimension permutations (the most obvious/common one being column-major), but we've never received any demand from a funding organization to flesh that functionality out and implement it fully. (Contributed code would be fine also! :-)

  Here's a link to the Dataspace chapter of the HDF5 User's Guide:

http://hdfgroup.org/HDF5/doc/UG/12_Dataspaces.html

  Search for "C versus Fortran Dataspaces" in the document for more detailed information and advice.

  Quincey


Albert Strasheim wrote:

However, I'm curious to know how HDF users handle this kind of situation in general.

Another solution that comes to mind is to attach an attribute to each dataset when I write it, indicating whether the data is row- or column-major. That way, when I or someone else reads it in MATLAB or Python or whatever, we can have some code to figure out whether to reverse the dimensions and/or transpose the actual data.

Albert,

I'm not sure there is any general method used for this. The problem, in my mind, is not so much to indicate the row- or column-major attribute of the array (because these ideas become more complex as you move to more than 2 dimensions) but to carry the semantics of each dimension along in the HDF file. Thus, if your 4x5 array has dimensions of time x distance, then the critical information for the application is which dimension is time and which is distance. Applications using the data can then automatically do the right thing based on this information.

HDF4 has this facility built in; however, it was removed in HDF5. There has been some discussion by HDF5 developers about providing this functionality in HDF5. Right now I can't access past presentations on the web site, so I don't have the details handy.

Naming the array dimensions offers another benefit: the same name can be used when the same independent variable appears as an array dimension in other arrays in the same file.
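A very simple hand-rolled version of this is just to attach string attributes naming each dimension (the attribute names below are my own ad hoc convention, not anything standard in HDF5):

#include "hdf5.h"
#include "hdf5_hl.h"

/* Record which dimension of a 2D dataset is which, e.g. time x distance.
 * "DimName0"/"DimName1" are an ad hoc convention, not an HDF5 standard. */
static herr_t name_dimensions(hid_t file_id, const char *dset_path,
                              const char *dim0_name, const char *dim1_name)
{
    if (H5LTset_attribute_string(file_id, dset_path, "DimName0", dim0_name) < 0)
        return -1;
    return H5LTset_attribute_string(file_id, dset_path, "DimName1", dim1_name);
}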

The EOS missions have been using the HDFEOS library, which is built on top of HDF (there are different versions for HDF4 and HDF5). When the version for HDF5 was created, the array dimension attribute capability was "bolted on." (I'm not a developer and I don't know how this was done.) However, support for this version of HDFEOS outside of C and Fortran is poor, and it probably isn't very useful if you're working outside of remote sensing types of projects.

--dan


--
Daniel Kahn
Science Systems and Applications Inc.
301-867-2162


Hi,
I get the following error on an attempt to read an .h5 file:

HDF5-DIAG: Error detected in HDF5 library version: 1.6.5 thread 0. Back
trace follows.
  #000: H5F.c line 2048 in H5Fopen(): unable to open file
    major(04): File interface
    minor(17): Unable to open file
  #001: H5F.c line 1828 in H5F_open(): unable to read superblock
    major(04): File interface
    minor(24): Read failed
  #002: H5Fsuper.c line 113 in H5F_read_superblock(): unable to find
file signature
    major(04): File interface
    minor(19): Not an HDF5 file
  #003: H5F.c line 1280 in H5F_locate_signature(): unable to find a
valid file signature
    major(05): Low-level I/O layer
    minor(29): Unable to initialize object

Matlab, a bit more comprehensibly, says it is not a valid HDF5 file. It
is not an empty file, nor was it truncated by insufficient space on disk.
How can I find out what went wrong?

- Dominik


I agree with the need to provide variables to relate the dimensions of a
dependent data set with those of the independent data sets it is tied
to. Looks like there is some support for that in HDF5 version 1.8 coming
up:

http://www.hdfgroup.uiuc.edu/HDF5/doc_1.8pre/WhatsNew180.html

See the Dimension Scale API. I believe this was done to support the netCDF
version 4 named dimension requirements, since netCDF-4 uses HDF5 under the
hood.
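For anyone curious, here is a rough sketch of what those calls look like (high-level H5DS API from the hdf5_hl library in 1.8; the file layout and names are made up for the example):

#include "hdf5.h"
#include "hdf5_hl.h"   /* H5DS dimension scale API (HDF5 1.8 high-level library) */

/* Assume the file already contains:
 *   "data" - a 2D dataset (time x distance)
 *   "time" - a 1D dataset holding the time coordinate values
 *   "dist" - a 1D dataset holding the distance coordinate values
 */
static void attach_scales(hid_t file_id)
{
    hid_t data = H5Dopen(file_id, "data", H5P_DEFAULT);   /* 1.8-style H5Dopen */
    hid_t time = H5Dopen(file_id, "time", H5P_DEFAULT);
    hid_t dist = H5Dopen(file_id, "dist", H5P_DEFAULT);

    /* Mark the 1D datasets as dimension scales... */
    H5DSset_scale(time, "time");
    H5DSset_scale(dist, "distance");

    /* ...attach them to dimensions 0 and 1 of "data"... */
    H5DSattach_scale(data, time, 0);
    H5DSattach_scale(data, dist, 1);

    /* ...and optionally label the dimensions themselves. */
    H5DSset_label(data, 0, "time");
    H5DSset_label(data, 1, "distance");

    H5Dclose(dist);
    H5Dclose(time);
    H5Dclose(data);
}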

Dan Russell
(425)234-3680


Elena Pourmal wrote:

HDF5 doesn't do anything with the user's data.

Does it make sense?

It does. However, Matlab won't do it. Maybe this should be reported to
The MathWorks (I can't imagine they would wish to change anything, though).

- Dominik



sorry to all, this should have been sent with a dedicated Subject (next
mail).
- DS


Dominik,

If a system error occurred while the file was open, there is a good chance that the HDF5 file became corrupted.

Our group is working on a solution now. Please see http://www.hdfgroup.uiuc.edu/RFC/HDF5/journaling/ for details; currently there is no solution.

Elena


Hi Dominik,


  What were the circumstances when the file was created? Did the creation process get interrupted? Was another process writing to the file at the same time?

  Quincey


Hi Quincey,

Quincey Koziol wrote:

Hi Dominik,


    What were the circumstances when the file was created? Did the
creation process get interrupted? Was another process writing to the
file at the same time?

Good question. This was a condor job. I went through the logs and indeed
found out that the problem occurred when the job was evicted and re-run
on a different host. I am running the vanilla universe (longer story why),
which means condor can send a hard kill to a process, possibly
interrupting it during writing. Thanks for a good question.

So H5Fis_hdf5 can do some good for me here. But of course more general
handling of such cases in HDF would be most welcome as well.
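In the meantime, a minimal check along these lines is what I have in mind (note that H5Fis_hdf5 only checks the file signature, not that every object inside the file is intact):

#include <stdio.h>
#include "hdf5.h"

/* Decide whether a result file from a possibly-evicted job is usable. */
static int result_file_is_usable(const char *path)
{
    htri_t status = H5Fis_hdf5(path);   /* >0: HDF5 file, 0: not HDF5, <0: error */

    if (status <= 0) {
        fprintf(stderr, "%s is missing or not a valid HDF5 file\n", path);
        return 0;   /* redo the last iteration */
    }
    return 1;
}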

Thanks again,
Dominik


Hi,
I have written two small C++ classes, an HDF5 reader and a writer. Example
usage:

  int ok = HDF5Reader::existsValidHdf5(argv[1]);

...

  int DIM1 = 10;
  vector<int> dims1(2);
  dims1[0] = 5, dims1[1] = 2;
  string type1 = "double";

...

  HDF5Writer writer;
  writer.loud=1;
  writer.open("test.h5");
  writer.compression = 9;
  writer.write(&x1[0], dims1, "x1");
...
  writer.createGroup("test");
...
  writer.close();

...

  vector<double> x11;
  vector<int> dims;
  int size = 0;
  HDF5Reader reader;
  reader.loud = 1;
  reader.open("test.h5");
  reader.getExtents("x1",dims);
  size = product(&dims[0],dims.size());
  x11.resize(size);
  reader.read(&x11[0], "x1");
  reader.close();

It was inspired by MATLAB's hdf5read/hdf5write, but in addition it supports
zlib compression. It works with native C arrays/pointers and can be easily
used with higher-level containers like std::vector or blitz++ (whatever
provides a pointer to the first element in memory).

If there was interest I could publish the code.

- Dominik


On Aug 31, 2007, at 11:50 AM, Dominik Szczerba wrote:

Good question. This was a condor job. I went through the logs and indeed
found out that the problem occurred when the job was evicted and re-run
on a different host. I am running the vanilla universe (longer story why),
which means condor can send a hard kill to a process, possibly
interrupting it during writing. Thanks for a good question.

  Hmm, interesting... Can you arrange for your process to catch the kill signal and call H5Fflush()? Or maybe condor has a mechanism for registering a callback it can trigger before migrating a job?
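One way to arrange that is sketched roughly below (it assumes the eviction delivers a catchable SIGTERM before the hard kill, which is worth verifying for your condor setup; HDF5 calls are not async-signal-safe, so the handler only sets a flag and the flush happens in the main loop):

#include <signal.h>
#include "hdf5.h"

static volatile sig_atomic_t got_term = 0;

static void on_term(int sig) { (void)sig; got_term = 1; }

int main(void)
{
    hid_t file = H5Fcreate("results.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    signal(SIGTERM, on_term);

    while (!got_term /* && simulation not finished */) {
        /* ... compute one iteration and write it to the file ... */

        /* Flush after each iteration so the on-disk metadata stays consistent
         * even if the process is killed before the next one completes. */
        H5Fflush(file, H5F_SCOPE_GLOBAL);
    }

    H5Fclose(file);   /* clean close if the soft kill arrived in time */
    return 0;
}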

So H5Fis_hdf5 can do some good for me here. But of course more general
handling of such cases in HDF would be most welcome as well.

  The more general solution will be the "crash proofing" work that Elena posted about earlier. That will address these sorts of issues, at least for metadata modifications.

  Quincey


Quincey Koziol wrote:

    Hmm, interesting... Can you arrange for your process to catch the
kill signal and call H5Fflush()? Or maybe condor has a mechanism for
registering a callback it can trigger before migrating a job?

Even better would be to simply run the jobs in the standard universe, only
condor_compile currently fails for me. H5Fis_hdf5() currently works fine
for me; I simply repeat the last iteration if the file is not valid.

thanks a lot for the useful feedback,
Dominik


Hi,
1) Is it possible to know the rank of a dataset from a single call to
H5LTget_dataset_info? I know this is possible with h5ltget_dataset_ndims_f,
but I believe it should also be possible without it (otherwise it is
returning a pointer to an array of unknown size?)
2) Is the hsize_t* dims argument allocated by H5LTget_dataset_info, or should it be
preallocated? (This is a general remark; this kind of information is missing
from many manual pages.)
thanks a lot and best regards,


--
Dominik Szczerba, Ph.D.
Computer Vision Lab CH-8092 Zurich
http://www.vision.ee.ethz.ch/~domi

Hello

At 07:01 AM 12/28/2007, Dominik Szczerba wrote:

Hi,
1) Is it possible to know the rank of a dataset from a single call to
H5LTget_dataset_info? I know this is possible with h5ltget_dataset_ndims_f,
but I believe it should also be possible without it (otherwise it is
returning a pointer to an array of unknown size?)

The corresponding C function is H5LTget_dataset_ndims.
Check out the manual page at

http://www.hdfgroup.uiuc.edu/HDF5/doc_1.8pre/doc/HL/RM_H5LT.html

2) Is the hsize_t* dims argument allocated by H5LTget_dataset_info, or should it be
preallocated? (This is a general remark; this kind of information is missing
from many manual pages.)
thanks a lot and best regards,

It should be preallocated.

The typical usage is to get the rank with H5LTget_dataset_ndims, allocate a buffer of that size, and then call H5LTget_dataset_info.

for example

int      rank;
hsize_t *dims;

/* First get the rank... */
H5LTget_dataset_ndims(file_id, "dset_name", &rank);

/* ...then allocate one hsize_t per dimension... */
dims = malloc(rank * sizeof(hsize_t));

/* ...and fill in the dimensions (type class/size not requested here). */
H5LTget_dataset_info(file_id, "dset_name", dims, NULL, NULL);

free(dims);

Pedro

--------------------------------------------------------------
Pedro Vicente Nunes
HDF5 tools main developer
phone: (217)-265-0311
pvn@hdfgroup.org