HDF5 chunking mechanism

Hi all,

I'm trying to write out a 512^3 float32 dataset using 64^3 chunk
dimensions. I'm seeing the H5Dwrite call break into many 256-byte write
calls (64 floats) instead of what I expected: fewer 1048576-byte write
calls (64^3 floats). Before I delve into this further, I just wanted
to verify that HDF5 should be using the 1048576-byte writes in this
particular chunking case. This is a new case for me, because in the
past I've almost always had the x-dimension of the chunks equal to
the x-dimension of the hyperslab.

Thanks,
Mark


----------------------------------------------------------------------
This mailing list is for HDF software users discussion.
To subscribe to this list, send a message to hdf-forum-subscribe@hdfgroup.org.
To unsubscribe, send a message to hdf-forum-unsubscribe@hdfgroup.org.

On Thursday 11 June 2009 17:34:46, Mark Howison wrote:

> Hi all,
>
> I'm trying to write out a 512^3 float32 dataset using 64^3 chunk
> dimensions. I'm seeing the H5Dwrite call break into many 256-byte write
> calls (64 floats) instead of what I expected: fewer 1048576-byte write
> calls (64^3 floats). Before I delve into this further, I just wanted
> to verify that HDF5 should be using the 1048576-byte writes in this
> particular chunking case. This is a new case for me, because in the
> past I've almost always had the x-dimension of the chunks equal to
> the x-dimension of the hyperslab.

Mmm, that should not happen. How are you determining that the H5Dwrite
calls break into many 256-byte writes? If you can provide an example of
your code, perhaps we will be able to help you more.

Cheers,


--
Francesc Alted


Hi Francesc,

I'm detecting the 256-byte writes using POSIX traces from a performance
profiling tool called IPM that we use at NERSC:

http://www.nersc.gov/nusers/systems/franklin/tools.php#ipm

I attached the C code I am using, but it uses H5Part, a veneer API we
have built on top of HDF5. I'll try to break down the H5Part calls in
terms of their corresponding HDF5 functionality:

H5PartOpenFileParallelAlign - opens the file with the MPI-POSIX VFD
and sets the alignment to 1MB using the H5Pset_alignment function

H5PartSetStep - creates an HDF5 group called "/Step#N"

H5BlockDefine3DChunk - stores the chunk dimensions in an H5Part data structure

H5BlockDefine3DLayout - creates HDF5 dataspaces

H5BlockWriteFieldFloat32 - creates a group off of "/Step#N" and a 3D
dataset, with the chunk dimensions in a creation property list, then
calls H5Dwrite using the defined dataspaces
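Roughly, the file-open step corresponds to plain HDF5 calls like the
following (a sketch from memory, not the actual H5Part source; assumes
the HDF5 1.8 API with MPI, and the function name is mine):

```c
/* Sketch of what H5PartOpenFileParallelAlign is assumed to do:
 * open with the MPI-POSIX VFD and request 1 MB alignment.
 * Compile against MPI and HDF5 (e.g. with h5pcc). */
#include <mpi.h>
#include <hdf5.h>

hid_t open_aligned(const char *name, MPI_Comm comm)
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpiposix(fapl, comm, 0);     /* MPI-POSIX VFD */
    H5Pset_alignment(fapl, 0, 1024 * 1024);  /* align objects to 1 MB */
    hid_t file = H5Fcreate(name, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
    H5Pclose(fapl);
    return file;
}
```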

If you'd like to browse the source code for H5Part, you can find it at:

https://codeforge.lbl.gov/projects/h5part/

Thanks,
Mark

convert-mpi.c (7.26 KB)


On Fri, Jun 12, 2009 at 1:52 AM, Francesc Alted <faltet@pytables.org> wrote:

> On Thursday 11 June 2009 17:34:46, Mark Howison wrote:
>
> > Hi all,
> >
> > I'm trying to write out a 512^3 float32 dataset using 64^3 chunk
> > dimensions. I'm seeing the H5Dwrite call break into many 256-byte write
> > calls (64 floats) instead of what I expected: fewer 1048576-byte write
> > calls (64^3 floats). Before I delve into this further, I just wanted
> > to verify that HDF5 should be using the 1048576-byte writes in this
> > particular chunking case. This is a new case for me, because in the
> > past I've almost always had the x-dimension of the chunks equal to
> > the x-dimension of the hyperslab.
>
> Mmm, that should not happen. How are you determining that the H5Dwrite
> calls break into many 256-byte writes? If you can provide an example of
> your code, perhaps we will be able to help you more.
>
> Cheers,
>
> --
> Francesc Alted


On Monday 15 June 2009 16:50:44, Mark Howison wrote:

> Hi Francesc,
>
> I'm detecting the 256-byte writes using POSIX traces from a performance
> profiling tool called IPM that we use at NERSC:
>
> http://www.nersc.gov/nusers/systems/franklin/tools.php#ipm
>
> I attached the C code I am using, but it uses H5Part, a veneer API we
> have built on top of HDF5. I'll try to break down the H5Part calls in
> terms of their corresponding HDF5 functionality:
>
> H5PartOpenFileParallelAlign - opens the file with the MPI-POSIX VFD
> and sets the alignment to 1MB using the H5Pset_alignment function
>
> H5PartSetStep - creates an HDF5 group called "/Step#N"
>
> H5BlockDefine3DChunk - stores the chunk dimensions in an H5Part data
> structure
>
> H5BlockDefine3DLayout - creates HDF5 dataspaces
>
> H5BlockWriteFieldFloat32 - creates a group off of "/Step#N" and a 3D
> dataset, with the chunk dimensions in a creation property list, then
> calls H5Dwrite using the defined dataspaces

OK, but a self-contained file that does the same thing using the plain
HDF5 C API would be far easier for us to look at. Could you provide this,
please?
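A minimal self-contained program along these lines (my own reconstruction
from the thread, serial rather than MPI-POSIX, HDF5 1.8 C API; the file
and dataset names are assumptions) might look like:

```c
/* 512^3 float32 dataset with 64^3 chunks; write one chunk-aligned
 * hyperslab, which should ideally reach the file as a single
 * 1048576-byte write.  Compile with h5cc. */
#include <hdf5.h>
#include <stdlib.h>

#define N 512   /* dataset edge length */
#define C  64   /* chunk edge length   */

int main(void)
{
    hsize_t dims[3]  = {N, N, N};
    hsize_t cdims[3] = {C, C, C};

    /* 1 MB alignment, as H5PartOpenFileParallelAlign requests. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_alignment(fapl, 0, 1024 * 1024);
    hid_t file = H5Fcreate("chunk-test.h5", H5F_ACC_TRUNC,
                           H5P_DEFAULT, fapl);

    /* Chunked dataset creation. */
    hid_t space = H5Screate_simple(3, dims, NULL);
    hid_t dcpl  = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 3, cdims);
    hid_t dset  = H5Dcreate(file, "data", H5T_NATIVE_FLOAT, space,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);

    /* Select a 64^3 hyperslab coinciding with one chunk. */
    hsize_t start[3] = {0, 0, 0};
    hsize_t count[3] = {C, C, C};
    H5Sselect_hyperslab(space, H5S_SELECT_SET, start, NULL, count, NULL);
    hid_t mspace = H5Screate_simple(3, cdims, NULL);

    float *buf = calloc((size_t)C * C * C, sizeof *buf);
    H5Dwrite(dset, H5T_NATIVE_FLOAT, mspace, space, H5P_DEFAULT, buf);

    free(buf);
    H5Sclose(mspace);
    H5Dclose(dset);
    H5Pclose(dcpl);
    H5Sclose(space);
    H5Fclose(file);
    H5Pclose(fapl);
    return 0;
}
```

Running this under an I/O tracer (strace, IPM) would show directly whether
the H5Dwrite turns into one 1 MB write or many 256-byte writes.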


--
Francesc Alted
