Parallel I/O Compression

Hello,

I am currently writing collectively to an HDF5 file in parallel using
chunks, where each processor writes its subdomain as a chunk of a full
dataset. I have this working correctly using hyperslabs; however, the
file size is very large [about 18x larger than if the file were created
with sequential HDF5 and H5Pset_deflate(plist_id, 6)]. If I try to apply
this routine to the property list while performing parallel I/O, HDF5
reports that the feature is not yet supported (I am using v1.8.10). Is
there any way to compress the file during a parallel write?
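
For reference, here is a minimal sketch of this kind of collective, chunked
hyperslab write (a 2-D example; the file name, dataset name and dimensions
are illustrative, not the actual model code). The commented-out
H5Pset_deflate call marks where the compression filter would go, which is
exactly what 1.8.10 rejects for parallel I/O:

    /* Sketch: each rank writes its subdomain as one chunk of the full
     * 2-D dataset.  All names and sizes are illustrative.             */
    #include "hdf5.h"
    #include <mpi.h>

    void write_subdomain(MPI_Comm comm, const double *local,
                         hsize_t nxg, hsize_t nyg,      /* global dims  */
                         hsize_t nxl, hsize_t nyl,      /* local dims   */
                         hsize_t x0,  hsize_t y0)       /* local offset */
    {
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_mpio(fapl, comm, MPI_INFO_NULL);
        hid_t file = H5Fcreate("fields.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

        hsize_t gdims[2] = { nxg, nyg };
        hsize_t cdims[2] = { nxl, nyl };          /* one chunk per rank */
        hid_t filespace = H5Screate_simple(2, gdims, NULL);

        hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
        H5Pset_chunk(dcpl, 2, cdims);
        /* H5Pset_deflate(dcpl, 6);   <- rejected when the file is opened
         *                               for parallel access in 1.8.x    */
        hid_t dset = H5Dcreate2(file, "theta", H5T_NATIVE_DOUBLE, filespace,
                                H5P_DEFAULT, dcpl, H5P_DEFAULT);

        hsize_t start[2] = { x0, y0 }, count[2] = { nxl, nyl };
        H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, count, NULL);
        hid_t memspace = H5Screate_simple(2, count, NULL);

        hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
        H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
        H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, local);

        H5Pclose(dxpl); H5Sclose(memspace); H5Sclose(filespace);
        H5Pclose(dcpl); H5Dclose(dset); H5Fclose(file); H5Pclose(fapl);
    }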

Thank you,
Rob


This is a compression issue rather than an HDF5 one:
you could look into parallel versions of the usual compressors (pigz, pbzip2, ...).

hth,
Jerome

···


Thank you for the response, Jerome. Is this not an HDF5 issue because it
is simply not possible with HDF5? I would rather not have to compress the
.h5 file after it has been created.

Rob

···


Hello,


HDF5 can compress data: there is a default compressor (gzip) and you can
register your own through some code; code examples can be found for bzip2.
If you are a confident C coder, you can easily implement an xz filter, and
you can certainly implement parallel versions of those codes. (I use my own
bzip2 and xz compression code within HDF5, but I have not yet parallelized
it, for lack of time.)

I think it is a bad idea to compress .h5 files from the outside: it is
better to compress within. Note that you can drastically improve the
compression ratio by applying suitable filters to the data first.
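
For example (a minimal sketch; the chunk shape is illustrative), combining
the built-in shuffle filter with deflate on the dataset creation property
list usually improves the ratio noticeably for floating-point data:

    hid_t   dcpl     = H5Pcreate(H5P_DATASET_CREATE);
    hsize_t chunk[3] = { 1, 64, 64 };   /* illustrative chunk shape        */
    H5Pset_chunk(dcpl, 3, chunk);
    H5Pset_shuffle(dcpl);               /* byte-shuffle before compression */
    H5Pset_deflate(dcpl, 6);            /* gzip level 6                    */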

Alongside pigz and pbzip2, there is also pxz.

Jerome

···


Hi Robert and Jerome,

The sequential HDF5 library can write and read compressed data. Parallel HDF5 can read a compressed dataset using several processes, but it cannot write to a compressed dataset.

Writing compressed data in parallel is a feature that is often requested, but unfortunately we do not have funding to implement it. But before (or actually after ;-)) talking about funding, we really need to gather requirements for this feature.

All,

Enabling writing of compressed data in the parallel HDF5 library will require a lot of prototyping and a substantial development effort. We would like to hear from you if you think this feature is absolutely critical for your application. We would also like to learn more about the write patterns your applications use.

In Robert's example each process writes one chunk of an HDF5 dataset. This special case may be a little easier to address than the general case, in which the data for a chunk may be distributed among several processes. It would be good to know whether this particular scenario is common. What other I/O patterns are commonly used?

Knowing more about the I/O patterns will help us understand the approach we might take in going forward with the design and implementation of writing compressed HDF5 datasets in parallel (and the cost, of course!).

Thank you!

Elena


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Elena Pourmal The HDF Group http://hdfgroup.org
1800 So. Oak St., Suite 203, Champaign IL 61820
217.531.6112
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Hi Elena,

Thank you for the clarification. Following on from Jerome's suggestion of
installing my own compression algorithm (or library) within the routines
where I call HDF5, is it possible to use H5Zregister and H5Pset_filter to
define another compression type (e.g. bzip2)? Or would this still not work
in parallel? If not, I guess a possible (though not efficient) solution
would be to rewrite the file with compression after it has been created,
with each dataset as one chunk, assuming there is adequate memory to hold
each dataset without parallelization. Has anyone done something similar?

As for your bigger question, I can give you some insight into atmospheric
models like the one I am currently working on. Generally, these models use
parallelization to break up three-dimensional grids (x, y, z) into
subdomains of vertical columns, where every processor has its own portion
of the atmosphere (the vertical coordinate is usually not subdivided) that
can then be integrated in time. Every so often, the full grids need to be
written out for postprocessing and analysis (or, conversely, read into the
model for a history restart, etc.). This is where most atmospheric models
take a similar approach to mine, with each subdomain writing its own
"chunk" of the atmosphere as a hyperslab of the larger dataset. The number
of datasets is usually large (my model has ~250 2D, 3D, and 4D fields that
are subdivided in x and y). For large simulations that require
parallelization, each file can be tens of GB, amounting to many TB for one
simulation even when compressed, so compression is necessary! I hope this
helps.
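
As a concrete sketch of that pattern (illustrative only; it assumes the y
and x extents divide evenly over the process grid), each rank's hyperslab
in a (z, y, x) dataset decomposed in x and y would look like this:

    #include "hdf5.h"

    /* Sketch: map a rank in a px-by-py process grid to the hyperslab it
     * owns in a (z, y, x) dataset; the vertical coordinate is not split. */
    void column_hyperslab(int rank, int px, int py,
                          hsize_t nz, hsize_t ny, hsize_t nx,
                          hsize_t start[3], hsize_t count[3])
    {
        int ix = rank % px;                 /* position in the process grid */
        int iy = rank / px;

        count[0] = nz;                      /* full vertical column         */
        count[1] = ny / (hsize_t)py;        /* assumes even division        */
        count[2] = nx / (hsize_t)px;

        start[0] = 0;
        start[1] = (hsize_t)iy * count[1];
        start[2] = (hsize_t)ix * count[2];
        /* then: H5Sselect_hyperslab(filespace, H5S_SELECT_SET,
         *                           start, NULL, count, NULL);             */
    }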

Thanks again,
Rob

···


H5Zregister and its friends are what you need.
I have just had a look at the pxz (C) code, which parallelizes xz via OpenMP: it looks quite easy;
I am fairly sure you can adapt it to register (with H5Zregister) an OMP-parallelized xz compressor.
Before writing your own xz filter, you may want to consider (and try) the other existing filters.
In any case, compress within HDF5, not outside (that is, avoid compressing the .h5 file itself:
it is likely to be less efficient).
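
For reference, a skeleton of that mechanism (the compression calls are left
as stubs; 307 is the ID conventionally registered for bzip2, and, per
Elena's point above, such a filter still only applies to sequential
writes):

    #include "hdf5.h"
    #include <stdlib.h>

    #define MY_FILTER_ID 307            /* registered bzip2 filter ID */

    /* H5Z_func_t callback: compress on write, decompress on read. */
    static size_t my_filter(unsigned int flags, size_t cd_nelmts,
                            const unsigned int cd_values[], size_t nbytes,
                            size_t *buf_size, void **buf)
    {
        (void)cd_nelmts; (void)cd_values; (void)buf_size; (void)buf;
        if (flags & H5Z_FLAG_REVERSE) {
            /* decompress *buf into a new buffer, free the old one,
             * update *buf and *buf_size, return the decompressed size  */
        } else {
            /* compress *buf; return 0 on failure so HDF5 flags an error */
        }
        return nbytes;                  /* stub: data passed through unchanged */
    }

    static const H5Z_class2_t my_filter_class = {
        H5Z_CLASS_T_VERS, MY_FILTER_ID,
        1, 1,                           /* encoder and decoder present */
        "example bzip2-style filter",
        NULL, NULL, my_filter
    };

    hid_t make_compressed_dcpl(int ndims, const hsize_t *chunk)
    {
        hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
        H5Zregister(&my_filter_class);
        H5Pset_chunk(dcpl, ndims, chunk);
        H5Pset_filter(dcpl, MY_FILTER_ID, H5Z_FLAG_MANDATORY, 0, NULL);
        return dcpl;
    }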

My 2 cents,
Jerome

···


You may have a look here:

http://www.hdfgroup.org/services/filters.html

···


Hi Elena,
For our direct numerical simulations of compressible and hypersonic
turbulence we take a domain decomposition approach similar to Rob's: the
3-D spatial data array is subdivided over a 2-D domain decomposition
across the computational elements. Each element receives a wall-normal
column of data that is i by j by k in dimension. If I, J, K are the
dimensions of the entire flow field, then i ~ I/N, j ~ J/M, and k = K;
here K indexes the wall-normal coordinate.

The wall-normal direction is the stiffest direction, and sometimes it is
treated implicitly while the other two directions are treated explicitly.
When a strong radiating shock layer is added, additional radiative physics
must be added to each of these columns, which might be implemented in an
embarrassingly parallel fashion depending on how the radiation is modeled.

As far as chunks go, right now I am not chunking the dataset, which is
probably sub-optimal; it is just stored as one large block of I-J-K
(Fortran/column-major) ordered data. The simulations create a huge amount
of data, and I have not been able to decide whether to optimize the I/O
for writing during the simulation or for the various postprocessing tasks.
The issue is that, in much of the statistical postprocessing, the data can
be reduced along the statistically homogeneous directions (time and the J
index) before proceeding further.

At any rate, parallel compression would force me to commit to a chunking
scheme, and it would help ease our storage pains, which are numerous.
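
If a chunking scheme were adopted, one natural (hypothetical) choice would
be one chunk per rank's wall-normal column, using N and M for the
process-grid dimensions as above:

    #include "hdf5.h"

    /* Hypothetical: one chunk per i x j x K column.  The field is stored in
     * Fortran (column-major) order, so in HDF5's C-order dimension list the
     * global size I x J x K appears as {K, J, I}.                           */
    hid_t column_chunked_dcpl(hsize_t I, hsize_t J, hsize_t K,
                              hsize_t N, hsize_t M)
    {
        hsize_t chunk[3] = { K, J / M, I / N };  /* k kept whole, i and j split */
        hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
        H5Pset_chunk(dcpl, 3, chunk);
        return dcpl;
    }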

Izaak Beekman


===================================
(301)244-9367
UMD-CP Visiting Graduate Student
Aerospace Engineering
ibeekman@umiacs.umd.edu
ibeekman@umd.edu
