Slow conversion to binary using h5dump

Hi, I tried converting a 108GB HDF5 file to binary using the "-b LE"
flag in h5dump, but it ran at a crawling pace, only about 4MB/s. This
is in comparison to an h5copy I did on the same machine (our SGI
Altix) that ran at 600MB/s. The filesystem is GPFS. Any ideas why
h5dump is having so much trouble? Is there a conversion phase (to LE)
that is bogging things down? Thanks, Mark
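P.S. In case it helps, the command was along these lines (the output file name here is just an example):

h5dump -d /Step#0/Block/Analyze7.5/0 -b LE -o output.bin combustion.h5part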

Hi,

h5dump is different from h5copy.

What is the original data format, BE or LE?
If it's BE, could you try -b BE and see if there is any performance difference?

And could you try with a smaller HDF5 file (under 10GB)?

Also, could you try other filesystems (non-parallel as well)?

Since this is a performance issue rather than a specific bug, more test results
would be helpful.

Thanks.

- Jonathan

Mark,

h5dump performance may be affected by many factors (the size of the h5dump default read buffer, the chunk sizes of the dataset, compression, etc.).
Would it be possible for you to do h5dump -p -H -d .... to print the header information for the dataset you are trying to export? We may then have a better idea of what is going wrong.

Thank you!

Elena

I tried both BE and LE, and they are equally slow. Here is the header
info. I should note that the dataset is roughly 108GB, but it does fit
into local memory (196GB is available). Also, it seems to write
continuously at 4MB/s, instead of sitting and processing for a while
and then bursting at 100MB/s or so. The dataset is also chunked. Maybe
that is causing problems, because h5dump has to jump around to
non-contiguous offsets in order to assemble the binary output contiguously?

Thanks,
Mark

mhowison@davinci:/project/projectdirs/vacet/mark> h5dump -p -H -d
/Step#0/Block/Analyze7.5/0 combustion.h5part
HDF5 "combustion.h5part" {
DATASET "/Step#0/Block/Analyze7.5/0" {
   DATATYPE H5T_IEEE_F32LE
   DATASPACE SIMPLE { ( 3072, 3072, 3072 ) / ( 3072, 3072, 3072 ) }
   STORAGE_LAYOUT {
      CHUNKED ( 1024, 768, 768 )
      SIZE 115964116992
    }
   FILTERS {
      NONE
   }
   FILLVALUE {
      FILL_TIME H5D_FILL_TIME_IFSET
      VALUE 0
   }
   ALLOCATION_TIME {
      H5D_ALLOC_TIME_EARLY
   }
}
}
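(Sanity check on the sizes: 3072 x 3072 x 3072 elements x 4 bytes = 115,964,116,992 bytes, which matches the SIZE line above and works out to 108GB in binary units.)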

Mark,

Thank you for the output! We will need to look more closely at what the library and h5dump are doing in order to give a definitive answer. I will enter an enhancement report in our issue database.

For now, if you can recompile the source code, please try changing H5TOOLS_BUFSIZE (in the h5tools.h file in the tools/lib directory); make it at least 4MB. That may help with the "jumping around" and with writing the binary output file.
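As a rough sketch (the exact original definition may differ between releases), the edit would look something like this:

/* tools/lib/h5tools.h -- hypothetical edit; adjust to match your source tree */
#define H5TOOLS_BUFSIZE (4 * 1024 * 1024)   /* default is 1MB; try 4MB or larger */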

Elena

Mark,

The default buffer size used in h5dump is 1MB. Setting it to 4MB will improve
the performance, but it may still be slow because the buffer size is much
smaller than the chunk size.
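For this dataset, each 1024 x 768 x 768 chunk of 4-byte floats is 1024 x 768 x 768 x 4 = 2,415,919,104 bytes (about 2.25GB), so a 1MB (or even 4MB) read buffer covers only a small slice of each chunk.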

If you use h5repack to change the chunk size to 1x64x3072 (a little less than 1MB)
and try h5dump again, you should see the difference.
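Something like the following should do it (the output file name is just an example; please check h5repack -h for the exact -l syntax in your release):

h5repack -l /Step#0/Block/Analyze7.5/0:CHUNK=1x64x3072 combustion.h5part combustion_rechunked.h5part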

Thanks
--pc

Hi Peter and Elena,

I agree that the problem is most likely the chunking. I tried
repacking the dataset, but this was as slow as running h5dump on the
chunked dataset (probably for the same reason: non-contiguous disk
access on Lustre). So I regenerated my dataset without chunking, and I
was able to see about 10x better throughput with h5dump to binary (the
bottleneck at this point is probably in Lustre and not in h5dump). I
also tried increasing the buffer size to 4MB but found the effect was
negligible.

Thanks for your help,
Mark

Mark,

Thank you for trying a bigger buffer. It is good to know the result.

Elena

Hi Mark,

Did you try regenerating the dataset with a chunk size of 1x64x3072?
I believe the performance should then be close to the unchunked case.
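On the writer side, the change is just the chunk dimensions passed to the dataset creation property list; a minimal sketch with the HDF5 C API (identifiers are illustrative):

hsize_t chunk_dims[3] = {1, 64, 3072};        /* about 0.75MB of 4-byte floats per chunk */
hid_t   dcpl = H5Pcreate(H5P_DATASET_CREATE); /* dataset creation property list */
H5Pset_chunk(dcpl, 3, chunk_dims);
/* pass dcpl to H5Dcreate2() when creating the dataset, then H5Pclose(dcpl) */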

Thanks
--pc
