Asynchronous I/O

As a total newbie to HDF, I would like to know: does this format support
asynchronous I/O, and does it have calls to synchronize read-aheads and write-aheads?

Thank you!

Naomi Greenberg

Principal Member of the Research Staff

www.riversideresearch.org

T: 212.502.1718 | F: 212.502.1729 |

Riverside Research | 156 William Street | New York, N.Y. 10038

Hi,

The HDF5 library does not support asynchronous I/O at this time. We are looking into including async I/O support in a future release, however.

If you'd like to hurry this work along with financial support, there are people here you should talk to :)

Dana

···

From: Hdf-forum [mailto:hdf-forum-bounces@lists.hdfgroup.org] On Behalf Of Greenberg, Naomi
Sent: Tuesday, May 27, 2014 12:40 PM
To: Hdf-forum@lists.hdfgroup.org
Subject: [Hdf-forum] Asynchronous I/O

Hi,

The HDF5 library does not support asynchronous I/O at this time. We are looking into including async I/O support in a future release, however.

I've attached a document that describes our current ideas in this space.

Quincey

AsyncIO-MultiThreadingHDF5.docx (30.3 KB)

···

Good read. Just how compute bound is HDF5, anyway? I'm always living in a land of large datasets, where library overhead is dwarfed by the I/O workload overhead.

You did not mention the multi-dataset I/O approach. It's a half-step towards asynchronism (or maybe a half-step backwards): instead of decoupling the description of the data from the execution of the I/O, HDF5's multi-dataset routines will describe more data in a single call.

I don't think the global HDF5 lock precludes an async approach. Probably this async facility should exist on top of HDF5, though, and can provide the caching, read-ahead, coalescing, and other benefits while leaving the bulk of the 300k lines of C code untouched. In my head it's MPI_THREAD_FUNNELED for HDF5.

The various ways one can manage MPI progress are instructive here.
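To make the "funneled" idea concrete, here is a minimal sketch of an async layer sitting on top of a blocking library. It is plain Python with only the standard library and does not call HDF5 at all: `blocking_write`, `store`, and the `dset*` names are hypothetical stand-ins for a call such as H5Dwrite and its target file. One worker thread makes every library call, and callers synchronize by waiting on the returned futures.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a blocking HDF5 call such as H5Dwrite; it only
# records the data and which thread performed the "write".
store = {}
io_threads = set()

def blocking_write(name, data):
    io_threads.add(threading.get_ident())
    store[name] = list(data)
    return len(data)

# One worker thread: every library call is funneled through it, so the
# library itself never sees concurrent callers.
io = ThreadPoolExecutor(max_workers=1)

# Describe the I/O now; it executes in the background.
futures = [io.submit(blocking_write, f"dset{i}", range(i + 1)) for i in range(4)]

# ... the application can compute here while the writes proceed ...

# Synchronization point: wait for every outstanding request to complete.
written = [f.result() for f in futures]
io.shutdown(wait=True)

print(written)          # element counts, one per request
print(len(io_threads))  # number of distinct threads that touched the store
```

Because the library only ever runs on the single worker thread, a global lock inside it is never contended; caching, read-ahead, or coalescing would have to be added in the layer that owns the queue.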

==rob

···

On 05/27/2014 02:45 PM, Quincey Koziol wrote:

On May 27, 2014, at 2:43 PM, Dana Robinson <derobins@hdfgroup.org> wrote:

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

On 27.05.2014 23:45, Quincey Koziol wrote:

Hi,

The HDF5 library does not support asynchronous I/O at this time. We
are looking into including async I/O support in a future release, however.

My experience suggests that it would probably be enough if just the following functions avoided the global lock or had async equivalents:

1) H5Dread
2) H5Dwrite
3) H5Ocopy

Best wishes,
Andrey Paramonov

···

Hi Rob,

Good read. Just how compute bound is HDF5, anyway? I'm always living in a land of large datasets, where library overhead is dwarfed by the I/O workload overhead.

  Generally speaking, HDF5 is not compute bound. It's only when an application asks for a compute-oriented task that something could be expensive (datatype conversion, compression, etc).

You did not mention the multi-dataset I/O approach. It's a half-step towards asynchronism (or maybe a half-step backwards): instead of decoupling the description of the data from the execution of the I/O, HDF5's multi-dataset routines will describe more data in a single call.

  I think multi-dataset reads/writes are neutral on the asynchrony axis - a multi-dataset I/O operation could be made asynchronous in the same way as any other operation that touches the file.

I don't think the global HDF5 lock precludes an async approach. Probably this async facility should exist on top of HDF5, though, and can provide the caching, read-ahead, coalescing, and other benefits while leaving the bulk of the 300k lines of C code untouched. In my head it's MPI_THREAD_FUNNELED for HDF5.

  I think there's actually a good case for pushing a portion of the asynchrony inside the HDF5 library, since it allows existing applications (which aren't using async I/O variants of the API routines) to get the benefit of asynchronous metadata operations (i.e., flushing dirty metadata to the file in the background).
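As a rough illustration of background flushing, here is a toy write-behind cache in plain Python. It is only a sketch under stated assumptions, not how HDF5's metadata cache works: `WriteBehindCache`, `backing`, and the `meta*` keys are hypothetical names, and a dict stands in for the file. Callers return immediately from `put`, a background thread writes dirty entries out, and `close` is the synchronization point.

```python
import threading

class WriteBehindCache:
    """Toy cache: updates are marked dirty and flushed by a background thread."""

    def __init__(self, backing, interval=0.05):
        self.backing = backing          # stand-in for the file (a plain dict)
        self.dirty = {}
        self.lock = threading.Lock()
        self.stop = threading.Event()
        self.interval = interval
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def put(self, key, value):
        # Returns immediately; the caller never waits on the "file".
        with self.lock:
            self.dirty[key] = value

    def _flush(self):
        # Swap out the dirty set under the lock, then write it out.
        with self.lock:
            pending, self.dirty = self.dirty, {}
        self.backing.update(pending)

    def _run(self):
        # Flush periodically until asked to stop.
        while not self.stop.wait(self.interval):
            self._flush()

    def close(self):
        # Synchronization point: stop the flusher and write what is left.
        self.stop.set()
        self.thread.join()
        self._flush()

backing = {}
cache = WriteBehindCache(backing)
for i in range(3):
    cache.put(f"meta{i}", i)
cache.close()
print(backing)
```

An application that never calls an async API still benefits here, since only `close` blocks until the backing store is up to date.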

The various ways one can manage MPI progress are instructive here.

  Indeed. :)

  Thanks for the feedback,
    Quincey

···

On May 28, 2014, at 4:22 PM, Rob Latham <robl@mcs.anl.gov> wrote:
