SWMR slow call to dset.id.refresh()

Aardeagle · October 20, 2017, 8:47pm

Hi,

I am experiencing some major performance issues in SWMR mode when a reader refreshes the metadata in a file.

I’m testing I/O performance on our data systems using SWMR and h5py. I have a process writing out random data to a single dataset, which is then read by a single reader. The files are striped across a Lustre file system with three OSTs.
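For readers who want to reproduce the setup, here is a minimal sketch of a writer like the one described above. It is not the code from the original test: the dataset name `data`, the row length, the chunk shape, the dtype, and the helper names `open_swmr_writer` and `append_random` are all assumptions made for illustration.

```python
import h5py
import numpy as np


def open_swmr_writer(path, row_len=1024, dtype="f8"):
    """Create a file with one growable dataset and switch it to SWMR write mode.

    The dataset name, shape, and chunking are illustrative assumptions.
    """
    # SWMR needs the newer file format, hence libver="latest".
    f = h5py.File(path, "w", libver="latest")
    dset = f.create_dataset(
        "data",
        shape=(0, row_len),
        maxshape=(None, row_len),  # unlimited along axis 0 so the writer can append
        chunks=(64, row_len),
        dtype=dtype,
    )
    # Create every dataset before this point; once SWMR write mode is on,
    # the writer only appends to existing objects.
    f.swmr_mode = True
    return f, dset


def append_random(dset, n_rows):
    """Append n_rows of random data, flush, and return the new row count."""
    start = dset.shape[0]
    dset.resize(start + n_rows, axis=0)
    dset[start:] = np.random.random((n_rows, dset.shape[1]))
    # Flush so that a reader's next refresh can see the new extent and data.
    dset.id.flush()
    return dset.shape[0]
```

A writer process would call `open_swmr_writer(path)` once and then `append_random(dset, n)` in a loop, closing the file when done.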

In a nutshell, the reader process refreshes the metadata, loops through and reads the dataset until it runs out of data, then refreshes the metadata again, loops through the new data and so on.
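The reader loop described above could look roughly like the sketch below. The helper name `read_available`, the block size, and the 2-D layout are assumptions, not details from the original test. The sketch uses h5py's high-level `dset.refresh()`, which wraps the same low-level refresh as `dset.id.refresh()`; in h5py 3.x, as far as I can tell, it also clears a shape that h5py caches for read-only files, so `dset.shape` stays current.

```python
import h5py


def read_available(dset, start, block_rows=256):
    """Refresh once, then read rows [start, new end) in blocks.

    Returns (new_end, rows_read). Call it again with start=new_end to pick up
    whatever the writer has appended in the meantime.
    """
    dset.refresh()        # reload metadata so the new extent becomes visible
    end = dset.shape[0]
    rows_read = 0
    for lo in range(start, end, block_rows):
        hi = min(lo + block_rows, end)
        block = dset[lo:hi]           # raw data read from the file
        rows_read += block.shape[0]
    return end, rows_read
```

A reader would open the file with `h5py.File(path, "r", swmr=True)`, take `dset = f["data"]`, and call `read_available(dset, start)` repeatedly, feeding the returned end position back in as the next `start`.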

Refreshing the metadata with dset.id.refresh() stalls both the reader and writer processes for several seconds. The stall gets longer the more data is refreshed (usually several GB at a time).

Please see the attached image for plots of the reading/writing rates to disk. The reading in this test case is slightly faster than the writing. Eventually the reader catches up to the writer and tries to call dset.id.refresh() on each loop iteration. At this point the I/O gets gridlocked and comes to a near standstill.

Thanks,
Eliseo

koziol · October 23, 2017, 2:13pm

Hi Eliseo,
This is very useful and interesting data, thanks for providing it. When the I/O gets gridlocked at the end, are you in a tight loop polling the file, or do you wait between poll operations? Are you able to share your programs? In a couple of months, I’ll be working on some improvements to the performance and memory for getting data from a writer to a reader and it would be good to have test code like this to work with.

Quincey

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5

Aardeagle · October 23, 2017, 5:51pm

Hi Quincey,

Here is a minimal working example:
https://github.com/slac-lcls/lcls2/tree/master/evtbuild/min_swmr_example

I’ve tested this on our flash-based filesystem. Note that the reader and writer processes have to be run on different machines. Otherwise the reader just grabs the data from the memory buffer.

When the I/O gets gridlocked, I wait 50 ms between refreshes.
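The wait between refreshes can be sketched as a small polling helper. This is an illustration rather than the code in the linked example: `wait_for_growth`, `max_polls`, and the two callables are hypothetical names, and only the 50 ms default comes from the post.

```python
import time


def wait_for_growth(refresh, get_length, last_seen, wait_s=0.05, max_polls=100):
    """Poll until get_length() exceeds last_seen, sleeping between refreshes.

    refresh    -- callable that reloads the dataset metadata
    get_length -- callable returning the current number of rows
    Returns the new length, or last_seen if nothing arrived within max_polls.
    """
    for _ in range(max_polls):
        refresh()
        length = get_length()
        if length > last_seen:
            return length
        time.sleep(wait_s)  # pause instead of refreshing in a tight loop
    return last_seen
```

With h5py this might be called as `wait_for_growth(dset.refresh, lambda: dset.shape[0], last_seen)`; passing callables keeps the helper independent of the file library.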

Thanks,
Eliseo


koziol · October 23, 2017, 8:16pm

Hi Eliseo,
Super, thanks! I’ll pull it down and test with it when I get closer to having the feature ready to work on.

Quincey

