I/O bandwidth drops dramatically and discontinuously for a large number of small datasets

Hsi-Yu_Schive · February 18, 2016, 8:36pm

I encounter a sudden drop of I/O bandwidth when the number of datasets in a
single group exceeds around 1.7 million. In the following I describe the
issue in more detail.

I'm converting adaptive mesh refinement (AMR) data to HDF5 format. Each
dataset contains a small 4-D array with a size of ~ 10 KB in the compact
format. All datasets are stored in the same group. When the total number of
datasets (N) is smaller than ~ 1.7 million, I get an I/O bandwidth of ~100
MB/s, which is acceptable. However, when N exceeds ~ 1.7 million, the
bandwidth suddenly drops by at least one to two orders of magnitude.

This issue seems to be related to the **number of datasets per group** rather
than the total data size. For example, if I reduce the size of each dataset by a
factor of 5 (so ~2 KB per dataset), the I/O bandwidth still drops when N >
~1.7 million, even though the total data size is reduced by a factor of 5.

So I was wondering what causes this issue, and whether there is a simple
solution. Since the data stored in different datasets are independent of
each other, I prefer not to combine them into a larger dataset. My current
workaround is to create several HDF5 sub-groups under the main group and
distribute all datasets evenly among them (so that the number of datasets
per group becomes smaller). With this change the I/O bandwidth stays stable
even when N > 1.7 million.
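
A minimal sketch of how such a distribution could be computed, assuming a round-robin assignment of datasets to sub-groups. The helper `dataset_path` and the names "/Data", "Group_" and "Dataset_" are hypothetical and are not taken from the original program; the returned string is the path one would pass to the HDF5 dataset-creation call.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original program): returns the HDF5 path
// for dataset `index` when datasets are spread round-robin over `n_group`
// sub-groups. Round-robin keeps the sub-group sizes within one of each other.
// `n_group` must be greater than zero.
std::string dataset_path(std::size_t index, std::size_t n_group)
{
    const std::size_t group = index % n_group;  // sub-group that owns this dataset
    return "/Data/Group_" + std::to_string(group) +
           "/Dataset_" + std::to_string(index);
}
```

With 128 sub-groups, 1.7 million datasets come to roughly 13,000 links per group, far below the count at which the slowdown was observed.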

If necessary, I can post simplified code to reproduce this issue.

Hsi-Yu

gheber · February 19, 2016, 2:26pm

Are you using the latest version of the file format? In other words, are you using H5P_DEFAULT (-> earliest) as your file access property list, or have you created one which sets the library version bounds to H5F_LIBVER_18?

See https://www.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetLibverBounds

In the newer version, groups with large numbers of links and attributes are managed more efficiently.

Does that solve your problem?

Best, G.

Hsi-Yu_Schive · February 19, 2016, 10:03pm

Thanks for the suggestion. The performance I reported was measured using
the earliest file format (i.e., H5F_LIBVER_EARLIEST). I just tried
H5F_LIBVER_18, but it leads to even worse performance: the bandwidth
starts to drop when N > ~0.5 million. Using H5F_LIBVER_LATEST does not
help either.

Justin

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5

epourmal · February 19, 2016, 11:41pm

Justin,

Would it be possible for you to provide a program that illustrates the problem? Which version of the library are you using? On which system are you running your application?

Thank you!

Elena

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Elena Pourmal The HDF Group http://hdfgroup.org
1800 So. Oak St., Suite 203, Champaign IL 61820
217.531.6112
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Hsi-Yu_Schive · February 21, 2016, 11:05pm

Hi Elena,

A simple program demonstrating this issue is attached. Please try modifying
the variables "NGroup, LibVerLow, LibVerLow". NGroup gives the number of
groups for a fixed number of datasets (NDataset), and the other two
variables specify the file format. The size of each dataset is ~2 KB.

I tried four different cases, combining NGroup=1 or 128 with
LibVerLow=H5F_LIBVER_EARLIEST or H5F_LIBVER_18. For NGroup=1, the I/O
bandwidth drops dramatically when the file size exceeds ~3.4 GB. For
NGroup=128, the bandwidth becomes reasonable. The results are similar for
different LibVerLow (in fact, they are a bit worse for H5F_LIBVER_18
and H5F_LIBVER_LATEST than for H5F_LIBVER_EARLIEST).
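
The ~3.4 GB file size appears consistent with the ~1.7 million dataset threshold reported earlier at ~2 KB per dataset. A quick check, assuming 1 KB = 1000 bytes and ignoring HDF5 metadata overhead (both are assumptions, and `payload_bytes` is a hypothetical helper, not part of the attached program):

```cpp
#include <cstdint>

// Raw payload in bytes for `n_dataset` datasets of `bytes_per_dataset` each.
// HDF5 metadata (object headers, group indexes) is deliberately ignored here.
constexpr std::uint64_t payload_bytes(std::uint64_t n_dataset,
                                      std::uint64_t bytes_per_dataset)
{
    return n_dataset * bytes_per_dataset;
}

// ~1.7 million datasets of ~2 KB each give ~3.4 GB of payload.
static_assert(payload_bytes(1700000, 2000) == 3400000000ULL,
              "1.7e6 datasets x 2 KB = 3.4 GB");
```

So the file-size threshold and the dataset-count threshold plausibly describe the same point in the run.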

Some system spec:
HDF5 version: 1.8.16
CPU: Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
File system: GPFS
OS: CentOS release 6.7

Sincerely,
Justin

HDF5_IO_Bandwidth__Justin.cpp (5.1 KB)

epourmal · February 21, 2016, 11:54pm

Hi Justin,

Thanks a lot for the program! We will take a look.

Just one more question. Have you tried to run your benchmark on some other file system?

Thanks again!

Elena

Hsi-Yu_Schive · February 22, 2016, 2:30am

Hi Elena,

I just tried it on a local system with the XFS file system. The same issue
occurs with H5F_LIBVER_EARLIEST, but with both H5F_LIBVER_18 and
H5F_LIBVER_LATEST the bandwidth becomes stable (although still lower than
in the NGroup=128 case by a factor of 1.5 to 2). Please let me know if you
can reproduce these results. Thanks!

Justin
