H5D_READ hanging an IDL process

Hello,

Has anyone seen a problem like this? We are writing data at 1 kHz into a dataset that is set up like this:

    HDF5 "201604121356.ovms.opd_estimation_mode2.h5" {
    DATASET "opd_estimation_mode2_01" {
        DATATYPE H5T_COMPOUND {
           H5T_STD_I64LE "time_stamp";
           H5T_STD_I32LE "tai_offset";
           H5T_IEEE_F32LE "sx_x_fp";
           H5T_IEEE_F32LE "sx_y_fp";
           H5T_IEEE_F32LE "dx_x_fp";
           H5T_IEEE_F32LE "dx_y_fp";
           H5T_IEEE_F32LE "diff_z_fp";
           H5T_IEEE_F32LE "predict_horiz";
        }

We are in a CentOS environment using HDF5 1.9.131 and IDL 1.8. We have scripts for reading the HDF5 files. Typically a Python script identifies the .h5 files to be read and then spawns an IDL process to read them and make a plot. Reading a 100M .h5 telemetry file typically takes about 30 sec. The problem is that every 20th time or so, the IDL process gets "stuck" and uses 100% CPU until we kill it. This happens only when IDL reads .h5 files, never when the Python script makes a large .csv that IDL then reads. It doesn't matter whether the .h5 file is being actively written to or is weeks old.

When running from the IDL command line to see the print statements, it seems to hang on the H5D_READ call. Sometimes it hangs repeatedly on the same data file; sometimes it reads that same file without hanging.

I am going to try to attach to it with strace the next time I see it, but I was wondering if someone else has seen something similar?

Thanks

Hi Kellee,

There is not much information to suggest what could be going wrong. (And you are using a pretty old version of HDF5 trunk in the first place!)

A few questions come to mind:
Would it be possible for you to try an officially released version of HDF5 (e.g., 1.8.16 or 1.10.0)? Can you reproduce the problem outside the Python script, i.e., with just IDL and HDF5? Or by opening and reading the file without IDL, i.e., with just a Python script or a C program? Do you need to use thread-safe HDF5?

Maybe other people will have other suggestions for what to try...

Elena


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Elena Pourmal The HDF Group http://hdfgroup.org
1800 So. Oak St., Suite 203, Champaign IL 61820
217.531.6112
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

On Apr 14, 2016, at 6:59 PM, Kellee Summers <ksummers@lbto.org> wrote:

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5

Thank you, Elena. I was able to reproduce the hang outside the Python script, with just IDL and HDF5.

I ran strace and found that the process was in IDL's FFT code. We did more research after that and found that IDL's FFT command takes several hundred times more CPU time for odd-length arrays than for even-length ones. Example: x[719947] takes about an hour, while x[719948] takes about 30 sec.

More detail from: http://www.harrisgeospatial.com/docs/fft.html
"For a one-dimensional FFT, running time is roughly proportional to the total number of points in Array times the sum of its prime factors."

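As a rough illustration of that rule, here is a sketch in Python rather than IDL. The function names are made up for this example, and the estimate is only the relative cost implied by the quoted sentence, not a measurement of IDL's FFT. It suggests the cost depends on the prime factors of the array length rather than on parity alone.

```python
def prime_factors(n):
    """Return the prime factors of n (with multiplicity) by trial division."""
    if n < 2:
        raise ValueError("n must be >= 2")
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever remains after dividing out small factors is itself prime.
        factors.append(n)
    return factors


def fft_cost_estimate(n):
    """Relative cost per the quoted rule: N times the sum of N's prime factors."""
    return n * sum(prime_factors(n))


def largest_smooth_length(n, max_prime=7):
    """Largest length <= n whose prime factors are all <= max_prime.

    max_prime=7 is an arbitrary choice for this sketch.
    """
    m = n
    while m > 1 and max(prime_factors(m)) > max_prime:
        m -= 1
    return m


if __name__ == "__main__":
    # The two array lengths from the example above.
    for n in (719947, 719948):
        print(n, prime_factors(n), fft_cost_estimate(n))
    # A nearby length that should be cheap under the same rule.
    print(largest_smooth_length(719947))
```

If losing a few samples is acceptable, trimming the array to a length such as the one returned by largest_smooth_length before calling FFT is one way to avoid the slow case. Zero-padding up to a friendly length is another, though both change the frequency grid slightly.
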
I think we are past the problem for now. Thank you,
Kellee


On 04/18/2016 10:00 AM, hdf-forum-request@lists.hdfgroup.org wrote:

Date: Sun, 17 Apr 2016 21:16:49 +0000
From: Elena Pourmal <epourmal@hdfgroup.org>
To: HDF Users Discussion List <hdf-forum@lists.hdfgroup.org>
Subject: Re: [Hdf-forum] H5D_READ hanging an IDL process