Speeding up HDF5 write operation

Hi,

Thanks for your help. I'll try to look deeper into the HDF5 code whenever
I can. For the time being, I guess I'll have to be content with binary and
HDF5 being equivalent.

I'll email the group if I find anything worthwhile. Thanks again.

Regards,
Nikhil

Hi Nikhil,

On Aug 5, 2008, at 2:38 PM, Nikhil Laghave wrote:

> Hi,
>
> Yes, the selections in the file and memory dataspaces are both the
> same rank and dimension. I am basically writing a very large vector,
> so the rank always remains 1, and I use the same dimension for both
> the memory and file dataspace.
>
> Is there something else I may need to check?
>
> Can I find out where exactly the problem lies?

  Hmm, I'm running out of obvious things to check. :-/ If you've got
the time & inclination, you could try digging into the HDF5 code more,
to pin down more precisely what's going on with your code's use of
HDF5. You might also try MPE's 'Jumpshot' tool to look at the MPI
communication going on, which could help reveal underlying issues.
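Before reaching for Jumpshot, a simple timing harness around the write call
can already separate the raw-data phase from everything else. A minimal
sketch, where write_my_slab stands in for whatever routine issues the actual
H5Dwrite:

#include <mpi.h>
#include <stdio.h>

/* Sketch: bracket the parallel write with barriers so every rank times the
   same phase, then report the slowest rank, which is what the user sees. */
double time_write_phase(MPI_Comm comm)
{
    MPI_Barrier(comm);              /* line all ranks up first */
    double t0 = MPI_Wtime();

    /* write_my_slab();               hypothetical: the H5Dwrite call */

    MPI_Barrier(comm);              /* wait for the slowest writer */
    double elapsed = MPI_Wtime() - t0;

    double max_elapsed = 0.0;
    MPI_Reduce(&elapsed, &max_elapsed, 1, MPI_DOUBLE, MPI_MAX, 0, comm);

    int r;
    MPI_Comm_rank(comm, &r);
    if (r == 0)
        printf("write phase: %.4f s (max over ranks)\n", max_elapsed);
    return max_elapsed;             /* meaningful on rank 0 only */
}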

  Sorry I can't be of more help here, but perhaps someone else in the
community has more time right now...

  Quincey

> Regards,
> Nikhil
>
>> Hi Nikhil,
>>
>> On Aug 5, 2008, at 12:21 PM, Nikhil Laghave wrote:
>>> Hi,
>>>
>>> While doing parallel writes, if the size of the data being written
>>> by each processor is not the same, can it lead to the operation
>>> getting serialized by the MPI implementation underneath HDF5?
>>
>> This probably shouldn't matter; the HDF5 library should just create
>> an MPI file view that incorporates the different sizes.
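To illustrate the point, here is a bare-MPI sketch in which each rank writes
a different number of elements. HDF5 arranges such a layout internally through
a file view; the sketch below expresses the same idea with explicit offsets
and a collective write for brevity. The counts and the file name are made up:

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Hypothetical sizes: every rank writes a different number of doubles. */
    long long count = 1000 + 10 * rank;

    /* Exclusive prefix sum gives each rank its starting element offset. */
    long long offset = 0;
    MPI_Exscan(&count, &offset, 1, MPI_LONG_LONG, MPI_SUM, MPI_COMM_WORLD);
    if (rank == 0) offset = 0;  /* MPI_Exscan leaves rank 0's output undefined */

    double *buf = malloc(count * sizeof(double));
    for (long long i = 0; i < count; i++)
        buf[i] = (double)(offset + i);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "vector.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Collective write: the differing counts simply become different extents
       in the file and do not by themselves serialize the operation. */
    MPI_File_write_at_all(fh, (MPI_Offset)offset * sizeof(double),
                          buf, (int)count, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}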
>>
>>> After looking at all possible reasons that may be slowing my write
>>> operation,
>>> I now think that this may be reason.
>>
>> Are the selections in the memory dataspaces you are using the same
>> rank and dimensions as the file dataspace selections?
>>
>> Quincey
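As a concrete reference, matching selections for a rank-1 vector might look
like the following in the C API (the function name write_slab and its
arguments are illustrative, not taken from the attached Fortran code):

#include "hdf5.h"

/* Sketch: each rank writes 'count' elements of a 1-D dataset starting at
   element 'start'. The memory dataspace has the same rank (1) and the
   same number of selected elements as the file selection. */
herr_t write_slab(hid_t dset, hsize_t start, hsize_t count,
                  const double *buf, hid_t dxpl)
{
    hid_t filespace = H5Dget_space(dset);        /* file dataspace */
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET,
                        &start, NULL, &count, NULL);

    hid_t memspace = H5Screate_simple(1, &count, NULL);  /* rank 1, same dims */

    herr_t status = H5Dwrite(dset, H5T_NATIVE_DOUBLE,
                             memspace, filespace, dxpl, buf);

    H5Sclose(memspace);
    H5Sclose(filespace);
    return status;
}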
>>
>>> Regards,
>>> Nikhil
>>>
>>>> Hi Nikhil,
>>>>
>>>> On Jul 10, 2008, at 2:34 PM, Nikhil Laghave wrote:
>>>>> Hi,
>>>>>
>>>>> Sorry about that. It's attached this time.
>>>>
>>>> OK, I took a look at your section of code. Although it's doing
>>>> parallel writes, they may be getting serialized somewhere under
>>>> HDF5 by the MPI implementation, due to the [apparently] non-regular
>>>> pattern you are writing. It's also very likely that you are writing
>>>> too small an amount of data to see much benefit from parallel I/O.
>>>>
>>>> Quincey
>>>>
>>>>
>>>>> Regards,
>>>>> Nikhil
>>>>>
>>>>>> Hi Nikhil,
>>>>>>
>>>>>> On Jul 10, 2008, at 2:07 PM, Nikhil Laghave wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Thanks for your reply.
>>>>>>>
>>>>>>> I am attaching the part of my code that does the parallel write.
>>>>>>> Points to notice:
>>>>>>>
>>>>>>> 1. For 'nprocs' processors, there are 'nend' diagonal processors
>>>>>>> that actually do the write, where:
>>>>>>>
>>>>>>>    nprocs = nend * (nend+1) / 2
>>>>>>>
>>>>>>> 2. The subroutine for the parallel write, 'phdfwrite', is in the
>>>>>>> file hdfmodule.f.
>>>>>>>
>>>>>>> 3. This subroutine is called only by the diagonal processors
>>>>>>> (all 'nend' of them).
>>>>>>>
>>>>>>> Please find attached the source files.
>>>>>>
>>>>>> There was no attachment on your message.
>>>>>>
>>>>>> Quincey
>>>>>>
>>>>>>> I also notice that for 265875 real nos., there is no speed
>>>>>>> difference even between INDEPENDENT and COLLECTIVE I/O. Is this
>>>>>>> because of the small size of the array? Also, do you see anything
>>>>>>> I may be doing that reduces the speed?
>>>>>>>
>>>>>>> Best Regards,
>>>>>>> Nikhil
>>>>>>>
>>>>>>>> Hi Nikhil,
>>>>>>>>
>>>>>>>> On Jul 9, 2008, at 6:39 PM, Nikhil Laghave wrote:
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> I am writing an HDF5 file in parallel. But to my surprise, the
>>>>>>>>> performance of the parallel write isn't better than that of the
>>>>>>>>> serial binary write operation. To write 265875 real numbers, my
>>>>>>>>> HDF5 write takes about 0.1 seconds whereas the serial binary
>>>>>>>>> operation takes around 0.07 seconds. This is surprising, as
>>>>>>>>> parallel should be at least as fast as serial, if not faster.
>>>>>>>>>
>>>>>>>>> Can anybody give me any suggestions as to what can be done to
>>>>>>>>> noticeably speed up this write operation?
>>>>>>>>
>>>>>>>> Hmm, are you using collective or independent parallel I/O?
>>>>>>>> Also, that's a pretty small dataset, so you are not likely to
>>>>>>>> see much difference either way.
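For reference, the choice between the two modes is made on the dataset
transfer property list. A minimal C-API sketch (the Fortran API has the
equivalent h5pset_dxpl_mpio_f call), assuming the file was opened with an
MPI-IO file access property list:

#include "hdf5.h"

/* Sketch: build a transfer property list that requests collective I/O.
   The file must have been opened through the MPI-IO driver
   (H5Pset_fapl_mpio on the file access property list) for this to apply. */
hid_t make_collective_dxpl(void)
{
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);  /* or H5FD_MPIO_INDEPENDENT */
    return dxpl;  /* pass as the transfer property list of H5Dwrite() */
}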
>>>>>>>>
>>>>>>>>> Will the performance of the HDF5 write be better than binary
>>>>>>>>> for very large arrays?
>>>>>>>>
>>>>>>>> Our goal is for HDF5 writes to be equivalent to binary writes
>>>>>>>> for large raw data I/O operations, while also making the files
>>>>>>>> produced self-describing, portable, etc.
>>>>>>>>
>>>>>>>>> If not, how can I bring about any substantial speedup?
>>>>>>>>
>>>>>>>> This is a very hard question to answer without more
>>>>>>>> details... :-)
>>>>>>>>
>>>>>>>> Quincey
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Nikhil
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Nikhil
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> Regards,
>>>>> Nikhil
>>>>> <parallel.f>
>>>>> <hdfmodule.f>
>>>>
>>>>
>>>
>>>
>>> Regards,
>>> Nikhil
>>>
>>>
>>>
>>
>>
>
>
> Regards,
> Nikhil
>
>
>



