Parallel writing with chunks (Fortran)

Hi all. I have a question about parallel writing with data chunks.

Following the explanation on the HDF Group homepage, I tried the posted
example with different dimension sizes.

1. In the example, the total data is 4 by 8 (in Fortran) and there are
4 cores, so each core writes one 2 by 4 chunk. If I increase the size from
4 by 8 to 5 by 9 and assign a correspondingly sized block to each core
(core1: 2x4, core2: 3x4, core3: 2x5, core4: 3x5), it runs but writes the
data out in the wrong layout.
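
To make the 5 by 9 layout concrete, the block sizes listed above correspond to something like the sketch below. It is only an illustration, not the tutorial code: it assumes a 2 x 2 process grid in which the last block in each dimension takes the extra row or column, and the names block_layout, split_dim, pgrid and mpi_rank are made up for this post. It calls neither MPI nor HDF5; it only prints the count and offset each core would use for its selection.

```fortran
program block_layout
  implicit none
  integer, parameter :: dims(2)  = [5, 9]   ! total dataset size from the post
  integer, parameter :: pgrid(2) = [2, 2]   ! assumed 2 x 2 process grid
  integer :: mpi_rank, d
  integer :: idx(2), cnt(2), off(2)

  do mpi_rank = 0, product(pgrid) - 1
     ! Assumed mapping: ranks advance along the first dimension first
     idx(1) = mod(mpi_rank, pgrid(1))
     idx(2) = mpi_rank / pgrid(1)
     do d = 1, 2
        call split_dim(dims(d), pgrid(d), idx(d), cnt(d), off(d))
     end do
     print '(a,i0,a,i0,a,i0,a,i0,a,i0)', 'core', mpi_rank + 1, &
          ': count = ', cnt(1), ' x ', cnt(2), &
          ', offset = ', off(1), ', ', off(2)
  end do

contains

  ! Split n elements into p blocks; the last mod(n, p) blocks get one extra.
  subroutine split_dim(n, p, i, bcount, boffset)
    integer, intent(in)  :: n, p, i          ! i is the 0-based block index
    integer, intent(out) :: bcount, boffset  ! block size and 0-based start
    integer :: base, rem
    base = n / p
    rem  = mod(n, p)
    bcount = base
    if (i >= p - rem) bcount = base + 1
    boffset = i * base + max(0, i - (p - rem))
  end subroutine split_dim

end program block_layout
```

With these assumptions the counts come out as 2 x 4, 3 x 4, 2 x 5 and 3 x 5 for core1 to core4, with zero-based offsets (0, 0), (2, 0), (0, 4) and (2, 4).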

2. If I do not call the "h5pset_chunk_f(plist_id, rank, chunk_dims, error)"
subroutine, it works fine with the 5 by 9 size and prints out the result
as I intended.

3. The 4 by 8 example also works fine when the dataset is created without
the 'plist_id' set up by the
"h5pset_chunk_f(plist_id, rank, chunk_dims, error)" subroutine call.

My questions are:

1. To use the "h5pset_chunk_f" subroutine, does the chunk size have to be
the same for every core?

2. If parallel writing works both with and without the "h5pset_chunk_f"
subroutine, what is this subroutine for?

Sorry for the basic question, but I could not find any reference that
explains this clearly and simply.

Thanks in advance.