Difference between Independent and Collective IO

Nikhil_Laghave1 · August 7, 2008, 9:20pm

Hello,

I have a quick question. I have been looking for a proper explanation on Google
but did not find anything.

The documentation says that independent IO means that it is not collective. Does
this mean that only the root processor is involved in the IO, or that the
processors in the specified communicator do not participate collectively but
take turns instead?

I am sorry, I am just curious.

Regards,
Nikhil

----------------------------------------------------------------------
This mailing list is for HDF software users discussion.
To subscribe to this list, send a message to hdf-forum-subscribe@hdfgroup.org.
To unsubscribe, send a message to hdf-forum-unsubscribe@hdfgroup.org.

Quincey_Koziol · August 7, 2008, 10:25pm

Hi Nikhil,

Independent IO is performed only by the processor that makes the API call (not necessarily the root processor).

Quincey


Zhengying_Wang · August 8, 2008, 9:00am

Hello,

I just wrote a test program to measure the compression performance of an HDF
file.
There are 2 datasets in the test file. The datasets are defined as
follows:
Dataset1
{
  unsigned long long item1;
  unsigned int item2;
  int item3;
  int item4;
  unsigned long long item5;
};

Dataset2
{
  int item1;
  double item2;
};

In the test file, Dataset1 has 62847260 records and Dataset2 has 831136075
records. The datasets are chunked with size 262144 and compressed at
level 9. The file size is 200204605 bytes.

In theory, the record size is 28 bytes for Dataset1 and 12 bytes for
Dataset2, so the total size of the datasets should be:

62847260*28 + 831136075*12 = 3143421588

The compression ratio seems to be just 1.57?

Any ideas what's going on here? How does a compound datatype affect
compression performance?

Any help would be appreciated.

Thanks a lot,
Zane


Quincey_Koziol · August 8, 2008, 12:35pm

Hi Zane,

Try with the shuffling filter before the deflate filter. That should put "similar" parts of the data near each other and improve the compression ratio.

Quincey


Francesc_Alted1 · August 8, 2008, 10:53pm

On Friday 08 August 2008, Quincey Koziol wrote:

> Try with the shuffling filter before the deflate filter. That
> should put "similar" parts of the data near each other and improve
> the compression ratio.

Besides applying the shuffle filter, you may get better compression by
increasing the chunksize of your datasets (i.e. by giving the
compressor more opportunities to find data redundancies). But using
shuffling should be your first try, indeed.

Cheers,

--
Francesc Alted
Freelance developer
Tel +34-964-282-249
