Sorting huge data sets

Hi All,
     I would like to know what the best approach would be for sorting huge data sets within an HDF5 file. Most of the data is unsorted in this case.

Thanks & Regards

Kailash K


On Thursday 13 January 2011 05:20:48 Kavalakuntla, Kailashnath wrote:

> Hi All,
>      I would like to know what would be the best approach to sorting
> huge data sets within hdf5 file. Most of the data is unsorted in
> this case !!

If your datasets are too large to fit in memory, you will need to
implement an out-of-core sorting algorithm. See this answer to a
similar question:

http://bit.ly/h2UNnW
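
To give an idea of what out-of-core sorting involves, here is a minimal
sketch of an external merge sort using only the Python standard library.
It is not taken from the linked answer. The function names and the use of
pickled temporary files are assumptions made for illustration, and
`chunks` stands in for slices that you would read from your HDF5 dataset
one at a time.

```python
import heapq
import pickle
import tempfile


def _write_run(sorted_values):
    """Write one sorted run to a temporary file and return the open file."""
    run = tempfile.TemporaryFile()
    for value in sorted_values:
        pickle.dump(value, run)
    run.seek(0)
    return run


def _read_run(run):
    """Yield the values of a run file back, one at a time."""
    while True:
        try:
            yield pickle.load(run)
        except EOFError:
            return


def external_sort(chunks):
    """Yield every value from `chunks` in ascending order.

    `chunks` is any iterable of lists, each small enough to sort in
    memory (hypothetically, one slice of an HDF5 dataset per list).
    """
    # Phase 1: sort each chunk in memory and spill it to disk as a run.
    runs = [_write_run(sorted(chunk)) for chunk in chunks]
    try:
        # Phase 2: k-way merge that reads the runs back lazily.
        yield from heapq.merge(*(_read_run(run) for run in runs))
    finally:
        for run in runs:
            run.close()
```

In a real setting the merged output would be written back to a new
dataset in blocks rather than collected in a list, and the chunk size
would be chosen to match the available memory.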

<shameless plug>

PyTables Pro supports sorting tables of unlimited length (HDF5
datasets made of compound types): you just specify the column to sort
by in the `sortby` parameter of the `Table.copy()` method.

</shameless plug>
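
A rough sketch of how the `sortby` copy might look follows. It assumes a
recent open-source PyTables release (3.x method names such as
`open_file` and `create_csindex`), which may differ from the PyTables
Pro API of the time. The file layout, table name and column names are
made up for the example.

```python
import os
import tempfile

import tables


def sorted_copy_demo(path):
    """Create a small table, index it, and return the keys of a sorted copy."""
    with tables.open_file(path, mode="w") as h5:
        # Hypothetical two-column table, used only for illustration.
        table = h5.create_table(
            "/", "unsorted",
            description={
                "key": tables.Int64Col(pos=0),
                "value": tables.Float64Col(pos=1),
            },
        )
        table.append([(3, 0.3), (1, 0.1), (2, 0.2)])
        table.flush()

        # sortby needs an index on the column; a completely sorted index
        # (CSI) is what allows the copy to come out fully ordered.
        table.cols.key.create_csindex()
        sorted_table = table.copy(newname="sorted", sortby="key", checkCSI=True)
        return sorted_table.col("key").tolist()
```

Called with a path inside a temporary directory, for instance
`sorted_copy_demo(os.path.join(tempfile.mkdtemp(), "demo.h5"))`, the
function returns the keys of the new table, which should be in
ascending order.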

Hope this helps,


--
Francesc Alted