Reading a large compound datatype dataset using chunking and hyperslab selection

er_akhilesh15 · October 19, 2022, 9:24am

Hi all,

I am trying to read a large one-dimensional dataset of a compound datatype, with a dimension size of 2073550 (or more in other datasets). I have found that reading with hyperslab selections improves performance for large datasets.
I am not just reading the dataset: I am looking up a certain row that contains the data I need for the output. That row can be anywhere in the file, so I wrote a loop that re-selects the file dataspace and the memory dataspace on each pass, advancing the chunk offset by the chunk size of the dataset.
I ran into an error when the selection extended beyond the original dimension of the dataspace, so I added a check that keeps every selection inside the dimension size by reducing chunk_count based on the current offset position. I thought that would let me read to the end of the dataset.
However, I am still not able to read the last few rows: the selection does not reach the end of the dataset.

Can anyone help me with this logic?

contact · October 19, 2022, 5:27pm

Hi @er_akhilesh15,

Would you mind posting a minimal reproducible example (MRE) of your logic so that we can better assist you?
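
Without seeing the code this is only a guess, but missing the last few rows is often a sign that the loop stops once a full chunk no longer fits, instead of reading one final, shorter chunk. Below is a minimal sketch in plain Python of the offset/count arithmetic only; it makes no HDF5 calls. The function name chunk_selections and the chunk size of 100000 are made up for illustration. Each (offset, count) pair is what you would pass as the start and count arguments of H5Sselect_hyperslab for the file dataspace, with the memory dataspace sized or selected to the same count.

```python
def chunk_selections(total_rows, chunk_rows):
    """Yield (offset, count) pairs that cover rows [0, total_rows) exactly once.

    The last pair is shortened so the selection never extends past the
    end of the dataset, but the tail rows are still included.
    """
    if chunk_rows <= 0:
        raise ValueError("chunk_rows must be positive")
    offset = 0
    # Loop while any rows remain, not while a *full* chunk still fits.
    while offset < total_rows:
        count = min(chunk_rows, total_rows - offset)  # clamp the final chunk
        yield offset, count
        offset += count  # advance by what was actually selected


# Hypothetical chunk size; the dataset size is the one from the question.
selections = list(chunk_selections(2073550, 100000))
last_offset, last_count = selections[-1]
print(len(selections), last_offset, last_count)  # 21 2000000 73550
```

If your loop condition looks like offset + chunk_size <= dims, or if the memory dataspace keeps the full chunk size on the last pass, that would explain the symptom. An MRE would confirm it.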