java: reading a 2D table

Hi Peter, thanks, this makes a lot of sense. I just need to clarify
two more things:
(Q1)
I am trying to extract three columns from the same dataset (e.g. Col1,
Col3, Col10), so that I can treat them as three independent vectors.
So far I have created three Datasets from the same table, each with its
own start/stride. But then I have to call dataset.read() three
times, which takes three times as long.
Is there a more efficient way to do it?

(Q2)
I have seen that both dataset.read() and dataset.getData() take the
same time to run. From reading the documentation, I thought that one of
them did not put the data in memory, so I expected it to be quicker.
Am I wrong? What if my data is bigger than my physical memory?

Jacopo

Hi Jacopo,

Jacopo Pecci wrote:

Hi Peter, thanks, this makes a lot of sense. I just need to clarify
two more things:
(Q1)
I am trying to extract three columns from the same dataset (e.g. Col1,
Col3, Col10), so that I can treat them as three independent vectors.
So far I have created three Datasets from the same table, each with its
own start/stride. But then I have to call dataset.read() three
times, which takes three times as long.
Is there a more efficient way to do it?

You can read the three blocks in one call using hyperslab selection. For more details on hyperslab selection, see
http://www.hdfgroup.org/HDF5/doc/H5.intro.html#Intro-PMSelectHyper

However, the hdf-java object layer is not able to do that. We try to keep the object layer
simple. You may have to use the hdf-java wrapper (the H5 class) to make advanced selections.
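As a rough illustration of what happens after such a combined read, here is a minimal pure-Java sketch. It makes no HDF5 calls. It assumes the three column selections have been combined into one selection (for example with H5.H5Sselect_hyperslab in the wrapper), that the selected columns are listed in ascending order, and that a single read then fills one buffer with the selected elements packed row by row. The class and method names (ColumnSplit, packSelectedColumns, deinterleave) are hypothetical, and the column indices are taken as 0-based for the example.

```java
import java.util.Arrays;

// Hypothetical helper, not part of hdf-java.
class ColumnSplit {

    // Simulates the buffer that one read of a combined column selection
    // is assumed to fill: for each row, the selected columns in order.
    // cols is assumed to be in ascending order.
    static double[] packSelectedColumns(double[][] table, int[] cols) {
        double[] packed = new double[table.length * cols.length];
        int k = 0;
        for (double[] row : table) {
            for (int c : cols) {
                packed[k++] = row[c];
            }
        }
        return packed;
    }

    // Splits the packed buffer into one independent vector per column.
    static double[][] deinterleave(double[] packed, int nCols) {
        int nRows = packed.length / nCols;
        double[][] vectors = new double[nCols][nRows];
        for (int r = 0; r < nRows; r++) {
            for (int j = 0; j < nCols; j++) {
                vectors[j][r] = packed[r * nCols + j];
            }
        }
        return vectors;
    }

    public static void main(String[] args) {
        // Made-up 3 x 11 table where each value encodes its row and column.
        double[][] table = new double[3][11];
        for (int r = 0; r < 3; r++) {
            for (int c = 0; c < 11; c++) {
                table[r][c] = r * 100 + c;
            }
        }
        int[] cols = {1, 3, 10};
        double[] packed = packSelectedColumns(table, cols);
        double[][] vectors = deinterleave(packed, cols.length);
        for (int j = 0; j < cols.length; j++) {
            System.out.println("column " + cols[j] + ": " + Arrays.toString(vectors[j]));
        }
    }
}
```

With the real library, only the deinterleave step would be needed on the Java side. The packing would be done by the library during the single read.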

(Q2)
I have seen that both dataset.read() and dataset.getData() take the
same time to run. From reading the documentation, I thought that one of
them did not put the data in memory, so I expected it to be quicker.
Am I wrong? What if my data is bigger than my physical memory?

dataset.read() reads the data from the file. dataset.getData() returns the in-memory copy of the data.
If the data is not yet in memory, getData() calls dataset.read() to load it from the file and then returns
the in-memory copy of the data buffer.

dataset.clearData() clears the in-memory data. To see the difference between dataset.read() and dataset.getData(),
try not calling dataset.clearData() between calls.
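The behaviour described above can be modelled with a small pure-Java sketch. This is not the hdf-java source. CachedDataset, its fileReads counter, and the stand-in "file" array are all hypothetical, and the only thing the sketch takes from the description is the pattern: read() always goes to the file, getData() loads on first use and then returns the cached buffer, and clearData() drops the cache.

```java
// Hypothetical model of the read()/getData()/clearData() pattern.
class CachedDataset {
    private final double[] file;   // stands in for the data on disk
    private double[] data;         // in-memory copy, null when not loaded
    int fileReads = 0;             // counts simulated file accesses

    CachedDataset(double[] file) {
        this.file = file;
    }

    // Always reads from the "file" and returns a fresh buffer.
    double[] read() {
        fileReads++;
        return file.clone();
    }

    // Returns the in-memory copy, loading it from the file only if needed.
    double[] getData() {
        if (data == null) {
            data = read();
        }
        return data;
    }

    // Drops the in-memory copy, so the next getData() reads the file again.
    void clearData() {
        data = null;
    }

    public static void main(String[] args) {
        CachedDataset ds = new CachedDataset(new double[]{1.0, 2.0, 3.0});
        ds.getData();   // first call loads from the file
        ds.getData();   // second call is served from memory
        System.out.println("file reads without clearData(): " + ds.fileReads);
        ds.clearData();
        ds.getData();   // cache was cleared, so the file is read again
        System.out.println("file reads after clearData(): " + ds.fileReads);
    }
}
```

In this model, timing getData() right after clearData() measures a file read every time, which would explain why the two calls appear to take the same time.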

