I was able to upload my data to the server using the proposed approach.
Now I'm looking for a solution to the problem I described in another topic.
I thought the HSDS approach could improve performance, but on the Kita Lab setup (4 nodes) it took 250 seconds to read 500 arbitrary rows. On my personal PC I read 5000 rows in 90 seconds, as posted in the question; that is probably limited by my HDD. The goal is to do this in about 1 second, if possible.
Does HSDS spread the data across multiple nodes, so that reading selected rows can be faster than the speed limit of a single HDD?
Is there a preferred way to solve my task?
For now I get an exception when trying to read every 5000th row from a dataset of 25 million rows:
arr=dset[0:25000000:5000, :]
timeout Traceback (most recent call last)
OSError: Error retrieving data: None
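One workaround I am considering is to split the single strided read into several smaller requests, so that each one has a better chance of finishing before the timeout. Below is a rough sketch, assuming only that dset supports NumPy-style slicing (as h5py and h5pyd datasets do). The function name read_strided_in_batches and the parameter rows_per_request are my own placeholders, not part of any library, and I have not verified that this actually avoids the timeout on the server.

```python
import numpy as np


def read_strided_in_batches(dset, start, stop, step, rows_per_request=100):
    """Return the same rows as dset[start:stop:step, :], read in several requests.

    dset is assumed to be a 2-D array-like that supports NumPy-style slicing.
    rows_per_request is a hypothetical tuning knob: how many selected rows
    each individual slice request should return.
    """
    # Number of source rows covered by one request. It is a multiple of
    # `step`, so every batch starts on a row that belongs to the stride.
    span = step * rows_per_request
    parts = []
    for lo in range(start, stop, span):
        hi = min(lo + span, stop)
        parts.append(np.asarray(dset[lo:hi:step, :]))
    if not parts:
        # Empty selection: return an empty array with the right column count.
        return np.asarray(dset[0:0, :])
    return np.concatenate(parts, axis=0)


if __name__ == "__main__":
    # Stand-in for a remote dataset so the sketch can be run locally.
    data = np.arange(1000 * 3).reshape(1000, 3)
    out = read_strided_in_batches(data, 0, 1000, 50, rows_per_request=4)
    print(out.shape)
```

With the real dataset the call would be read_strided_in_batches(dset, 0, 25000000, 5000). A smaller rows_per_request means more round trips but less work per request.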