HDFql version 2.5.0

contact · September 5, 2023, 5:19am

We are happy to announce the release of HDFql 2.5.0!

This version includes:

Added support for sliding cursors, which make it possible to read a dataset that does not fit in memory (RAM) by sliding a cursor over it, so that a user can seamlessly load and process the dataset in an out-of-core manner
Added support for creating a dataset/attribute based on the characteristics (i.e. data type and dimensions) of the redirected input (e.g. when executing “CREATE DATASET my_dataset VALUES FROM BINARY FILE my_file.bin”, a dataset named “my_dataset” is created with the appropriate data type and dimensions to store all the data from a binary file named “my_file.bin”, sparing the user from having to specify these)
Improved the performance and memory footprint of a cursor populated with values from datasets/attributes, thanks to a zero-copy policy that reuses the buffer used to read them (e.g. given a three-dimensional dataset of data type INT and size 100x1024x1024, populating a cursor with values from the dataset is 10x faster and takes 15x less memory than in the previous version of HDFql)
Added support for writing a result set into a dataset/attribute (e.g. when executing “SHOW FILE INTO DATASET my_dataset”, a dataset named “my_dataset” is created (if it does not exist) with the appropriate data type and dimensions to store all the names of the files found in the directory currently in use)
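
For readers new to out-of-core processing, the general idea behind the sliding cursors mentioned above can be pictured with a small, plain-Python sketch. This is not HDFql's API or implementation: it only illustrates reading a large array of values one fixed-size window at a time so that the whole array never has to be in memory. The function names (`sliding_read`, `out_of_core_sum`, `write_demo_file`), the raw little-endian 32-bit integer file layout, and the window size are all assumptions made for the illustration.

```python
import os
import struct
import tempfile

INT_SIZE = 4  # bytes per value; assumes 32-bit little-endian integers


def sliding_read(path, window):
    """Yield successive lists of at most `window` integers from a raw binary file.

    Only one window of values is held in memory at a time.
    """
    with open(path, "rb") as f:
        while True:
            chunk = f.read(INT_SIZE * window)
            if not chunk:
                break  # reached the end of the file
            count = len(chunk) // INT_SIZE
            yield list(struct.unpack(f"<{count}i", chunk[:count * INT_SIZE]))


def out_of_core_sum(path, window):
    """Sum every value in the file without loading the file as a whole."""
    total = 0
    for values in sliding_read(path, window):
        total += sum(values)
    return total


def write_demo_file(n):
    """Create a temporary binary file holding the integers 0..n-1 (demo data)."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        f.write(struct.pack(f"<{n}i", *range(n)))
    return path
```

In this sketch, calling `out_of_core_sum(path, 64)` processes the file 64 values at a time. The window size trades memory use against the number of read calls.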

(Please check the release notes for further details and some examples that illustrate how HDFql works)