Webinar Announcement: Tablite: 9BN rows/sec + HDF5 Support for all
October 26, 2022, 9:00 a.m. Central time US/Canada
Tablite is an open source project for incremental data processing. Tablite uses HDF5 as a backend with strong abstraction, so that copy, append, and repetition of data are handled in pages (this allows us to slice 9,000,000,000 rows in less than a second on localhost). Additional benefits of Tablite include multiprocessing, respecting the limits of free memory, and mapping datatypes to native HDF5 types, which in combination make Tablite an elegant solution.
Just a reminder that we will be hosting Dr. Bjorn Madsen, Head of System Design Tools at Dematic, to talk about Tablite tomorrow, Wednesday, 10/26, at 9:00 a.m. Central time US/Canada. From the Tablite website:
Tablite uses HDF5 as a backend with strong abstraction, so that copy, append & repetition of data is handled in pages. This is imperative for incremental data processing.
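To illustrate why handling data in pages makes copy, append, and repetition cheap, here is a minimal sketch in plain Python. It is not Tablite's implementation or API: the class `PagedColumn`, its methods, and the tiny page size are hypothetical names and values chosen for the example. The idea shown is only that a column can be a list of references to immutable pages, so copying or repeating a column duplicates references rather than rows.

```python
class PagedColumn:
    """Toy column: a list of references to immutable pages (hypothetical, not Tablite's API)."""

    def __init__(self, pages=None):
        # Pages are tuples, so sharing one page between columns is safe.
        self.pages = list(pages) if pages else []

    def __len__(self):
        return sum(len(page) for page in self.pages)

    def append_values(self, values, page_size=4):
        # New data is cut into pages; existing pages are never rewritten.
        values = list(values)
        for i in range(0, len(values), page_size):
            self.pages.append(tuple(values[i:i + page_size]))

    def copy(self):
        # A copy gets its own page list but points at the same pages.
        return PagedColumn(self.pages)

    def repeat(self, times):
        # Repetition repeats page references, not the rows themselves.
        return PagedColumn(self.pages * times)


col = PagedColumn()
col.append_values(range(10), page_size=4)  # stored as pages of 4, 4 and 2 rows
big = col.repeat(1000)                     # 10,000 logical rows, 3 distinct pages
dup = col.copy()
dup.append_values([10, 11])                # appending to the copy leaves col untouched
```

In this sketch `big` reports 10,000 rows while holding only references to the original three pages, which is the property that lets a paged design scale repetition far beyond what fits in memory.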
Tablite tests for memory footprint. One test compares the memory footprint of 10,000,000 integers, where Tablite uses < 1 MB of RAM, in contrast to Python, which requires around 133.7 MB of RAM (1M lists with 10 integers). Tablite also tests to ensure that working with 1 TB of data is tolerable.
Tablite achieves this by using HDF5 as storage, which is faster than mmap'ed files for the average case [1, 2], and stores all data in /tmp/tablite.hdf5, so if your OS (Windows/Linux/macOS) sits on an SSD, it will benefit from high IOPS and permit slices of 9,000,000,000 rows in less than a second.
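For a rough sense of where the in-memory cost of nested Python lists comes from, the sketch below estimates the size of the list containers alone using `sys.getsizeof`. This is not how Tablite's own test measures memory. The function name is hypothetical, and the estimate is a lower bound: it ignores the integer objects themselves and varies with Python version and platform.

```python
import sys


def nested_list_container_bytes(n_lists, ints_per_list):
    """Lower-bound estimate: list containers only, not the int objects they reference."""
    # Size of one inner list holding `ints_per_list` references.
    inner = sys.getsizeof([0] * ints_per_list)
    # Size of one reference slot in the outer list.
    pointer = sys.getsizeof([None]) - sys.getsizeof([])
    return n_lists * (inner + pointer)


estimate = nested_list_container_bytes(1_000_000, 10)
print(f"{estimate / 1e6:.1f} MB (containers only)")
```

Even this lower bound lands in the range of a hundred megabytes or more on a typical 64-bit CPython build, which is the kind of overhead that storing data on disk in pages avoids.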
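One reason a slice over billions of rows can be fast in a paged layout is that the slice only needs to identify which pages it overlaps. The sketch below shows that arithmetic. It is an illustration under stated assumptions, not Tablite's code: the function `pages_for_slice` is hypothetical, and the fixed page size of 1,000,000 rows is an assumed value.

```python
def pages_for_slice(total_rows, page_size, start, stop):
    """Return (page_index, local_start, local_stop) for each page a slice touches."""
    # Clamp the requested range to the rows that actually exist.
    start = max(0, min(start, total_rows))
    stop = max(start, min(stop, total_rows))
    if start == stop:
        return []
    first_page = start // page_size
    last_page = (stop - 1) // page_size
    plan = []
    for page in range(first_page, last_page + 1):
        page_begin = page * page_size
        page_end = min(page_begin + page_size, total_rows)
        plan.append((
            page,
            max(start, page_begin) - page_begin,  # offset of first row within the page
            min(stop, page_end) - page_begin,     # offset one past the last row
        ))
    return plan


# A 20-row slice out of 9,000,000,000 rows, assuming 1,000,000 rows per page.
plan = pages_for_slice(9_000_000_000, 1_000_000, 4_999_999_990, 5_000_000_010)
print(plan)  # [(4999, 999990, 1000000), (5000, 0, 10)]
```

Only two pages are touched here regardless of the total row count, so the work depends on the size of the slice and not on the size of the table.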
On October 26, 2022, The HDF Group hosted Dr. Bjorn Madsen to discuss his project, Tablite.