Ensuring atomicity for database transactions (ACID properties)


#1

Hi all,

My team and I are looking to develop a time-series database (TSDB) using the HDF5 file format. However, one issue we have found is Atomicity (one of the ACID properties of database transactions), which is required when writing to the database. Another option we are considering, InfluxDB, solves this issue with a Write-Ahead Log (WAL).
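For readers unfamiliar with the idea, here is a minimal sketch of a write-ahead log, using only the Python standard library. It is not InfluxDB's implementation; the record layout (length plus CRC32 header) and the names `wal_append` and `wal_replay` are assumptions made for illustration. Each record is forced to disk before the call returns, and on replay an incomplete record at the tail is treated as if it had never been written.

```python
import os
import struct
import zlib

# Hypothetical record header: payload length and CRC32 of the payload.
HEADER = struct.Struct("<II")


def wal_append(path, payload):
    """Append one record and force it to stable storage before returning."""
    record = HEADER.pack(len(payload), zlib.crc32(payload)) + payload
    with open(path, "ab") as f:
        f.write(record)
        f.flush()
        os.fsync(f.fileno())


def wal_replay(path):
    """Return every complete record; stop at the first torn or corrupt one."""
    records = []
    if not os.path.exists(path):
        return records
    with open(path, "rb") as f:
        data = f.read()
    pos = 0
    while pos + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, pos)
        start = pos + HEADER.size
        payload = data[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # incomplete tail left by a crash: treat as never written
        records.append(payload)
        pos = start + length
    return records
```

In such a design the main data file is only updated from records that have already reached the log, so after a crash the log can be replayed to bring the data file back to a consistent state.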

I wanted to know if there are any other best practices to use, other than the Single Writer Multiple Reader (SWMR) functionality in HDF5, to ensure Atomicity for database transactions?

Kind regards,

Brett


#2

Brett, how are you? Maybe a quick clarification:

  1. HDF5 is not a database. (OK, a “data base” in a very broad sense.)
  2. There is no concept of a transaction in HDF5.

HDF5 I/O operations (e.g., H5D[read,write]) are atomic in the sense that (in a correct program) you’ll never encounter partially written or corrupt values. That’s true in SWMR mode with the caveat that data may not be current, but it will be so eventually.

Best, G.


#3

Hi there G,

I hope all is well.

Just to clarify for everyone:

  1. I am aware that HDF5 is only a file format. However, as stated in my first message on this thread, my team and I want to develop a time-series database using the HDF5 file format.
  2. There is no concept of a transaction in HDF5. However, when building the above-mentioned time-series database, my team and I need to ensure that ACID database properties can be achieved.

With respect to your statement about I/O operations in HDF5, could you elaborate on “in a correct program”? That is what we are after: ensuring that we can develop such a database with the inherently required database properties (ACID).

Just to clarify, my team and I are wanting to stream real time data into an HDF5 file which, using the SWMR functionality, can be treated as a time-series database. We understand that the HDF5 file format is not a database, however, we want to use these files as building blocks for a time-series database instead of moving the data from the files to a dedicated database. The data will be continuously streamed into the database for a period no less than one year (as long as is required from the client).

Kind regards,

Brett.


#4

HDF5 is not only that, but that’s for another day. :sunglasses:

By a “correct program,” I mean one that runs in a supported environment (OS, compiler, file system, runtime, etc.), one that doesn’t use undocumented or unsupported behavior and one that doesn’t contain race conditions (in concurrency settings, such as MPI). I think the ACID-ity hinges on your concept of “transaction.” It’s beyond the scope of HDF5 and must come from your transaction manager (implementation).

Best, G.


#5

Hi Gerd,

I hope all is well.

I have developed a test program in C# to simulate the corruption of a file if a fatal error occurs before closing all open HDF5 files, groups etc.

In order to execute this test, perform the following steps:

  1. Download HDF5_Simple_Writer.zip (3.5 MB) and unzip the folder.

  2. Run the C# solution, restore the nuget packages (PInvoke library), and rebuild the solution.

  3. Read through the solution if needed but note: The boolean crash on line 22 determines whether you would like to run a normal execution (false) or if you want to simulate a fatal crash (true).

  4. First set the boolean crash to false to execute the process as normal. An HDF5 file entitled TestFile.h5 should have been created in the same directory as the solution. Open the file to see the test data set written to a test group within the file.

  5. Thereafter, set the boolean crash to true to simulate a fatal crash (power loss to the machine). Set a break point on line 101 (Console.WriteLine();). Start the program and when this break point is hit, simply stop the program.

You should then see the following error when trying to open the file:

[Image: HDF5_Error]

The fatal crashing of the program we are developing is something we are trying to safeguard against. If the HDF5 file is corrupted by such a crash what are the procedures we can put in place (apart from using a UPS on the machine) to ensure that this file corruption does not happen?

If this corruption is unavoidable, what measures can be taken to retrieve data within the corrupted file?


#6

That’s not entirely correct. There’s nothing that prevents the library from corrupting a file.

On Mon, 20 Apr 2020, 16:02, Gerd Heber noreply@forum.hdfgroup.org wrote:


#7

Brett, thanks for the example. It works! :sunglasses:

It is “unavoidable” in the sense that recovering from, or guarding against, abnormal program termination isn’t a requirement of the current implementation.

In production terms (“what measures…”), none. What, if any, data there might be is uncertain, and, if there is any data, it could be inconsistent or uninterpretable.

Best, G.


#8

Hi Gerd,

Thanks a lot for your responses. They have been extremely useful.

Kind regards,

Brett.


#9

I would also love to have a scientific format that is transactional (with ACID changes). I don’t know how orthogonal that is to high-throughput performance. Any thoughts?
As Brett shows in his example, it is actually pretty easy to corrupt an HDF5 file, and that makes it a not-so-good fit for real-time datasets (e.g., datasets that cannot be replayed, like data acquisitions).


#10

Just to be pedantic: There’s nothing “transactional” about a format such as the HDF5 file format or the MP3 format, or any other format. Can one build software around a format that supports transactions? Perhaps. (And, yes, that might be technically a little easier for some formats than for others.) The basic questions are: What’s the scope? How much are you willing to pay in budget and loss in performance?

That’s a bit like saying that Brett just discovered that glass bottles are fragile and that makes them a “not so good fit” for liquids (such as milk or beer!). Just kidding. Can we improve HDF5’s “exceptional behavior”? Absolutely! But then again, what’re the scope and budget?

G.


#11

I think the point is not so much the format, but rather the library. Although the format can facilitate the implementation of ACID properties by a library.

Neither the format nor, for sure, the library makes any provision to ensure that, e.g., something is either available or not, and that if there’s a failure the previous state is intact. The smallest I/O failure results in existing data being unreadable, and a random mix of new metadata and raw data may exist in the file. Mind you, this issue is so grossly neglected by design that if present it would have zero impact on performance.

The only way to make an HDF5-based application somewhat more sustainable for streaming and high throughput is to write HDF5 files from streams and have a process to merge them. I have a memory driver for this that can copy HDF5 objects from memory-buffered files to disk files, and then a process merges them.
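The "many small files plus a merge process" pattern can be sketched as follows. This is not the memory driver mentioned above: plain binary files of `(timestamp, value)` pairs stand in for HDF5 files so the example needs only the Python standard library, and the file naming scheme and function names are hypothetical. Each batch is written under a temporary name and renamed only once it is complete, so the merger never picks up a half-written segment.

```python
import glob
import os
import struct

# Stand-in sample layout: (timestamp, value), both float64.
SAMPLE = struct.Struct("<dd")


def write_segment(directory, index, samples):
    """Write one batch to its own file, then publish it with an atomic rename."""
    final = os.path.join(directory, f"segment-{index:08d}.bin")
    tmp = final + ".tmp"
    with open(tmp, "wb") as f:
        for ts, value in samples:
            f.write(SAMPLE.pack(ts, value))
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, final)  # only complete segments get the final name
    return final


def merge_segments(directory, merged_name="merged.bin"):
    """Concatenate all published segments, in index order, into one file."""
    segments = sorted(glob.glob(os.path.join(directory, "segment-*.bin")))
    merged = os.path.join(directory, merged_name)
    tmp = merged + ".tmp"
    with open(tmp, "wb") as out:
        for seg in segments:
            with open(seg, "rb") as f:
                out.write(f.read())
        out.flush()
        os.fsync(out.fileno())
    os.replace(tmp, merged)
    return merged


def read_samples(path):
    """Decode a file of packed samples into a list of (timestamp, value)."""
    with open(path, "rb") as f:
        data = f.read()
    last = len(data) - SAMPLE.size + 1
    return [SAMPLE.unpack_from(data, off) for off in range(0, last, SAMPLE.size)]
```

With this layout a crash costs at most the batch being written at that moment; segments that were already published stay readable and can be merged later.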


#12

That’s a bit like saying that Brett just discovered that glass bottles are fragile and that makes them a “not so good fit” for liquids (such as milk or beer!).

I would rather put my liquid gold data in a titanium bottle :slight_smile:. And I would not compare HDF5 to MP3, which is pretty good at fault tolerance and error robustness (not with transactions, but that is another story).

My question was more if the HDFGroup has some experience with a transactional VFD, say SQLite based. Looking at the SQLite doc about atomicity, it’s pretty clear that there would be some overhead. I don’t know the price I would pay for ACID, maybe 30% loss in performance? Data integrity is probably our top priority.
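To make the atomicity in question concrete, here is a small sketch using Python's built-in `sqlite3` module. It is not a VFD and says nothing about the overhead; the `samples` table and the helper names are made up for illustration. It only shows the behaviour being asked for: a batch of samples is either stored completely or not at all.

```python
import sqlite3


def make_db(path=":memory:"):
    """Open a database with a hypothetical table of time-series samples."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples "
        "(ts REAL PRIMARY KEY, value REAL NOT NULL)"
    )
    conn.commit()
    return conn


def insert_batch(conn, samples):
    """Insert all samples in one transaction: either every row lands or none."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.executemany(
            "INSERT INTO samples (ts, value) VALUES (?, ?)", samples
        )
```

If one row in a batch violates a constraint, `insert_batch` raises `sqlite3.IntegrityError` and the rows of that batch inserted before the failure are rolled back as well.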

Thanks Dimitris for sharing your workaround. I agree that intermediate storage is needed (so as not to put all our eggs in one basket), followed by a merge or the use of a Virtual Dataset.


#13

I think the way high-frequency trading applications, for example, work offers a good paradigm: data lives in systems like Kafka in a predominantly column-major NoSQL form, is processed, and is later put in data lakes or slower storage in row-major SQL format.


#14

Just to throw this out… HDF Server (https://github.com/HDFGroup/hsds) does support atomicity, so that may be worth looking into.

With the HDF5 library and POSIX files, ensuring that the file never gets corrupted is quite complicated, since the library maintains internal data structures that can get messed up if the application crashes at the wrong moment.

By contrast, HDF Server relies on the object storage system to keep things consistent. For example, when adding a link to a group, the server either updates the JSON object with the new link or fails (and you can retry). There’s no partial or interrupted write case to deal with.
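As a rough local-file analogy of that all-or-nothing update (this is not HSDS code or its API, and the JSON layout with a `links` key is an assumption), a group stored as a JSON file can be updated by writing the new version to a temporary file and swapping it in with a single rename:

```python
import json
import os


def add_link(group_path, name, target):
    """Add a link to a group stored as a JSON file, all-or-nothing."""
    with open(group_path, "r", encoding="utf-8") as f:
        group = json.load(f)
    group.setdefault("links", {})[name] = target

    # Write the complete new object next to the old one first.
    tmp = group_path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(group, f)
        f.flush()
        os.fsync(f.fileno())
    # Atomic swap: readers see the old object or the new one, never a mix.
    os.replace(tmp, group_path)
```

If anything fails before the final `os.replace`, the original file is untouched and the call can simply be retried; a leftover `.tmp` file is overwritten on the next attempt.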