Commit and rollback behaviour during power loss

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Commit and rollback behaviour during power loss

Blagovest Buyukliev
Let's say we have the following Python code:

import sqlite3
conn = sqlite3.connect('mydb.db')
c = conn.cursor()
c.execute("INSERT INTO ...")
conn.commit()
c.execute("INSERT INTO ...")
conn.commit()

Can it be assumed that after conn.commit() has returned, fsync() has
been called on the file and (assuming that fsync doesn't lie and
doesn't just act as a barrier), the last inserted row has made its way
to disk?

Also, say that power is lost during the *second* conn.commit(). Is
there a possibility that the database will be rolled back to the state
before the first INSERT, or is it guaranteed that it will be rolled
back to the state before the second INSERT?
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Commit and rollback behaviour during power loss

R Smith


On 2017/11/24 10:47 AM, Blagovest Buyukliev wrote:

> Let's say we have the following Python code:
>
> import sqlite3
> conn = sqlite3.connect('mydb.db')
> c = conn.cursor()
> c.execute("INSERT INTO ...")
> conn.commit()
> c.execute("INSERT INTO ...")
> conn.commit()
>
> Can it be assumed that after conn.commit() has returned, fsync() has
> been called on the file and (assuming that fsync doesn't lie and
> doesn't just act as a barrier), the last inserted row has made its way
> to disk?

Probably not[1], but lets assume for a moment a transaction was started
successfully, then:

If Commit returns with SQLITE_OK, then YES, it is handed off to the disk
with some caveats, namely:

A - We are assuming the Python sqlite3 wrapper you use doesn't do
obfuscation of any sort and directly calls the sqlite3 API and returns
the direct result from those calls. If you haven't yet, consider using
(or at least trying) the APSW wrapper.

B - When SQLite reports back SQLITE_OK after a commit, it means that
SQLite has handed off the fsync() to the OS. Some OSes may lie about
this having been effected yet (in the interest of speed) and also,
perhaps even unbeknown to the OS, the hardware itself may lie about
having actually committed those bits to platters/NVRam/etc. Implementing
a good database system is as much about making the hardware comply as
making the software work.  That said, data failures/corruption due to
this B caveat is extremely improbable and only ever happens in the
weirdest of circumstances.

>
> Also, say that power is lost during the *second* conn.commit(). Is
> there a possibility that the database will be rolled back to the state
> before the first INSERT, or is it guaranteed that it will be rolled
> back to the state before the second INSERT?

Again, if the first Commit returned with SQLITE_OK (with noting the
above caveats again), then no, that state is forever captured and part
of the DB now, and any new transaction will have its own atomicity.


[1] What I am not seeing in your code is conn.startTransaction() (or
however that will be named for you). How does the connection know you
have meant for a transaction to start so as to commit it later? I'm also
not seeing you testing any return value from those conn.commit() calls,
but that might just be for brevity of the post. They can however return
a failure code, in which case the commit didn't happen.


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Commit and rollback behaviour during power loss

Blagovest Buyukliev
> If Commit returns with SQLITE_OK, then YES, it is handed off to the disk
> with some caveats, namely:
>
> A - We are assuming the Python sqlite3 wrapper you use doesn't do
> obfuscation of any sort and directly calls the sqlite3 API and returns the
> direct result from those calls. If you haven't yet, consider using (or at
> least trying) the APSW wrapper.

Some inspection of the Python wrapper would reveal that. The important
thing you note is that this condition can be distinguished from the C
API.

> B - When SQLite reports back SQLITE_OK after a commit, it means that SQLite
> has handed off the fsync() to the OS. Some OSes may lie about this having
> been effected yet (in the interest of speed) and also, perhaps even
> unbeknown to the OS, the hardware itself may lie about having actually
> committed those bits to platters/NVRam/etc. Implementing a good database
> system is as much about making the hardware comply as making the software
> work.  That said, data failures/corruption due to this B caveat is extremely
> improbable and only ever happens in the weirdest of circumstances.

The hardware and the OS are pretty much a fixed combination, since
this will be used in an in-house hosted solution with Linux and some
chosen enterprise-level storage.

> [1] What I am not seeing in your code is conn.startTransaction() (or however
> that will be named for you). How does the connection know you have meant for
> a transaction to start so as to commit it later? I'm also not seeing you
> testing any return value from those conn.commit() calls, but that might just
> be for brevity of the post. They can however return a failure code, in which
> case the commit didn't happen.

Yes, the code is purely illustrative and lacks any error handling for brevity.

> Again, if the first Commit returned with SQLITE_OK (with noting the above
> caveats again), then no, that state is forever captured and part of the DB
> now, and any new transaction will have its own atomicity.

That's a very good thing which renders our potential solution viable.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Commit and rollback behaviour during power loss

Rowan Worth-2
In reply to this post by Blagovest Buyukliev
There is one degenerate case, which has been discussed a few times on this
list. With PRAGMA journal_mode=DELETE (the default), the atomic signal that
marks a transaction being committed is the deletion of the rollback
journal. Deleting a file is a directory level operation, which means there
are three files being changed:

1. The main DB file
2. The rollback journal
3. The directory

By default sqlite fsyncs (1) and (2) as part of COMMIT, but not (3). This
makes it possible for your program's control flow to get an SQLITE_OK
result from "conn.commit()" but then lose power before the OS syncs the
directory change to disk. As a result the transaction will be rolled back
when the DB is next opened.

It's a pretty extreme scenario, but you can avoid the risk via PRAGMA
synchronous=EXTRA (to also sync the directory), or PRAGMA
journal_mode=truncate (avoiding the need to modify the directory).


I don't think there is any scenario in which power loss could result in
multiple transactions being rolled back (unless fsync lies of course, in
which case all bets are off).

-Rowan

On 24 November 2017 at 16:47, Blagovest Buyukliev <
[hidden email]> wrote:

> Let's say we have the following Python code:
>
> import sqlite3
> conn = sqlite3.connect('mydb.db')
> c = conn.cursor()
> c.execute("INSERT INTO ...")
> conn.commit()
> c.execute("INSERT INTO ...")
> conn.commit()
>
> Can it be assumed that after conn.commit() has returned, fsync() has
> been called on the file and (assuming that fsync doesn't lie and
> doesn't just act as a barrier), the last inserted row has made its way
> to disk?
>
> Also, say that power is lost during the *second* conn.commit(). Is
> there a possibility that the database will be rolled back to the state
> before the first INSERT, or is it guaranteed that it will be rolled
> back to the state before the second INSERT?
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users