Is WAL mode serializable?

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Is WAL mode serializable?

Andy Bennett
Hi,

I'm reading the docs:

https://sqlite.org/isolation.html

...and I need some clarification!


It is easy to see that ROLLBACK mode is SERIALIZABLE because all concurrent
readers have to leave before a PENDING lock is upgraded to an EXCLUSIVE
lock.

However, the wording for WAL mode is confusing me.


isolation.html says 'all transactions in SQLite show "serializable"
isolation.' but it also says 'In WAL mode, SQLite exhibits "snapshot
isolation"'.

Snapshot Isolation and Serializable often (in other engines) mean different
things at commit time (
https://blogs.msdn.microsoft.com/craigfr/2007/05/16/serializable-vs-snapshot-isolation-level/ 
), but SQLite seems to say that the snapshot isolation is upgraded to
serializable by forbidding readers to upgrade to writers if another writer
got in before them:

'The attempt by X to escalate its transaction from a read transaction to a
write transaction fails with an SQLITE_BUSY_SNAPSHOT error because the
snapshot of the database being viewed by X is no longer the latest version
of the database.'


So far, so good.

However. the definition of SQLITE_BUSY_SNAPSHOT at
https://sqlite.org/rescode.html#busy_snapshot says:

-----
1. Process A starts a read transaction on the database and does one or more
   SELECT statement. Process A keeps the transaction open.
2. Process B updates the database, changing values previous read by process
   A.
3. Process A now tries to write to the database. But process A's view of
the
   database content is now obsolete because process B has modified the
   database file after process A read from it. Hence process A gets an
   SQLITE_BUSY_SNAPSHOT error.
-----

In particular 'Process B updates the database, changing values previous
read by process A.' seems to suggest that values read by A have to be
changed to effect the SQLITE_BUSY_SNAPSHOT error in A.


Is that last quote just imprecise writing or is there really a difference
between SQLite's Snapshot Isolation in WAL mode and its Serializable
isolation in ROLLBACK mode?


Thanks for your help!



Best wishes,
@ndy

--
[hidden email]
http://www.ashurst.eu.org/
0x7EBA75FF
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Is WAL mode serializable?

Keith Medcalf

On Tuesday, 9 July, 2019 20:34, Andy Bennett <[hidden email]> wrote:

>However, the wording for WAL mode is confusing me.

>isolation.html says 'all transactions in SQLite show "serializable"
>isolation.' but it also says 'In WAL mode, SQLite exhibits "snapshot
>isolation"'.

>Snapshot Isolation and Serializable often (in other engines) mean
>different things at commit time (
>https://blogs.msdn.microsoft.com/craigfr/2007/05/16/serializable-vs-
>snapshot-isolation-level/
>), but SQLite seems to say that the snapshot isolation is upgraded to
>serializable by forbidding readers to upgrade to writers if another
>writer got in before them:

>'The attempt by X to escalate its transaction from a read transaction
>to a write transaction fails with an SQLITE_BUSY_SNAPSHOT error because
>the snapshot of the database being viewed by X is no longer the latest
>version of the database.'

>So far, so good.

>However. the definition of SQLITE_BUSY_SNAPSHOT at
>https://sqlite.org/rescode.html#busy_snapshot says:

>-----
>1. Process A starts a read transaction on the database and does one
>   or more SELECT statement. Process A keeps the transaction open.

>2. Process B updates the database, changing values previous read by
>   process A.

>3. Process A now tries to write to the database. But process A's view
>   of the database content is now obsolete because process B has modified
>   the database file after process A read from it. Hence process A gets
>   an SQLITE_BUSY_SNAPSHOT error.
>-----

>In particular 'Process B updates the database, changing values
>previous read by process A.' seems to suggest that values read
>by A have to be changed to effect the SQLITE_BUSY_SNAPSHOT
>error in A.

SQLite does not track individual "rows" or "tables" being changed.  In (2) "changing values previously read by process A" means the database has changed since process A's view of the database was created.  What exactly changed is immaterial, since tracking of changes only occurs at the "whole database" level.

>Is that last quote just imprecise writing or is there really a
>difference between SQLite's Snapshot Isolation in WAL mode and
>its Serializable isolation in ROLLBACK mode?

Yes, there is a difference.  In rollback mode, a transaction merely places a shared lock on the database preventing any changes whatsoever from being committed to the database.  All outstanding transactions must be "closed" before a commit may proceed (acquire an EXCLUSIVE lock on the entire database including all its rows in all tables).  Thus "changes" to the database are "serialized" because they can only occur in series and only when nothing else is "looking" (since looking requires at least a read lock, which will prohibit the update from being committed).

WAL mode however, is different.  When you commence a "read transaction" in WAL mode you are creating a *snapshot* of the database at that time (actually, the reference to *snapshot* is incorrect, the actual isolation level is "repeatable read").  "repeatable read" markers can be thought of as "timestamps"  -- that is, a process holding a "repeatable read" lock against the database can see the database up to the time of its "timestamp" was created (when the lock was obtained).  It cannot see changes that are made *after* that time.  If another process obtains a "repeatable read" timestamp that is the same as one already obtained by another process, and then commits those changes, the original process is still holding a "repeatable read" lock on the database as it existed BEFORE the change was committed.  Thus it cannot update the database because it is seeing a "historical" version of the database and not the "current" view of the database -- that is it is looking at the database as it was at the time it obtained its "repeatable read" timestamp, and cannot be permitted to make changes since it is looking at an "old view" that does not represent an updateable state of affairs.

In effect WAL is a "stack of changes" to be applied to the database file.  When a read lock is obtained, the position in the stack is remembered.  If something else adds more data to the stack, then the original locker is no longer "top of stack" and cannot write changes to the database because it cannot "see" the changes made after it entered repeatable read isolation.

see https://www.sqlite.org/wal.html
especially 2.2 Concurrency

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.




_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Is WAL mode serializable?

Barry Smith
Serialization isolation requires 'The databases ends up in the state it
would appear if every transaction were run sequentially'. It doesn't
actually require that all transactions occur sequentially, just that the
database ends up in a state as though they had.

Why do you think that SQLite's system of denying a read transaction the
ability to escalate to write if the data it has read has been modified by
another transaction violates that contract?

On Tue, 9 Jul 2019 at 20:39, Keith Medcalf <[hidden email]> wrote:

>
> On Tuesday, 9 July, 2019 20:34, Andy Bennett <[hidden email]>
> wrote:
>
> >However, the wording for WAL mode is confusing me.
>
> >isolation.html says 'all transactions in SQLite show "serializable"
> >isolation.' but it also says 'In WAL mode, SQLite exhibits "snapshot
> >isolation"'.
>
> >Snapshot Isolation and Serializable often (in other engines) mean
> >different things at commit time (
> >https://blogs.msdn.microsoft.com/craigfr/2007/05/16/serializable-vs-
> >snapshot-isolation-level/
> >), but SQLite seems to say that the snapshot isolation is upgraded to
> >serializable by forbidding readers to upgrade to writers if another
> >writer got in before them:
>
> >'The attempt by X to escalate its transaction from a read transaction
> >to a write transaction fails with an SQLITE_BUSY_SNAPSHOT error because
> >the snapshot of the database being viewed by X is no longer the latest
> >version of the database.'
>
> >So far, so good.
>
> >However. the definition of SQLITE_BUSY_SNAPSHOT at
> >https://sqlite.org/rescode.html#busy_snapshot says:
>
> >-----
> >1. Process A starts a read transaction on the database and does one
> >   or more SELECT statement. Process A keeps the transaction open.
>
> >2. Process B updates the database, changing values previous read by
> >   process A.
>
> >3. Process A now tries to write to the database. But process A's view
> >   of the database content is now obsolete because process B has modified
> >   the database file after process A read from it. Hence process A gets
> >   an SQLITE_BUSY_SNAPSHOT error.
> >-----
>
> >In particular 'Process B updates the database, changing values
> >previous read by process A.' seems to suggest that values read
> >by A have to be changed to effect the SQLITE_BUSY_SNAPSHOT
> >error in A.
>
> SQLite does not track individual "rows" or "tables" being changed.  In (2)
> "changing values previously read by process A" means the database has
> changed since process A's view of the database was created.  What exactly
> changed is immaterial, since tracking of changes only occurs at the "whole
> database" level.
>
> >Is that last quote just imprecise writing or is there really a
> >difference between SQLite's Snapshot Isolation in WAL mode and
> >its Serializable isolation in ROLLBACK mode?
>
> Yes, there is a difference.  In rollback mode, a transaction merely places
> a shared lock on the database preventing any changes whatsoever from being
> committed to the database.  All outstanding transactions must be "closed"
> before a commit may proceed (acquire an EXCLUSIVE lock on the entire
> database including all its rows in all tables).  Thus "changes" to the
> database are "serialized" because they can only occur in series and only
> when nothing else is "looking" (since looking requires at least a read
> lock, which will prohibit the update from being committed).
>
> WAL mode however, is different.  When you commence a "read transaction" in
> WAL mode you are creating a *snapshot* of the database at that time
> (actually, the reference to *snapshot* is incorrect, the actual isolation
> level is "repeatable read").  "repeatable read" markers can be thought of
> as "timestamps"  -- that is, a process holding a "repeatable read" lock
> against the database can see the database up to the time of its "timestamp"
> was created (when the lock was obtained).  It cannot see changes that are
> made *after* that time.  If another process obtains a "repeatable read"
> timestamp that is the same as one already obtained by another process, and
> then commits those changes, the original process is still holding a
> "repeatable read" lock on the database as it existed BEFORE the change was
> committed.  Thus it cannot update the database because it is seeing a
> "historical" version of the database and not the "current" view of the
> database -- that is it is looking at the database as it was at the time it
> obtained its "repeatable read" timestamp, and cannot be permitted to make
> changes since it is looking at an "old view" that does not represent an
> updateable state of affairs.
>
> In effect WAL is a "stack of changes" to be applied to the database file.
> When a read lock is obtained, the position in the stack is remembered.  If
> something else adds more data to the stack, then the original locker is no
> longer "top of stack" and cannot write changes to the database because it
> cannot "see" the changes made after it entered repeatable read isolation.
>
> see https://www.sqlite.org/wal.html
> especially 2.2 Concurrency
>
> --
> The fact that there's a Highway to Hell but only a Stairway to Heaven says
> a lot about anticipated traffic volume.
>
>
>
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Is WAL mode serializable?

Andy Bennett
In reply to this post by Keith Medcalf
Hi,

>> Is that last quote just imprecise writing or is there really a
>> difference between SQLite's Snapshot Isolation in WAL mode and
>> its Serializable isolation in ROLLBACK mode?
>
> Yes, there is a difference.

...sorry, it seems it was my turn to do some sloppy writing!

Thanks for the detailed explanation of how it works. What I meant to ask
was "is there really a difference in the *semantics*?". i.e. from the user
perspective, can databases in the two different modes end up in different
states?


> ... it is looking at the database as it
> was at the time it obtained its "repeatable read" timestamp, and
> cannot be permitted to make changes since it is looking at an
> "old view" that does not represent an updateable state of
> affairs.

Semantically, I understand this to mean that, from the POV of the writer
and all other transactions, that reader was virtually kicked out in the
same way as it would have been in ROLLBACK mode but, because enough state
is available, it's allowed to stay in its bubble until it's finished with
it.



I guess there's a difference in ordering behaviour under load but that's
just to do with exactly which order ends up being chosen for the
transactions, not that it's no longer equivalent to some serial order being
chosen?

The difference in ordering seems less if all write transactions always
start with BEING IMMEDIATE tho'.



> see https://www.sqlite.org/wal.html
> especially 2.2 Concurrency

I don't think that the stuff about checkpointing, durability aside, is
relevant from a semantic POV, yes?


Thanks for your help.




Best wishes,
@ndy

--
[hidden email]
http://www.ashurst.eu.org/
0x7EBA75FF
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [SPAM?] Re: Is WAL mode serializable?

Richard Damon
On 7/10/19 7:05 AM, Andy Bennett wrote:
>
> Thanks for the detailed explanation of how it works. What I meant to
> ask was "is there really a difference in the *semantics*?". i.e. from
> the user perspective, can databases in the two different modes end up
> in different states?

My understanding is NO, as long as the same transactions complete.

There is a significant difference in when a write transaction can start,
so code that doesn't wait properly might take different paths and thus
you end up in a different state.

In WAL mode, basically one write transaction can always be started, no
matter how many read transactions might be present, and once a write
transaction starts, no other existing read transaction can upgrade
itself to a write.

In non-WAL mode, the write transaction will need to wait for all (other)
read transactions to end before it can start, and if while it is waiting
another read transaction tries to upgrade it will be denied, but once it
starts, there can't be another read transaction to attempt an upgrade.

In both cases, if a read transaction attempting to upgrade to a write
transaction gets a busy, it needs to end the transaction and re-do its
reads before it can do its write. The differences is that in WAL mode,
one transaction doesn't need to wait for all the reads to finish before
it can start.

--
Richard Damon

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [SPAM?] Is WAL mode serializable?

Andy Bennett
Hi,

>> Thanks for the detailed explanation of how it works. What I meant to
>> ask was "is there really a difference in the *semantics*?". i.e. from
>> the user perspective, can databases in the two different modes end up
>> in different states?
>
> My understanding is NO, as long as the same transactions complete.

Thanks! That explanation is really great.


> In both cases, if a read transaction attempting to upgrade to a write
> transaction gets a busy, it needs to end the transaction and re-do its
> reads before it can do its write.

This is assuming that you have the defauly busy handler engaged?




Best wishes,
@ndy

--
[hidden email]
http://www.ashurst.eu.org/
0x7EBA75FF
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [SPAM?] Re: [SPAM?] Is WAL mode serializable?

Richard Damon
On 7/10/19 7:54 AM, Andy Bennett wrote:

> Hi,
>
>>> Thanks for the detailed explanation of how it works. What I meant to
>>> ask was "is there really a difference in the *semantics*?". i.e. from
>>> the user perspective, can databases in the two different modes end up
>>> in different states?
>>
>> My understanding is NO, as long as the same transactions complete.
>
> Thanks! That explanation is really great.
>
>
>> In both cases, if a read transaction attempting to upgrade to a write
>> transaction gets a busy, it needs to end the transaction and re-do its
>> reads before it can do its write.
>
> This is assuming that you have the defauly busy handler engaged?
>
> Best wishes,
> @ndy
>
A proper busy handler will wait when it make sense to get the needed
locks, so you don't get spurious a spurious busy that you could just
retry and be able to continue.

The fundamental issue would be a process that starts with just a read
lock, and after reading some data realizes that it needs to update
something based on what it read. If when upgrading to the write lock,
you get a busy that tells you that you will NEVER be able to upgrade,
then you need to close the transaction, and when you restart, you should
forget the previous data you read (at least as it relates to what you
might want to write) as it might have changed, so you need to read it
again to be sure.

--
Richard Damon

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users