Feedback on article about SQLITE_BUSY

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

Feedback on article about SQLITE_BUSY

rahul
I wrote an article about my high-level understanding of `SQLITE_BUSY` errors, hoping it might help others understand concurrency in SQLite better. It covers scenarios under which the error shows up, while SQLite tries to respect its isolation guarantee.

https://www.activesphere.com/blog/2018/12/24/understanding-sqlite-busy

I’d appreciate feedback on the article. Apologies if this is the wrong place to post such content.

Thanks,
Rahul
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Feedback on article about SQLITE_BUSY

Simon Slavin-3
On 8 Jan 2019, at 3:48pm, Rahul Jayaraman <[hidden email]> wrote:

> https://www.activesphere.com/blog/2018/12/24/understanding-sqlite-busy
>
> I’d appreciate feedback on the article.

I read your article, though I did not go through all the logic and check that it is correct.  I like your choice of sections, and found your diagrams to be clear.  I have three points:

First, you include in your article several direct quotes from the SQLite documentation.  It is an error to include these without labelling them as such.  I see that in one case you write "From its documentation".  Including "From SQLite's documentation" each time you quote from it would correct this problem.

Second, there are a few places where you have used just the acronym "WAL" where I would have expected you to put "WAL mode", to remind the reader that SQLite has another mode and they need to explicitly set WAL mode if they want it.

My third point would be relating to setting a timeout.  You mention setting a busy timeout late in the article, and as part of the deadlocks section, and your text jumps straight to the most complicated way of doing it: supplying your own busy handler.  This misses a more important and easier aspect.

Your article is a good overview of the subject, written for those who have studied computer science.  But it would be far more useful to programmers who have just encountered SQLITE_BUSY if near the beginning you mention one or both of

<https://www.sqlite.org/c3ref/busy_timeout.html>
<https://www.sqlite.org/pragma.html#pragma_busy_timeout>

which sets a timeout time, and tells SQLite to use its own timeout handler, rather than the programmer having to supply one.  Setting this to a few thousand milliseconds, means that most SQLite programmers never see a result of SQLITE_BUSY, without supplying a busy-handler routine, and without including complicated and difficult-to-debug retry-loops in their code.  It covers all busy situations, not just deadlock, and I feel that it's important to any discussion of SQLITE_BUSY.

Hope this helps.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Feedback on article about SQLITE_BUSY

rahul
> I have three points:

Thank you, appreciate the feedback. I've updated the article with the
changes you suggested.

Rahul



--
Sent from: http://sqlite.1065341.n5.nabble.com/
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Feedback on article about SQLITE_BUSY

Rowan Worth-2
In reply to this post by rahul
There seems to be a few misconceptions in the article regarding the
difference between DEFERRED/IMMEDIATE/EXCLUSIVE, and the fine details of
how different lock states interact.

Specifically, your diagrams suggest that once a writer obtains a RESERVED
lock (as happens when an IMMEDIATE transaction begins), no readers will be
able to obtain a SHARED lock. But this is not the case - while a process
holds the RESERVED lock it's only other attempts to _write_ the database
which will be met with SQLITE_BUSY.

It's only once the writer upgrades to a PENDING lock that readers will be
rejected via SQLITE_BUSY. Historical note: this wasn't part of sqlite's
original locking protocol, but was added later to solve a writer starvation
problem.

Anyway, the RESERVED -> PENDING transition only happens when either (a) the
writer is ready to commit the transaction, or (b) there's a cache spill,
ie. the transaction has modified more database page then will fit in the
configured cache (see pragmas cache_size and cache_spill for more info).


It looks like you go into more detail on the locks later in the article - I
only got to the "Shared cache mode" section. The content before that felt
misleading in terms of reader/writer locking interaction.

I think "single writer, multiple readers" is the simplest way to describe
sqlite's approach to isolation, but I'm also pretty biased because I have a
_lot_ more experience with sqlite compared to other DBs and I'm only just
getting my head around the idea of non-isolated transactions or multiple
concurrent writers!

-Rowan

On Wed, 9 Jan 2019 at 21:48, Rahul Jayaraman <[hidden email]>
wrote:

> I wrote an article about my high-level understanding of `SQLITE_BUSY`
> errors, hoping it might help others understand concurrency in SQLite
> better. It covers scenarios under which the error shows up, while SQLite
> tries to respect its isolation guarantee.
>
> https://www.activesphere.com/blog/2018/12/24/understanding-sqlite-busy
>
> I’d appreciate feedback on the article. Apologies if this is the wrong
> place to post such content.
>
> Thanks,
> Rahul
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Feedback on article about SQLITE_BUSY

rahul
> There seems to be a few misconceptions in the article regarding the difference between DEFERRED/IMMEDIATE/EXCLUSIVE

Looks like I mistakes in the 2PL section, it had contradictory content about `RESERVED` locks. Thanks for your feedback, have pushed corrections.

> Specifically, your diagrams suggest that once a writer obtains a RESERVED
lock (as happens when an IMMEDIATE transaction begins), no readers will be
able to obtain a SHARED lock.

The idea was to portray a scenario which forces serial execution of transactions. If all transactions are started in either `IMMEDIATE`/`EXCLUSIVE` modes, they try to acquire `RESERVED` /`EXCLUSIVE` locks which block other `IMMEDIATE` and `EXCLUSIVE` transactions.

> It looks like you go into more detail on the locks later in the article - only got to the "Shared cache mode" section.

The sections on locking and WAL were meant to build up concepts like deadlocks and stale snapshots (discussed later in article), and the fact that SQLite can’t make progress on transactions by retrying individual queries in those cases.

Rahul
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Feedback on article about SQLITE_BUSY

rahul
In reply to this post by Rowan Worth-2
> I think "single writer, multiple readers" is the simplest way to describe
sqlite's approach to isolation

I’m not sure if this summarization paints enough of a picture about how SQLite restricts interleaving of read & write operations between concurrent transactions, to guarantee isolation. For eg, in Rollback journal, a writing transaction looking to commit blocks other readers (as you mentioned), but WAL does not.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Feedback on article about SQLITE_BUSY

Rowan Worth-2
On Tue, 22 Jan 2019 at 12:51, Rahul Jayaraman <[hidden email]>
wrote:

> > I think "single writer, multiple readers" is the simplest way to describe
> sqlite's approach to isolation
>
> I’m not sure if this summarization paints enough of a picture about how
> SQLite restricts interleaving of read & write operations between concurrent
> transactions, to guarantee isolation. For eg, in Rollback journal, a
> writing transaction looking to commit blocks other readers (as you
> mentioned), but WAL does not.
>

True, it's not the full story. But both "single writer" and "multiple
readers" remain true regardless of the journal mode in use. And the same
isolation semantics are provided either way, no? Transactions cannot see
any updates which occur after they have begun, and writers are serialised.

As you say, the details of how readers and writers interact differ. In
rollback mode, a writer trying to COMMIT prevents new readers from
starting. And in WAL mode, a long-running reader prevents a CHECKPOINT from
proceeding.

-Rowan
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Feedback on article about SQLITE_BUSY

Simon Slavin-3
On 22 Jan 2019, at 8:21am, Rowan Worth <[hidden email]> wrote:

> Transactions cannot see
> any updates which occur after they have begun, and writers are serialised.

<https://sqlite.org/pragma.html#pragma_read_uncommitted>

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Feedback on article about SQLITE_BUSY

rahul
In reply to this post by Rowan Worth-2
> But both "single writer" and “multiple readers" remain true regardless of the journal mode in use. And the same isolation semantics are provided either way, no?

It remains true, but I’m not sure if it's useful to understand isolation. I think isolation could be understood from multiple ‘useful’[1] perspectives.

From a user’s perspective, the behaviour of the system is equivalent to some serial execution of transactions, giving the appearance that one transaction ran to completion before the next started. Working around the lack of this guarantee might be hard for programmers, and it’s useful to understand that the guarantee exists. This perspective is probably common to all databases which supports serializable isolation.

From an operational perspective, which describes algorithms used and implementation details. I think it’s useful to understand algorithms because different algorithms give rise to different `busy` scenarios, and having a better mental model of algorithm helps reason about some of these scenarios. I’m not sure if the ’single writer’, ‘multiple reader’ abstraction gives insight into this perspective. A weaker model might end up confusing the user, if he ends up in a scenario which he doesn’t understand.

There’s also a mathematical perspective, which involves(I think) proving that any legal history an algorithm produces is isolated.

Rahul

--------

[1]: Ref: Transaction Processing Concepts and Techniques - Jim Gray


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Feedback on article about SQLITE_BUSY

Rowan Worth-2
On Tue, 22 Jan 2019 at 17:24, Rahul Jayaraman <[hidden email]>
wrote:

> From an operational perspective, which describes algorithms used and
> implementation details. I think it’s useful to understand algorithms
> because different algorithms give rise to different `busy` scenarios, and
> having a better mental model of algorithm helps reason about some of these
> scenarios. I’m not sure if the ’single writer’, ‘multiple reader’
> abstraction gives insight into this perspective. A weaker model might end
> up confusing the user, if he ends up in a scenario which he doesn’t
> understand.
>

Of course - no four words are going to be able to communicate the subtlety
of sqlite's locking protocol and the details of how it interacts with the
different journal modes. But I maintain that "single writer multiple
readers" is the simplest abstraction available to provide a solid
foundation for unfamiliar readers; YMMV.

I was also confused by this section of the article, the second sentence of
which feels inconsistent and misleading:

> NOTE: IMMEDIATE behaviour acquires a write lock which allows concurrent
> readers, but blocks other concurrent writers (discussed further in Rollback
> journal section). In the above case, since all transactions are started
> with IMMEDIATE, they behave as writers, and concurrent transactions are
> blocked thus enforcing serial execution. EXCLUSIVE behaviour acquires a
> lock which blocks concurrent readers and writers.
>
1. "In the above case, since all transactions are started with IMMEDIATE,"
-- the diagram is actually using EXCLUSIVE transactions not IMMEDIATE
2. "they behave as writers, and concurrent transactions are blocked" --
this implies to me that all transactions are blocked, and conflicts with
the first sentence which says (correctly) that a write lock allows
concurrent readers

-Rowan
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Feedback on article about SQLITE_BUSY

rahul
> 1. "In the above case, since all transactions are started with IMMEDIATE,” -- the diagram is actually using EXCLUSIVE transactions not IMMEDIATE
> 2. "they behave as writers, and concurrent transactions are blocked" — this implies to me that all transactions are blocked, and conflicts with the first sentence which says (correctly) that a write lock allows concurrent readers

In the context of enforcing serial execution, it doesn’t matter if all transactions are started with `IMMEDIATE` or if all transactions are started with `EXCLUSIVE`. When using these behaviours, locks are acquired at the beginning of a transaction, and are retained till commit or abort. In `BEGIN IMMEDIATE` mode, all transactions try to acquire `RESERVED` locks at the beginning, meaning there are no readers. If one transaction succeeds, other concurrent transactions get blocked.

Rahul
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Feedback on article about SQLITE_BUSY

Rowan Worth-2
On Fri, 25 Jan 2019 at 13:21, Rahul Jayaraman <[hidden email]>
wrote:

> > 1. "In the above case, since all transactions are started with
> IMMEDIATE,” -- the diagram is actually using EXCLUSIVE transactions not
> IMMEDIATE
> > 2. "they behave as writers, and concurrent transactions are blocked" —
> this implies to me that all transactions are blocked, and conflicts with
> the first sentence which says (correctly) that a write lock allows
> concurrent readers
>
> In the context of enforcing serial execution, it doesn’t matter if all
> transactions are started with `IMMEDIATE` or if all transactions are
> started with `EXCLUSIVE`. When using these behaviours, locks are acquired
> at the beginning of a transaction, and are retained till commit or abort.
> In `BEGIN IMMEDIATE` mode, all transactions try to acquire `RESERVED` locks
> at the beginning, meaning there are no readers. If one transaction
> succeeds, other concurrent transactions get blocked.
>

All true. But why start with this scenario? Having no readers and every
connection explicitly opting in to IMMEDIATE/EXCLUSIVE transactions is an
unusual set of circumstances but the article doesn't clarify this and as it
is the first example it's likely that a new reader will misinterpret this
as describing the default mode of operation.

-Rowan
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Feedback on article about SQLITE_BUSY

rahul
> All true. But why start with this scenario?

While trying to explain operational aspects of SQLite’s isolation implementations, serial execution seemed like the easiest to start with. Figured it might be easy to grasp first, before looking at interleaved operations followed by complications around algorithms (deadlocks, stale snapshots). I read a book sometime back, which covered isolation in a similar way (serial, locks, optimistic) and remember liking the structure from a reader’s perspective.

> and as it is the first example it's likely that a new reader will misinterpret this as describing the default mode of operation.

I do mention when starting the section, that DEFERRED mode is default.

> Having no readers and every connection explicitly opting in to IMMEDIATE/EXCLUSIVE transactions is an unusual set of circumstances but the article doesn't clarify this

I also mention that starting all transactions this way might not be performant, and then start looking at DEFERRED mode. I’m not sure if it’s unusual or not, but in my case (see conclusion of article), I ended up using this method because it was easier to implement for my setup & the performance tradeoff was acceptable.

Rahul

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users