Explicit "read transaction" with journal_mode=WAL.

classic Classic list List threaded Threaded
34 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: [SPAM?] Re: Explicit "read transaction" with journal_mode=WAL.

Simon Slavin-3
On 31 Jul 2019, at 12:57pm, test user <[hidden email]> wrote:

> Is there a standard place where people can request features to be added to SQLite?

Here.  You've already done it.  The developers of SQLite read this list and will consider the things you wanted to do and whether it's worth providing a better way to do them.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [SPAM?] Re: Explicit "read transaction" with journal_mode=WAL.

Richard Hipp-3
On 7/31/19, Simon Slavin <[hidden email]> wrote:
> On 31 Jul 2019, at 12:57pm, test user <[hidden email]> wrote:
>
>> Is there a standard place where people can request features to be added to
>> SQLite?
>
> Here.  You've already done it.  The developers of SQLite read this list and
> will consider the things you wanted to do and whether it's worth providing a
> better way to do them.

Just to confirm: Simon is exactly correct.

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Explicit "read transaction" with journal_mode=WAL.

Igor Tandetnik-2
In reply to this post by test user
On 7/31/2019 6:36 AM, test user wrote:
> As an example, the client of this library could:
>
> - A. Obtain a "read transaction", *without running any SELECTs*.
> - B. Complete 20 write transactions in another process.
> - C. Begin reading from the read transaction (A) at the point before the
> transactions had occurred.

In the current world, that client would execute BEGIN, then wait a bit, then start reading and discover the data written by another process. How does it know that those writes occurred between BEGIN and SELECT, and not before BEGIN? Why would that distinction matter to it in the first place?

Do you envision some other channel of communication and synchronization between these two processes, outside the SQLite database, that would help establish that writes occurred after BEGIN? With SQLite alone, it's impossible to tell whether the sequence of events was A-B-C or B-A-C - both sequences produce the exact same observable behavior. So guarding against B squeezing between A and C seems rather pointless.
--
Igor Tandetnik

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Explicit "read transaction" with journal_mode=WAL.

Larry Brasfield
In reply to this post by test user
Keith writes:

I do not personally see the benefit of moving the repeatable read guarantee to the BEGIN point rather than the first database read after the BEGIN because only fully committed transactions will be visible anyway -- …

Improved modularity would be facilitated by making it possible to actually begin a transaction immediately rather than having it be deferred until the associated DB access occurs.  Granted, when the sequence of events (BEGIN …, SELECT …, … TRANSACTION)  is viewed as a whole, deferring the guarantee makes only a slight difference to the work that must (or should) be done by the application.  But the deferral changes where checking must be done for actually obtaining the guarantee.  Presently, that checking has to be done in the code which makes the queries (or updates, inserts, etc.)  And that checking, which is needed on the first access statement only, is probably done with different requirements for handling the contention error than pertain to following statements.

Pseudo-code examples may clarify my point.

Now:
  Begin transaction;
  if ( failed( do first access ) ) {
    Handle contention error or more unusual errors;
    Rollback transaction;
  }
  else {
    if ( failed ( do subsequent accesses ) ) {
      Handle the rare and arcane access errors;
      Rollback transaction;
    }
    else {
      Commit transaction;
    }
  }

With guarantee moved up to BEGIN:
  If ( failed( begin transaction ) ) {
    Handle contention error;
  }
  else {
    if ( failed ( do must-be-grouped accesses ) ) {
      Handle the rare and arcane access errors;
      Rollback transaction;
    }
    else {
      Commit transaction;
    }
  }

When using C++, C# or another language with object lifetime guarantees, I might wrap the contention checking, waiting, etc., into a Transaction class which bundled the transaction mechanics with (some of) the error handling a contention failure entails.  The client code would read something like:
  using (var t = new Transaction(dbConnection) ) {
    if ( failed ( do must-be-grouped accesses ) ) {
      Handle access errors
      // t does the rollback when it goes out of scope.
    }
    else {
      t.Commit();
    }
  }
This grouping of handling for different error categories is made much less convenient by the present deferral of acquiring the necessary lock(s).

Best regards,
-
Larry Brasfield
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [SPAM?] Re: Explicit "read transaction" with journal_mode=WAL.

Thomas Kurz
In reply to this post by Richard Hipp-3
Would it be possible for you to give some feedback (just an estimation) whether or not a suggestion might be considered?


----- Original Message -----
From: Richard Hipp <[hidden email]>
To: SQLite mailing list <[hidden email]>
Sent: Wednesday, July 31, 2019, 16:10:13
Subject: [sqlite] [SPAM?] Re: Explicit "read transaction" with journal_mode=WAL.

On 7/31/19, Simon Slavin <[hidden email]> wrote:
> On 31 Jul 2019, at 12:57pm, test user <[hidden email]> wrote:

>> Is there a standard place where people can request features to be added to
>> SQLite?

> Here.  You've already done it.  The developers of SQLite read this list and
> will consider the things you wanted to do and whether it's worth providing a
> better way to do them.

Just to confirm: Simon is exactly correct.

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Explicit "read transaction" with journal_mode=WAL.

Simon Slavin-3
In reply to this post by Larry Brasfield
On 31 Jul 2019, at 5:04pm, Larry Brasfield <[hidden email]> wrote:

> I do not personally see the benefit of moving the repeatable read guarantee to the BEGIN point rather than the first database read after the BEGIN because only fully committed transactions will be visible anyway -- …

I can imagine two programs, or two processes, might be communicating in some other way in addition to passing data through a SQLite database.  So they need synchrony.

Consider a bank which takes an audit every day at close-of-business.  This might be declared to be 5pm.  However, accounts are continued to be debited and credited all night, due to interest being added, ATM transactions, etc..  Nevertheless, the audit needs to see a snapshot as of 5pm.

Of course, no bank would be using SQLite for this purpose, because a bank would be using a server/client DBMS.  But you get the idea.

But I think the BEGIN command can be retrofitted without breaking backward compatibility.  The words WRITE and IMMEDIATE should be seen as options.  WRITE means that you want a write lock as well as a read lock.  IMMEDIATE means that you want it now, rather than when the first command of the transaction requires a lock.  Use neither, either, or both.  And EXCLUSIVE gets parsed as IMMEDIATE WRITE.  Thus existing programs continue to do the same thing they always did.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Explicit "read transaction" with journal_mode=WAL.

test user
In reply to this post by Igor Tandetnik-2
Quote: Why would that distinction matter to it in the first place?

So its clear in the API what lock you have at what time.

This would make predicting what happens in concurrent scenarios much easier.

An explicit "read transaction" is a single line of a program.

With an implicit "read transaction", you must know the semantics of SQLite
locking, and look to see where your first SELECT returns SQLITE_OK.


With read transaction:
```
c1.startRead((tx)=>{ // Issues a BEGIN READ
    // X.
    sleep(5 seconds);
    // SELECT... still the same snapshot from point X being read.
});
```


*No* explicit read transaction:
```
c1.startRead((tx)=>{ // Just issues a normal BEGIN
    // X.
    sleep(5 seconds);
    // SELECT.... What ever was written in the last 5 seconds will be in
the result set, which is unexpected.
});
```


With a explicit "read transaction" the programmer can assume that whenever
`startRead` returns to the runtime the snapshot is guaranteed (it does not
matter how long it takes to issue the first SELECT).


In some runtimes, the scheduling of functions is unpredictable, so although
you will not have a `sleep 5` in the code, the runtime can produce this
effect on loaded systems or with programs with long running sync functions.


An example of how you might use this:
- Connection 1: Get a read transaction.

- Connection 2: Get a write transaction, write a lot of data, commit.
- Connection 2: SELECT report summary B.

- Connection 1: SELECT report summary A,

- Diff A and B to see what changed.

On Wed, Jul 31, 2019 at 3:24 PM Igor Tandetnik <[hidden email]> wrote:

> On 7/31/2019 6:36 AM, test user wrote:
> > As an example, the client of this library could:
> >
> > - A. Obtain a "read transaction", *without running any SELECTs*.
> > - B. Complete 20 write transactions in another process.
> > - C. Begin reading from the read transaction (A) at the point before the
> > transactions had occurred.
>
> In the current world, that client would execute BEGIN, then wait a bit,
> then start reading and discover the data written by another process. How
> does it know that those writes occurred between BEGIN and SELECT, and not
> before BEGIN? Why would that distinction matter to it in the first place?
>
> Do you envision some other channel of communication and synchronization
> between these two processes, outside the SQLite database, that would help
> establish that writes occurred after BEGIN? With SQLite alone, it's
> impossible to tell whether the sequence of events was A-B-C or B-A-C - both
> sequences produce the exact same observable behavior. So guarding against B
> squeezing between A and C seems rather pointless.
> --
> Igor Tandetnik
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Explicit "read transaction" with journal_mode=WAL.

Keith Medcalf
In reply to this post by Simon Slavin-3
On Wednesday, 31 July, 2019 10:21, Simon Slavin <[hidden email]> wrote:

>But I think the BEGIN command can be retrofitted without breaking
>backward compatibility.  The words WRITE and IMMEDIATE should be seen
>as options.  WRITE means that you want a write lock as well as a read
>lock.  IMMEDIATE means that you want it now, rather than when the
>first command of the transaction requires a lock.  Use neither,
>either, or both.  And EXCLUSIVE gets parsed as IMMEDIATE WRITE.  Thus
>existing programs continue to do the same thing they always did.

BEGIN [DEFERRED] [TRANSACTION] is the current default and means defer acquiring locks until they are required.
BEGIN IMMEDIATE [TRANSACTION] is currently implemented and acquires a SHARED and an INTENT lock immediately.
BEGIN EXCLUSIVE [TRANSACTION] is currently implemented and acquires an EXCLUSIVE lock immediately.

I believe the idea would be to permit syntax something like:

BEGIN IMMEDIATE [SHARED|[UPDATE]] [TRANSACTION]

which would allow the specification of the type of lock to immediately acquire, either a simple shared lock, or the shared and intent locks as is currently done.  The only change would then be that BEGIN IMMEDIATE SHARED [TRANSACTION] would immediately acquire a shared lock.  All else would remain unchanged.

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.




_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Explicit "read transaction" with journal_mode=WAL.

Richard Hipp-3
In reply to this post by Thomas Kurz
On 7/31/19, Thomas Kurz <[hidden email]> wrote:
> Would it be possible for you to give some feedback (just an estimation)
> whether or not a suggestion might be considered?

Low probability at this time.

The suggestion does not provide any new capability, but it is
something that we would need to test and maintain for the next 31
years.  So it has a high cost and low benefit.
--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Explicit "read transaction" with journal_mode=WAL.

Olivier Mascia
In reply to this post by Keith Medcalf
> Le 31 juil. 2019 à 18:53, Keith Medcalf <[hidden email]> a écrit :
>
> I believe the idea would be to permit syntax something like:
>
> BEGIN IMMEDIATE [SHARED|[UPDATE]] [TRANSACTION]

Keith, I mostly share your view and I like the fact that the proposal uses SHARED and not READ or anything like that because the goal is not to propose some "read-only transaction".

Yet, I fail to understand why:

1) you wrote it in the above way with [SHARED|[UPDATE]] syntax instead of:

> BEGIN IMMEDIATE [SHARED|UPDATE] [TRANSACTION]

and 2) why it would even matter to make provision for the optional UPDATE token.

BEGIN IMMEDIATE [TRANSACTION]
is the current semantic (immediately get RESERVED lock, as a first *write* would do within a DEFERRED TRANSACTION)

BEGIN IMMEDIATE SHARED [TRANSACTION]
would be the new semantic (immediately get SHARED lock, as a first *read* would do within a DEFERRED TRANSACTION)

Forgetting the UPDATE sugar might make the syntax change simpler.
The key issue around this discussion, is probably that it's a syntactic and semantic change that would not be available in older versions.

Applications which have proxy code to begin transactions can very easily attempt some:
SELECT ROWID FROM SQLITE_MASTER LIMIT 0
right after successfully running BEGIN [DEFERRED] [TRANSACTION]
to emulate that IMMEDIATE SHARED feature.

Oh by the way, the syntax might be the following alternative, closer to the current syntax diagrams of SQLite.  I don't know if syntax compatibility provisions with other SQL implementations (and some of its 'standards') would dictate yours or this one.

BEGIN [DEFERRED|SHARED|IMMEDIATE|EXCLUSIVE] [TRANSACTION]

In the end, it surely is not a very necessary feature.
:)
—  
Best Regards, Meilleures salutations, Met vriendelijke groeten, Mit besten Grüßen,
Olivier Mascia


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Explicit "read transaction" with journal_mode=WAL.

Igor Tandetnik-2
In reply to this post by test user
On 7/31/2019 12:32 PM, test user wrote:

> In some runtimes, the scheduling of functions is unpredictable, so although
> you will not have a `sleep 5` in the code, the runtime can produce this
> effect on loaded systems or with programs with long running sync functions.
>
>
> An example of how you might use this:
> - Connection 1: Get a read transaction.
>
> - Connection 2: Get a write transaction, write a lot of data, commit.
> - Connection 2: SELECT report summary B.
>
> - Connection 1: SELECT report summary A,
>
> - Diff A and B to see what changed.

Suppose you discovered that B and A are in fact the same. How do you know whether that occurred because a) "get a read transaction" is "broken" in that it doesn't actually acquire the lock as you expected, or because b) Connection 2 just happened to write lots of data and commit before connection 1 obtained a read transaction?

In other words, in your example A == B is possible even if BEGIN worked the way you expect it to work, and grabbed a read lock immediately. Similarly, A != B is possible with BEGIN working the way it does now, if SELECT on Connection 1 just happens to beat the write on Connection 2. It's a matter of timing and scheduling, which you yourself posit is unpredictable.

So, since both A==B and A!=B are possible with either behavior of BEGIN, why again do you care how BEGIN behaves?
--
Igor Tandetnik

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [SPAM?] Re: Explicit "read transaction" with journal_mode=WAL.

Richard Damon
If, and it is possible for there to be sequencing to enforce it, You KNOW that step 1, get a read transaction (which the OP presumes implies includes getting the lock) occurs before step 2, and thus step 4 seeing changes from step 3 says something is wrong.

Yes, if you only can use the database to communicate, this may not be possible, but if the program does the get a read transaction itself, and only after getting the confirmation starts the process that does the write, it can positively know that it ‘started’ the read transaction before the write transaction was started.

SQLite (I believe) documents that the BEGIN statement doesn’t set the read lock until the select, so its behavior is as documented, just not as the OP desires. For them, it is desired that the beginning of the transaction also gets the lock, so the their API point of begin a transaction marks the point that the read data is locked to.

One option, that they are pursuing (but seems to have been denied) is to have SQLite have a variant of BEGIN that includes getting the read lock. Their other option is to just do a dummy SELECT on the database to force getting the lock. The callers to their API won’t be able to tell the difference, except maybe a bit of execution time.

> On Jul 31, 2019, at 4:46 PM, Igor Tandetnik <[hidden email]> wrote:
>
>> On 7/31/2019 12:32 PM, test user wrote:
>> In some runtimes, the scheduling of functions is unpredictable, so although
>> you will not have a `sleep 5` in the code, the runtime can produce this
>> effect on loaded systems or with programs with long running sync functions.
>> An example of how you might use this:
>> - Connection 1: Get a read transaction.
>> - Connection 2: Get a write transaction, write a lot of data, commit.
>> - Connection 2: SELECT report summary B.
>> - Connection 1: SELECT report summary A,
>> - Diff A and B to see what changed.
>
> Suppose you discovered that B and A are in fact the same. How do you know whether that occurred because a) "get a read transaction" is "broken" in that it doesn't actually acquire the lock as you expected, or because b) Connection 2 just happened to write lots of data and commit before connection 1 obtained a read transaction?
>
> In other words, in your example A == B is possible even if BEGIN worked the way you expect it to work, and grabbed a read lock immediately. Similarly, A != B is possible with BEGIN working the way it does now, if SELECT on Connection 1 just happens to beat the write on Connection 2. It's a matter of timing and scheduling, which you yourself posit is unpredictable.
>
> So, since both A==B and A!=B are possible with either behavior of BEGIN, why again do you care how BEGIN behaves?
> --
> Igor Tandetnik
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Explicit "read transaction" with journal_mode=WAL.

test user
In reply to this post by Igor Tandetnik-2
Quote: Connection 2 just happened to write lots of data and commit before
connection 1 obtained a read transaction
Quote: if SELECT on Connection 1 just happens to beat the write on
Connection 2

- The order in the example is exact, not a guess what might happen.
- Each step runs in a single thread on an event loop and is awaited, and
assumed to return with OK.
- The read transaction at the start is definite (supposing BEGIN READ
existing).


All I am saying is on "Connection 1: Get read transaction" could be:

BEGIN READ

NOT

BEGIN; SELECT * FROM some_table;


Quote: Why again do you care how BEGIN behaves

Im just suggesting that if you can obtain a write transaction with a single
trip over the FFI and an explicit command "BEGIN IMMEDIATE" the same could
be true for "read transactions"

I care because I think its a better API design which would then allow
better higher level libraries.






On Wed, Jul 31, 2019 at 9:46 PM Igor Tandetnik <[hidden email]> wrote:

> On 7/31/2019 12:32 PM, test user wrote:
> > In some runtimes, the scheduling of functions is unpredictable, so
> although
> > you will not have a `sleep 5` in the code, the runtime can produce this
> > effect on loaded systems or with programs with long running sync
> functions.
> >
> >
> > An example of how you might use this:
> > - Connection 1: Get a read transaction.
> >
> > - Connection 2: Get a write transaction, write a lot of data, commit.
> > - Connection 2: SELECT report summary B.
> >
> > - Connection 1: SELECT report summary A,
> >
> > - Diff A and B to see what changed.
>
> Suppose you discovered that B and A are in fact the same. How do you know
> whether that occurred because a) "get a read transaction" is "broken" in
> that it doesn't actually acquire the lock as you expected, or because b)
> Connection 2 just happened to write lots of data and commit before
> connection 1 obtained a read transaction?
>
> In other words, in your example A == B is possible even if BEGIN worked
> the way you expect it to work, and grabbed a read lock immediately.
> Similarly, A != B is possible with BEGIN working the way it does now, if
> SELECT on Connection 1 just happens to beat the write on Connection 2. It's
> a matter of timing and scheduling, which you yourself posit is
> unpredictable.
>
> So, since both A==B and A!=B are possible with either behavior of BEGIN,
> why again do you care how BEGIN behaves?
> --
> Igor Tandetnik
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Explicit "read transaction" with journal_mode=WAL.

Jean-Christophe Deschamps-3
In reply to this post by Simon Slavin-3
Simon,

>Consider a bank which takes an audit every day at
>close-of-business.  This might be declared to be 5pm.  However,
>accounts are continued to be debited and credited all night, due to
>interest being added, ATM transactions, etc..  Nevertheless, the audit
>needs to see a snapshot as of 5pm.
>
>Of course, no bank would be using SQLite for this purpose, because a
>bank would be using a server/client DBMS.  But you get the idea.

I don't believe this can be any close to a real-world scenario,
client/server architecture set aside.

Being able to issue and process a BEGIN SHARED IMMEDIATE for the read
lock be in place at exactly 5pm, without ever missing a transaction
performed from elsewhere at 04:59:59:999.99 nor including a transaction
commited at 05:00:00:0.001 seems to be an impossible task in practice.

One may find it uncomfortable to ignore the delay between BEGIN is
issued and when the next SELECT gets the lock set, but in fact you
would never know either the delay between your program issuing BEGIN
SHARED IMMEDIATE and the precise moment the lock is actually setup,
unless under a low-load real-time OS providing explicit garantees on
various exec times.  And even there, I'm not that sure.

 From my remote/naive viewpoint, this is a misuse of a RDBMS relying on
DIY.  When you want/need to be sure what you are going to read is ante
<some timestamp> the only serious way is to include a precise enough
timestamp in data rows and limit the select using it.

Indeed if you would want to do that and if you need to be just on time,
you would rather use rock-solid:

select <columns> from <table> ... where timestamp between
<begin-period> and <end-period>

I still fail to imagine a useful use case for such feature.

As Igor shown, A==B and A!=B are indiscernable.

Further in the thread the argument of "symetry" between BEGIN SHARED
IMMEDIATE and BEGIN IMMEDIATE is only a surface view, because the arrow
of time is one-way.  A real symetry would be a BEGIN IMMEDIATE TO BE
COMMITED BEFORE <timestamp> but that clearly doesn't make any sense.

JcD

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
12