WAL grows without bounds, short concurrent writes & reads

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

WAL grows without bounds, short concurrent writes & reads

Florian Uekermann
Hi,

A very simple reproducer bash script using the sqlite3 CLI is appended
at the end.

I am using WAL mode in a setting with sequential writes and many
concurrent reads. Due to WAL mode the readers don't get blocked, which
is great and since writes are sequential, they never get blocked either.
However, I am seeing the WAL grow on every write without ever restarting
at the beginning of the file if there is a constant influx of new reads
(with limited lifetime).
This causes the WAL file to grow to many GB within minutes, even if the
database state fits into a few MB or even kB after closing all connections.

The output of "PRAGMA wal_checkpoint(PASSIVE);" usually looks like this:
"0|123|123", which I interpret as the checkpointer being caught up on
the current state. I believe the reason that new writes are appended at
the end, instead of restarting the WAL, is that while reads are
short-lived, there is always at least one going on, so the log of the
last write has to be kept, which in turn prevents a reset of the WAL.

An example read (r) write (w) pattern could look like this (b: begin, e:
end):
r1_b; w1; r2b; r1e; w2; r2b; w3; r3b; r2e ...

A solution could be to start a second WAL when the first one exceeds
some size threshold, which would allow resetting the first one after all
readers finish that started before the wal_checkpoint finished, even if
there are new writes in the second WAL. Then the roles/order of the two
WALs flipped, allowing the second WAL to be reset regardless of
read/write frequency.
I believe that would limit the total WAL size to about 2 times of the
size of writes happening within the timespan of a single read.

This solution has been suggested previously on this list by Mark
Hamburg, but the threads lack a simple reproducer and detailed problem
description.

Best regards,
Florian

Test script:

rm -f test.sqlite
./sqlite3 test.sqlite <<< "
PRAGMA journal_mode=WAL;
CREATE TABLE t (value INTEGER);
REPLACE into t (rowid, value) VALUES (1,0);
"

for i in {1..1000}
do

./sqlite3 test.sqlite <<< "
BEGIN;
SELECT value FROM t WHERE rowid=1;
.system sleep 0.2
SELECT value FROM t WHERE rowid=1;
COMMIT;
" &

sleep 0.1
./sqlite3 test.sqlite <<< "
BEGIN;
REPLACE into t (rowid, value) VALUES (1,$i);
.print inc
COMMIT;
"
wc -c test.sqlite-wal
done

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: WAL grows without bounds, short concurrent writes & reads

Dan Kennedy-4

There's code here, if you want to experiment with it:

   https://sqlite.org/src/timeline?r=wal2

Docs:

   https://sqlite.org/src/artifact/a807405a05e19a49

Dan.



On 29/3/62 01:33, Florian Uekermann wrote:

> Hi,
>
> A very simple reproducer bash script using the sqlite3 CLI is appended
> at the end.
>
> I am using WAL mode in a setting with sequential writes and many
> concurrent reads. Due to WAL mode the readers don't get blocked, which
> is great and since writes are sequential, they never get blocked either.
> However, I am seeing the WAL grow on every write without ever
> restarting at the beginning of the file if there is a constant influx
> of new reads (with limited lifetime).
> This causes the WAL file to grow to many GB within minutes, even if
> the database state fits into a few MB or even kB after closing all
> connections.
>
> The output of "PRAGMA wal_checkpoint(PASSIVE);" usually looks like
> this: "0|123|123", which I interpret as the checkpointer being caught
> up on the current state. I believe the reason that new writes are
> appended at the end, instead of restarting the WAL, is that while
> reads are short-lived, there is always at least one going on, so the
> log of the last write has to be kept, which in turn prevents a reset
> of the WAL.
>
> An example read (r) write (w) pattern could look like this (b: begin,
> e: end):
> r1_b; w1; r2b; r1e; w2; r2b; w3; r3b; r2e ...
>
> A solution could be to start a second WAL when the first one exceeds
> some size threshold, which would allow resetting the first one after
> all readers finish that started before the wal_checkpoint finished,
> even if there are new writes in the second WAL. Then the roles/order
> of the two WALs flipped, allowing the second WAL to be reset
> regardless of read/write frequency.
> I believe that would limit the total WAL size to about 2 times of the
> size of writes happening within the timespan of a single read.
>
> This solution has been suggested previously on this list by Mark
> Hamburg, but the threads lack a simple reproducer and detailed problem
> description.
>
> Best regards,
> Florian
>
> Test script:
>
> rm -f test.sqlite
> ./sqlite3 test.sqlite <<< "
> PRAGMA journal_mode=WAL;
> CREATE TABLE t (value INTEGER);
> REPLACE into t (rowid, value) VALUES (1,0);
> "
>
> for i in {1..1000}
> do
>
> ./sqlite3 test.sqlite <<< "
> BEGIN;
> SELECT value FROM t WHERE rowid=1;
> .system sleep 0.2
> SELECT value FROM t WHERE rowid=1;
> COMMIT;
> " &
>
> sleep 0.1
> ./sqlite3 test.sqlite <<< "
> BEGIN;
> REPLACE into t (rowid, value) VALUES (1,$i);
> .print inc
> COMMIT;
> "
> wc -c test.sqlite-wal
> done
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: WAL grows without bounds, short concurrent writes & reads

Joshua Thomas Wise
Dan, are there any plans to merge the wal2 branch into the trunk?


> On Mar 29, 2019, at 7:10 AM, Dan Kennedy <[hidden email]> wrote:
>
>
> There's code here, if you want to experiment with it:
>
>   https://sqlite.org/src/timeline?r=wal2
>
> Docs:
>
>   https://sqlite.org/src/artifact/a807405a05e19a49
>
> Dan.
>
>
>
> On 29/3/62 01:33, Florian Uekermann wrote:
>> Hi,
>>
>> A very simple reproducer bash script using the sqlite3 CLI is appended at the end.
>>
>> I am using WAL mode in a setting with sequential writes and many concurrent reads. Due to WAL mode the readers don't get blocked, which is great and since writes are sequential, they never get blocked either.
>> However, I am seeing the WAL grow on every write without ever restarting at the beginning of the file if there is a constant influx of new reads (with limited lifetime).
>> This causes the WAL file to grow to many GB within minutes, even if the database state fits into a few MB or even kB after closing all connections.
>>
>> The output of "PRAGMA wal_checkpoint(PASSIVE);" usually looks like this: "0|123|123", which I interpret as the checkpointer being caught up on the current state. I believe the reason that new writes are appended at the end, instead of restarting the WAL, is that while reads are short-lived, there is always at least one going on, so the log of the last write has to be kept, which in turn prevents a reset of the WAL.
>>
>> An example read (r) write (w) pattern could look like this (b: begin, e: end):
>> r1_b; w1; r2b; r1e; w2; r2b; w3; r3b; r2e ...
>>
>> A solution could be to start a second WAL when the first one exceeds some size threshold, which would allow resetting the first one after all readers finish that started before the wal_checkpoint finished, even if there are new writes in the second WAL. Then the roles/order of the two WALs flipped, allowing the second WAL to be reset regardless of read/write frequency.
>> I believe that would limit the total WAL size to about 2 times of the size of writes happening within the timespan of a single read.
>>
>> This solution has been suggested previously on this list by Mark Hamburg, but the threads lack a simple reproducer and detailed problem description.
>>
>> Best regards,
>> Florian
>>
>> Test script:
>>
>> rm -f test.sqlite
>> ./sqlite3 test.sqlite <<< "
>> PRAGMA journal_mode=WAL;
>> CREATE TABLE t (value INTEGER);
>> REPLACE into t (rowid, value) VALUES (1,0);
>> "
>>
>> for i in {1..1000}
>> do
>>
>> ./sqlite3 test.sqlite <<< "
>> BEGIN;
>> SELECT value FROM t WHERE rowid=1;
>> .system sleep 0.2
>> SELECT value FROM t WHERE rowid=1;
>> COMMIT;
>> " &
>>
>> sleep 0.1
>> ./sqlite3 test.sqlite <<< "
>> BEGIN;
>> REPLACE into t (rowid, value) VALUES (1,$i);
>> .print inc
>> COMMIT;
>> "
>> wc -c test.sqlite-wal
>> done
>>
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: WAL grows without bounds, short concurrent writes & reads

Richard Hipp-3
On 3/29/19, Joshua Wise <[hidden email]> wrote:
> Dan, are there any plans to merge the wal2 branch into the trunk?

No, not at this time.


--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: WAL grows without bounds, short concurrent writes & reads

Joshua Thomas Wise
Awww sadness. Regardless, thanks for the update.


> On Mar 29, 2019, at 11:07 AM, Richard Hipp <[hidden email]> wrote:
>
> On 3/29/19, Joshua Wise <[hidden email]> wrote:
>> Dan, are there any plans to merge the wal2 branch into the trunk?
>
> No, not at this time.
>
>
> --
> D. Richard Hipp
> [hidden email]
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users