Atomic DELETE index optimisation?


Re: Atomic DELETE index optimisation?

Keith Medcalf

Hmmm.  4.6 GB in the Filesystem cache (the Level 2 cache), and only 20000 pages (78 MB or thereabouts) in the process pagecache (the Level 1 cache).  And with only 4.6 GB in the Level 2 cache, the I/O rate drops to almost zero.  If you "moved" that memory (or some of it) from Level 2 to Level 1 you would increase performance tremendously.

pragma cache_size=262144; -- 1 GB page cache

Will increase the page cache to 1 GB.  Might not be big enough, but should be much better.
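One caveat: a positive cache_size is counted in pages, so 262144 pages only works out to 1 GB at the default 4 KiB page size. A negative value sets the cache in KiB regardless of page size. A quick sketch with Python's sqlite3 module (throwaway in-memory database):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Positive cache_size values are counted in pages (262144 * 4 KiB default
# pages = 1 GiB). A negative value is interpreted as KiB directly,
# independent of the database's page size.
conn.execute("PRAGMA cache_size = -1048576")  # ~1 GiB
size = conn.execute("PRAGMA cache_size").fetchone()[0]
print(size)  # -1048576
```

The negative form is safer when you don't control the page size of the database you open.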

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.


>-----Original Message-----
>From: sqlite-users [mailto:sqlite-users-
>[hidden email]] On Behalf Of Dinu
>Sent: Sunday, 17 December, 2017 15:55
>To: [hidden email]
>Subject: Re: [sqlite] Atomic DELETE index optimisation?
>
>SQLite 3.19.3
>CentOS 7 (64 bit)
>RAM: 6G total, 4.4G buff/cache, 0.6G unused
>Sqlite memory: 133M RES, 0 SWAP, 0 DIRTY - don't know if it's
>relevant
>anymore, the query seems to have entered into another execution
>phase, it
>looks like now it's flushing the WAL.
>No other PRAGMA
>No FKs, no triggers.



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: Atomic DELETE index optimisation?

Rowan Worth-2
In reply to this post by Dinu
On 18 December 2017 at 06:59, Dinu <[hidden email]> wrote:

> Richard Hipp-3 wrote
> > Can you try this:
>
> (1) BEGIN TRANSACTION
> > (2) DROP all indexes from the table
> > (3) Do the DELETE operation
> > (4) CREATE all indexes again
> > (5) COMMIT
> Thanks Richard, as mentioned earlier, any structure change is unacceptable
> due to concurrent reader clients.
>

I'm not sure what you're worried about? Dropping and recreating identical
indices within a transaction won't cause a visible structure change to
concurrent readers -- that's the point of a transaction. Unless perhaps
those readers are soooo old they're using the deprecated sqlite3_prepare
rather than prepare_v2, and fall over when encountering SQLITE_SCHEMA?

-Rowan

Re: [EXTERNAL] Re: Atomic DELETE index optimisation?

Hick Gunter
In reply to this post by Dinu
If I understand correctly, you are running a single statement equivalent to

DELETE FROM <table> WHERE <mostly_true>;

Since SQLite can drop a whole table very much faster than it can delete its rows one by one, try:

BEGIN;
CREATE TABLE temp_data AS SELECT * FROM old_data WHERE <want_to_keep>;
DROP TABLE old_data;
ALTER TABLE temp_data RENAME TO old_data;
CREATE INDEX ...; -- repeat for all indices of your table
COMMIT;

Finding the rows you want to keep should be fast enough, since any helpful index is still present. Dropping the whole table plus its associated indices is a very fast operation (just a few changes to sqlite_master and the free page list). Recreating the indexes afterwards will be much faster than seeking out and destroying the majority of rows and index entries.
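Sketched end-to-end with Python's sqlite3 module (the table, column, and the keep-25%-of-rows predicate are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE old_data(x INTEGER)")
conn.execute("CREATE INDEX old_data_x ON old_data(x)")
conn.executemany("INSERT INTO old_data VALUES (?)", [(i,) for i in range(1000)])

conn.execute("BEGIN")
# Copy only the rows worth keeping, then drop the old table wholesale.
conn.execute("CREATE TABLE temp_data AS SELECT * FROM old_data WHERE x >= 750")
conn.execute("DROP TABLE old_data")  # also drops old_data's indices
conn.execute("ALTER TABLE temp_data RENAME TO old_data")
conn.execute("CREATE INDEX old_data_x ON old_data(x)")  # recreate all indexes
conn.execute("COMMIT")

kept = conn.execute("SELECT count(*) FROM old_data").fetchone()[0]
print(kept)  # 250
```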

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:[hidden email]] Im Auftrag von Dinu
Gesendet: Sonntag, 17. Dezember 2017 23:02
An: [hidden email]
Betreff: [EXTERNAL] Re: [sqlite] Atomic DELETE index optimisation?

Ok, so to address the size of the problem properly:

We have ~32M records out of which we are deleting about ~24M.
The raw size of the DB is about 20G out of which 15G goes away. Under these circumstances any page cache becomes irrelevant.
The delete operation, which is a single delete run from the sqlite3 tool, is running for 3 hours now.
The WAL size is in excess of 9G atm.
The WAL size is growing ever more slowly (it was about 10M/s, now it's 1M every 2s, still slowing down). This indicates to me a non-linear process, which I can link only to the B-trees; they are the only non-linear component I can think of that could cause this slowing-down-to-a-drag. The CPU is pegged while the HDD sits at idle level, which also hints at the same issue.

In reply to your remarks:
- I understand the B-trees need to be modified. However, if you prune (and maybe rebalance) and write to disk one node at a time, that is hugely stressful and inefficient when you are pruning half of a 32M-node tree.
Since the operation is atomic, the indexes could be updated once and only once.

So, what to do? I think this delete may never end... in the time it took to write this reply, the WAL growth has slowed to 1M every 4s.





___________________________________________
 Gunter Hick | Software Engineer | Scientific Games International GmbH | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.

Re: [EXTERNAL] Re: Atomic DELETE index optimisation?

Hick Gunter
In reply to this post by Dinu
SQLite does this too (I'm not sure about the "sort rowid" bit, but it would seem reasonable); and similarly for an update, it will first SELECT the affected rows in their result form and insert them all later.

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:[hidden email]] Im Auftrag von Dinu
Gesendet: Sonntag, 17. Dezember 2017 23:13
An: [hidden email]
Betreff: [EXTERNAL] Re: [sqlite] Atomic DELETE index optimisation?

By "one time only", I mean that, in my understanding, the way most DBs handle a DELETE is this: cache the ROWIDs while deleting data rows from the main table and from the indexes; then, when all ROWIDs are explored, sort the ROWID stream and prune the trees from the sorted stream. This is both highly efficient (just like inserts, deletes of already-ordered records are very efficient) and highly parallelizable.






Re: Atomic DELETE index optimisation?

R Smith-2
In reply to this post by Dinu

On 2017/12/18 12:59 AM, Dinu wrote:
> Thanks Richard, as mentioned earlier, any structure change is
> unacceptable due to concurrent reader clients.

Within a transaction it matters none to those concurrent readers -
that's the entire point of the WAL + Transaction. As long as you do all
the Index dropping and recreating within the same transaction, no reader
will be affected or any wiser.

 From the point of the reader, there would be no actual schema change.
It would be impossible to fail on one and succeed on the other (unless
your actual CREATE INDEX clauses differ from their predecessors or it
has been taught to fail when the schema version counter goes up, but
that would be the worst design in DB history, so I doubt it).

Also, I assume you have copied the file (taken it off-line) and are
experimenting on it, yes?

If so, set up some concurrent readers, and do the transaction Richard
suggested. If that fails (and it won't) then come back and share with
us, because then we will know some other Gremlin is afoot, but right now
it seems you are hampered by the preconceived notions of what the DB is
doing (or how schema changes will affect things on the inside) via
Sherlockian deduction rather than actual scientific testing - This is
not a judgement, we all do it, repeating tests at those sizes for hours
and hours is not my favourite thing, but I've been wrong enough times
and learned the hard way.


Cheers,
Ryan


Re: Atomic DELETE index optimisation?

Dinu
In reply to this post by Rowan Worth-2
Rowan Worth-2 wrote
> I'm not sure what you're worried about? Dropping and recreating identical
> indices within a transaction won't cause a visible structure change to
> concurrent readers -- that's the point of a transaction.

I honestly don't see how, in any DB system, the client process would not crash
if the index it's running a cursor on were to be removed. Even if SQLite
were to pull this magic out of the hat, starving client processes for the
lack of an index (a full-scan query would probably take in excess of 30s)
would quickly pile up the clients to the point where one would have to kill
them anyway.
So with this in mind, I'm really not looking for a barbaric fix to this; I'm
trying instead to understand the problem and find a viable, semantically
stable solution (and maybe trigger some improvements in SQLite, if there's a
system bug).





Re: Atomic DELETE index optimisation?

Paul Sanderson
Not sure how relevant it might be, but what page size is the DB set to and
what is the average size of a record?


Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence

On 18 December 2017 at 10:03, Dinu <[hidden email]> wrote:

> Rowan Worth-2 wrote
> > I'm not sure what you're worried about? Dropping and recreating identical
> > indices within a transaction won't cause a visible structure change to
> > concurrent readers -- that's the point of a transaction.
>
> I honestly don't see how in any DB system the client process would not
> crash
> if the index it's running a curson on were to be removed. Even if SQLite
> were to pull this magic out of the hat, starving client processes for the
> lack of an index (a full scan query would probably take in excess of 30s)
> would quickly pile up the clients to the point where one would have to kill
> them anyway.
> So with this in mind, I'm really not looking for a barbaric fix to this,
> I'm
> more of tryng to understand the problem and find a viable, semantically
> stable solution (and maybe trigger some improvements in SQLite, if there's
> a
> system bug).

Re: Atomic DELETE index optimisation?

Rowan Worth-2
In reply to this post by Dinu
On 18 December 2017 at 18:03, Dinu <[hidden email]> wrote:

> Rowan Worth-2 wrote
> > I'm not sure what you're worried about? Dropping and recreating identical
> > indices within a transaction won't cause a visible structure change to
> > concurrent readers -- that's the point of a transaction.
>
> I honestly don't see how in any DB system the client process would not
> crash
> if the index it's running a curson on were to be removed. Even if SQLite
> were to pull this magic out of the hat,


The entire point of SQL transactions is to implement this magic! All other
clients see the state either before or after the transaction.


> starving client processes for the
> lack of an index (a full scan query would probably take in excess of 30s)
> would quickly pile up the clients to the point where one would have to kill
> them anyway.
>

The other clients won't run without an index. They won't have their current
index ripped out from under them either.

They _will_ have to wait for the deletion transaction to finish up. But
that's an issue regardless - once the delete transaction's memory cache
spills, it will have exclusive access to the database until the transaction
COMMI-- ah but wait, I'm talking about rollback journal mode, because
that's what I'm familiar with. WAL has different concurrency
characteristics. I'll leave the details to someone else, but I'm 100% sure
you still get atomic transactions.


> So with this in mind, I'm really not looking for a barbaric fix to this,
> I'm
> more of tryng to understand the problem and find a viable, semantically
> stable solution (and maybe trigger some improvements in SQLite, if there's
> a
> system bug).
>

Seeing what effect dropping/recreating the indices has on the overall speed
_IS_ a step in understanding the problem, no?
-Rowan

Re: [EXTERNAL] Re: Atomic DELETE index optimisation?

Dinu
In reply to this post by Hick Gunter
Hick Gunter wrote
> SQLite does this too

Thanks!





Re: Atomic DELETE index optimisation?

Dinu
In reply to this post by Keith Medcalf
Keith Medcalf wrote
> If you "moved" that memory (or some of it) from Level 2 to Level 1 you
> would increase performance tremendously.
>
> pragma cache_size=262144; -- 1 GB page cache

Thanks, I will try that!





Re: [EXTERNAL] Re: Atomic DELETE index optimisation?

Hick Gunter
In reply to this post by Dinu
As you are running in WAL mode, readers will still be seeing the state of the data BEFORE you started the delete transaction. This holds true also for readers that run while your delete transaction is running. Nobody is pulling out the rug from under any transaction at all.

Even though you are running in WAL mode, any client process attempting to write will still be blocked until AFTER the delete transaction FINISHES. As you have discovered, this means HOURS in your case.

The "total replacement" algorithm should minimize the overall effort for your "bulk delete" operation. The improvements may be several orders of magnitude, but you will have to find out the exact scale for your data and your application and decide if the improved processing time is compatible with your requirements.

The "piecemeal delete" approach requires adapting the delete statement(s) so that the time taken fits into your requirements, e.g.

DELETE FROM table WHERE rowid IN (SELECT rowid FROM table WHERE <deleteable> LIMIT <quickdeletelimit>);

With <quickdeletelimit> replaced by the number of rows that gives an acceptable latency for concurrent updates, and the delete transactions spaced out so that other clients have a chance to perform their writes too.

Current websites attempt to achieve 500ms reaction times, so maybe running a delete that takes 100ms every second will do...
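The piecemeal approach can be sketched in Python's sqlite3 module like this (the table `t`, its column `x`, and the `x < 900` predicate are all invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
conn.commit()

def delete_in_batches(conn, limit=100):
    """Delete matching rows a small batch per transaction, so concurrent
    writers only ever wait out one short transaction, never the whole job."""
    while True:
        cur = conn.execute(
            "DELETE FROM t WHERE rowid IN "
            "(SELECT rowid FROM t WHERE x < 900 LIMIT ?)", (limit,))
        conn.commit()
        if cur.rowcount == 0:  # no rows deleted: predicate exhausted
            break

delete_in_batches(conn)
remaining = conn.execute("SELECT count(*) FROM t").fetchone()[0]
print(remaining)  # 100
```

In a real deployment you would also sleep between batches, per the latency budget described above.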

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:[hidden email]] Im Auftrag von Dinu
Gesendet: Montag, 18. Dezember 2017 11:03
An: [hidden email]
Betreff: [EXTERNAL] Re: [sqlite] Atomic DELETE index optimisation?

Rowan Worth-2 wrote
> I'm not sure what you're worried about? Dropping and recreating
> identical indices within a transaction won't cause a visible structure
> change to concurrent readers -- that's the point of a transaction.

I honestly don't see how in any DB system the client process would not crash if the index it's running a curson on were to be removed. Even if SQLite were to pull this magic out of the hat, starving client processes for the lack of an index (a full scan query would probably take in excess of 30s) would quickly pile up the clients to the point where one would have to kill them anyway.
So with this in mind, I'm really not looking for a barbaric fix to this, I'm more of tryng to understand the problem and find a viable, semantically stable solution (and maybe trigger some improvements in SQLite, if there's a system bug).





Re: Atomic DELETE index optimisation?

Dinu
In reply to this post by Rowan Worth-2
Rowan Worth-2 wrote
> The entire point of SQL transactions is to implement this magic!

I beg to differ; you proposed:
> (1) BEGIN TRANSACTION
> > (2) DROP all indexes from the table
This is by no means valid SQL semantics; in all RDBMS I've worked with,
structure changes' relation to a transaction is undefined at best. Even if
it works now, there's no guarantee the "BEGIN; DROP" behavior will stay
consistent at any time in the future. So in respect to this, the alternative
of copying to a different table, TRUNCATE, and copying back looks much more
semantically acceptable.

Rowan Worth-2 wrote
> WAL has different concurrency characteristics.

Yeap, it's supposed to do just this: keep readers from starving until a
write is complete. The WAL flush works quite well, it seems, keeping the HDD
at a respectable 100%, so the WAL flush should take less than 30s.
Populating the table, on the other hand, takes much longer.




Re: Atomic DELETE index optimisation?

Richard Hipp-3
On 12/18/17, Dinu <[hidden email]> wrote:
> Rowan Worth-2 wrote
>> The entire point of SQL transactions is to implement this magic!
>
> I beg to differ; you proposed:
>> (1) BEGIN TRANSACTION
>> > (2) DROP all indexes from the table
> This is by no means valid SQL semantics; in all RDBBMS I've worked with,
> structure changes' relation to a transaction is undefined at best.

DROP, CREATE, and ALTER are transactional in SQLite, just like DELETE,
INSERT, and UPDATE.  This has always been the case, and always shall
be.
--
D. Richard Hipp
[hidden email]

Re: Atomic DELETE index optimisation?

Dinu
Richard Hipp-3 wrote
> DROP, CREATE, and ALTER are transactional in SQLite, just like DELETE,
> INSERT, and UPDATE.  This has always been the case, and always shall
> be.

Thanks! But still, isn't this incompatible with any lower-than-DB-level
transaction locking mechanism? I.e., should you ever have row-level locking,
it absolutely needs a persistent, cross-connection ROWID index, while any
FOR UPDATE locking semantics need persistent gap-locking indexes... Just a
thought for the distant future, I realize it's not a discussion to have now
:)


Keith Medcalf wrote
> pragma cache_size=262144; -- 1 GB page cache

Actually I realized that the DB page size is 1K. Is this bad? I tried to run
the pragma query with 1M pages, to amount to the same 1G; there was
a dramatic improvement in throughput at the beginning of the query, but it
quickly succumbs to the nonlinear slow-down-to-a-drag nonetheless.

Richard, leaving alternate strategies aside (DROP, TRUNCATE, etc.), what
could be the reason behind this non-linear delete behavior? Why does it slow
down to a grinding halt? It would help me tremendously to know; we
are in the process of migrating more than just this table to SQLite, so
knowledge about the inner workings of SQLite helps us greatly. Thanks!




Re: Atomic DELETE index optimisation?

Simon Slavin-3
In reply to this post by Dinu


On 18 Dec 2017, at 10:03am, Dinu <[hidden email]> wrote:

> I honestly don't see how in any DB system the client process would not crash
> if the index it's running a curson on were to be removed.

SQLite doesn’t run cursors.  There are no cursor commands in the SQLite API.

SQLite does not lock tables or indexes.  If anything needs locking (for example, if a transaction starts to make a change) then the entire database is locked.  If a connection regains access to its data (because another connection has released its lock) it does not assume nothing has changed.  This cannot be changed without a radical rethink and rewrite of SQLite.

Simon.

Re: Atomic DELETE index optimisation?

Simon Slavin-3
In reply to this post by Dinu
On 18 Dec 2017, at 12:28pm, Dinu <[hidden email]> wrote:

> Actually I realized that the DB page size is 1K. Is this bad? I tried to run
> the pragma query with 1M pages, to amount to the same 1G; there seems to be
> a dramatic improvement in throughput at the beginning of the query,

The problem is not the page size, or the total number of pages.  The problem is the total data size (i.e. the number of pages times the page size).

> but it
> quickly succombs to nonlinear slow-down-to-a-drag nonetheless.
>
> Richard, leaving alternate strategies aside (DROP, TRUNCATE etc.), what
> could be the reason behing this non-linear delete behavior?

This is not behaviour built into the source code for SQLite.  What you are seeing is the result of the size of a cache: things run quickly while the changes fit within a certain cache size, and once you’ve bust the cache things slow down.

Which cache is being bust can be harder to figure out.  There are at least three involved, and only one of them is under SQLite’s control.

In this thread we’ve given you several things to check.  The latest seem to be:

1) Change your journal mode from WAL to PERSIST (or vice versa) and see if that helps.

2) Try this:

BEGIN IMMEDIATE
    create a temporary table with the same columns as MyTable, but no indexes
    copy the rows you need to keep to the temporary table
    DELETE FROM MyTable
    INSERT INTO MyTable (SELECT * FROM TempTable)
COMMIT
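That recipe, sketched with Python's sqlite3 module (`MyTable`, its column, and the keep-predicate are invented for illustration). Note that the bare `DELETE FROM MyTable` with no WHERE clause is eligible for SQLite's whole-table truncate optimization when there are no triggers or foreign keys:

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE MyTable(x INTEGER)")
conn.executemany("INSERT INTO MyTable VALUES (?)", [(i,) for i in range(1000)])

conn.execute("BEGIN IMMEDIATE")
# Park the rows worth keeping in an index-free temp table.
conn.execute("CREATE TEMP TABLE keep AS SELECT * FROM MyTable WHERE x >= 900")
conn.execute("DELETE FROM MyTable")  # no WHERE: whole-table delete
conn.execute("INSERT INTO MyTable SELECT * FROM keep")
conn.execute("DROP TABLE keep")
conn.execute("COMMIT")

kept = conn.execute("SELECT count(*) FROM MyTable").fetchone()[0]
print(kept)  # 100
```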

Have you tried these things ?  Did the time taken improve or get worse ?

Simon.

Re: Atomic DELETE index optimisation?

Dinu
Simon Slavin-3 wrote
> Once you’ve bust the cache things slow down.

I do realize that. However, not nonlinearly: once I bust the cache,
throughput should drop X times and stay there. Here, the speed decreases
as the query progresses. Let me put it this way: say the cache size was 0. I
would expect the delete to work very slowly, but near-linearly with the # of
rows being deleted or the progress of deleting them. Here the performance
dramatically and constantly decreases as the WAL file grows. You can
literally see how every M of throughput is slower than the previous :) I am
trying to find out why. For me it's very important for the migration
process.


Simon Slavin-3 wrote
> Have you tried these things ?  Did the time taken improve or get worse ?

Not yet, we will probably implement what you proposed (it's been suggested
before). But if I can find out the reason behind the nonlinear DELETE
behavior, it would still help greatly.




Re: Atomic DELETE index optimisation?

Dinu
In reply to this post by Simon Slavin-3
Simon Slavin-3 wrote
>     DELETE FROM MyTable

We have tried it and DELETE FROM table (without any WHERE) behaves exactly
the same! I reiterate there are no FKs or triggers defined. So this is no
fix...




Re: [EXTERNAL] Re: Atomic DELETE index optimisation?

Hick Gunter
In reply to this post by Dinu
I guess you might be running into the effect described at http://sqlite.org/wal.html :

" Very large write transactions. A checkpoint can only complete when no other transactions are running, which means the WAL file cannot be reset in the middle of a write transaction. So a large change to a large database might result in a large WAL file. The WAL file will be checkpointed once the write transaction completes (assuming there are no other readers blocking it) but in the meantime, the file can grow very big.

As of SQLite version 3.11.0 (2016-02-15), the WAL file for a single transaction should be proportional in size to the transaction itself. Pages that are changed by the transaction should only be written into the WAL file once. However, with older versions of SQLite, the same page might be written into the WAL file multiple times if the transaction grows larger than the page cache. "

Since your transaction deletes about 75% of your records, it is reasonable to assume that every single page will be modified. If you are running a version prior to 3.11.0, the same page may be written to the WAL file more than once after the page cache is overwhelmed by the transaction size. If you are running SQLite version 3.11.0 or newer, only pages that are changed for the first time in the transaction are copied to the WAL file, thus extending it.

While progressing the transaction, the rate of "newly changed" to "changed again" pages will shift towards re-reading and rewriting the WAL file copy of a page; this will tend to slow down the rate at which the WAL file is growing, even at a constant delete rate, until every page has been updated at least once, and then stop growing until all the remaining deletes have been processed.

Thus, your underlying notion that delete rate corresponds to WAL file growth rate is plain wrong. To measure delete rates, you would have to register an sqlite3_update_hook() function. I expect you would find that the delete rate increases as a function of decreasing record count, quite to the contrary of what you are "measuring".
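CPython's sqlite3 wrapper doesn't expose sqlite3_update_hook(), but Connection.total_changes wraps the underlying change counter; sampling it between batches (or from a progress handler) measures actual deletes independently of WAL growth. A minimal before/after sketch with an invented toy table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10000)])
conn.commit()

before = conn.total_changes  # rows inserted/updated/deleted so far
conn.execute("DELETE FROM t WHERE x % 2 = 0")
conn.commit()
deleted = conn.total_changes - before  # counts only the deletes
print(deleted)  # 5000
```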


-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:[hidden email]] Im Auftrag von Dinu
Gesendet: Montag, 18. Dezember 2017 14:06
An: [hidden email]
Betreff: [EXTERNAL] Re: [sqlite] Atomic DELETE index optimisation?

Simon Slavin-3 wrote
> Once you’ve bust the cache things slow down.

I do realize that. However, not illinearly. Once I bust the cache, throughput should drop X times and stay there. Here, the speed decreases with the progress. Let me put it this way: say the cache size was 0. I would expect the delete to work very slow, but near-linear with the # of rows being deleted or the progress of deleting them. Here the performance dramatically and constantly decreases with growing of the WAL file. You can literally see how every M of throughput is slower than the previous :) I am trying to find out why. For me it's very important for the migration process.


Simon Slavin-3 wrote
> Have you tried these things ?  Did the time taken improve or get worse ?

Not yet, we will probably implement what you proposed (it's been suggested before). But if I can find out the reason behind the nonlinear DELETE behavior, it would still help greatly.




Re: [EXTERNAL] Re: Atomic DELETE index optimisation?

Dinu
Hick Gunter wrote

> If you are running an SQLite version 3.11.0 or newer, only pages that are
> changed for the first time in the transaction are copied to the WAL file,
> thus extending it.
>
> While progressing the transaction, the rate of "newly changed" to "changed
> again" pages will shift towards re-reading and rewriting the WAL file copy
> of a page; this will tend to slow down the rate at which the WAL file is
> growing, even at a constant delete rate, until every page has been updated
> at least once, and then stop growing until all the remaining deletes have
> been processed.

Running 3.19; thanks for the explanation, this never occurred to me and makes
perfect sense.



