DELETE extremely slow


DELETE extremely slow

Thomas Kurz
I'm using a database with 5 hierarchically structured tables using foreign keys. The largest table contains about 230'000 entries. My problem is that deleting in this database is extremely slow:

pragma foreign_keys=on;
pragma journal_mode=wal;
.timer on
delete from dataset;
--> Run Time: real 197993.218 user 53015.593750 sys 54056.546875

I experimentally imported the same data into a MariaDB database and tried the same operation (without paying attention to creating any indexes, etc.). There it takes only a few seconds.

Is there something I can check or do to improve deletion speed?

Kind regards,
Thomas

Re: DELETE extremely slow

Dominique Devienne
On Thu, Oct 31, 2019 at 10:57 AM Thomas Kurz <[hidden email]> wrote:

> I'm using a database with 5 hierarchically structured tables using foreign
> keys. The largest table contains about 230'000 entries. My problem is that
> deleting in this database is extremely slow:
>
> pragma foreign_keys=on;
> pragma journal_mode=wal;
> .timer on
> delete from dataset;
> --> Run Time: real 197993.218 user 53015.593750 sys 54056.546875
>
> I experimentally imported the same data into a MariaDB database and tried
> the same operation there (without paying attention to creating any indexes,
> etc.). It takes only a few seconds there.
>
> Is there something I can check or do to improve deletion speed?
>

You're not describing the schema enough IMHO.
Is dataset the "top-most" table, containing the "parent" rows that all other
tables reference (directly or indirectly),
with all FKs having ON DELETE CASCADE?

If that's the case, without some kind of optimization in SQLite, deleting the
first parent row triggers a cascade of deletes in the "child" tables, looking
for rows that reference the parent row. So if your FKs are not indexed on
those column(s), that's a full table scan each time...
That's "depth first".

By analyzing the graph of FKs and their ON DELETE CASCADE state, and in the
specific case of fully deleting the "main parent table", SQLite could decide
to switch to a smarter "breadth first" delete, but I suspect it's not a
compelling enough use-case for Richard to invest time on this.

Try indexing your FKs, and see what happens. --DD
Re: DELETE extremely slow

Thomas Kurz
Yes, my apologies, I indeed forgot to attach the table definitions:

CREATE TABLE dataset (
    id         INTEGER  PRIMARY KEY AUTOINCREMENT
                        UNIQUE
                        NOT NULL,
    name       STRING   DEFAULT NULL
                        COLLATE NOCASE,
    is_latest  BOOLEAN  NOT NULL
                        DEFAULT 1,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE trace (
    id         INTEGER  PRIMARY KEY AUTOINCREMENT
                        UNIQUE
                        NOT NULL,
    name       STRING   DEFAULT NULL
                        COLLATE NOCASE,
    datasetid  INTEGER  REFERENCES dataset (id)
                        NOT NULL,
    quantityid INTEGER  REFERENCES quantity (id)
                        NOT NULL,
    stored     DATETIME DEFAULT NULL,
    created_at DATETIME NOT NULL
                        DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX trace_idx_01 ON trace (
    quantityid
);

CREATE INDEX trace_idx_01 ON trace (
    quantityid
);

CREATE TABLE item (
    id        INTEGER PRIMARY KEY AUTOINCREMENT
                      UNIQUE
                      NOT NULL,
    traceid   INTEGER REFERENCES trace (id)
                      NOT NULL,
    freq      BIGINT  NOT NULL,
    value     REAL    NOT NULL,
    noiseflag BOOLEAN DEFAULT NULL
);

CREATE INDEX item_idx_01 ON item (
    traceid
);

CREATE TABLE metadata (
    id        INTEGER PRIMARY KEY AUTOINCREMENT
                      UNIQUE
                      NOT NULL,
    parameter STRING  NOT NULL
                      COLLATE NOCASE,
    value     STRING  NOT NULL
                      COLLATE NOCASE,
    datasetid INTEGER DEFAULT NULL
                      REFERENCES dataset (id),
    traceid   INTEGER DEFAULT NULL
                      REFERENCES trace (id),
    itemid    INTEGER DEFAULT NULL
                      REFERENCES item (id)
);

CREATE INDEX metadata_idx_01 ON metadata (
    parameter,
    value,
    datasetid,
    traceid,
    itemid
);

CREATE INDEX metadata_idx_02 ON metadata (
    datasetid,
    traceid
);

CREATE INDEX metadata_idx_03 ON metadata (
    traceid
);

CREATE INDEX metadata_idx_04 ON metadata (
    datasetid,
    itemid
);

CREATE INDEX metadata_idx_05 ON metadata (
    traceid,
    itemid
);

CREATE INDEX metadata_idx_06 ON metadata (
    itemid
);

CREATE INDEX metadata_idx_07 ON metadata (
    datasetid,
    parameter
);

CREATE INDEX metadata_idx_08 ON metadata (
    traceid,
    parameter
);

CREATE INDEX metadata_idx_09 ON metadata (
    parameter,
    traceid
);

CREATE INDEX metadata_idx_10 ON metadata (
    parameter,
    datasetid,
    traceid,
    itemid
);

CREATE TABLE quantity (
    id         INTEGER             PRIMARY KEY AUTOINCREMENT
                                   UNIQUE
                                   NOT NULL,
    name       STRING              NOT NULL,
    unit       STRING              NOT NULL,
    sumrule    [SMALLINT UNSIGNED] DEFAULT NULL,
    created_at DATETIME            DEFAULT CURRENT_TIMESTAMP,
    UNIQUE (
        name,
        unit,
        sumrule
    )
    ON CONFLICT IGNORE
);






Re: DELETE extremely slow

Keith Medcalf

You have no index on trace(datasetid) ...
You have no index on metadata(datasetid), though the compound indexes in which datasetid is the first element *should* be sufficient.

Issuing .eqp on or .eqp full before the delete command will tell you what the plan is.
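Something like this (the index name is just illustrative):

CREATE INDEX trace_idx_02 ON trace (datasetid);

and then, in the shell:

.eqp full
DELETE FROM dataset;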

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.





Re: DELETE extremely slow

Simon Slavin-3
In reply to this post by Thomas Kurz
On 31 Oct 2019, at 9:51am, Thomas Kurz <[hidden email]> wrote:

> I'm using a database with 5 hierarchically structured tables using foreign keys. The largest table contains about 230'000 entries. My problem is that deleting in this database is extremely slow:

Keith found the answer: you don't have the indexes required to make your FOREIGN KEYs run quickly.  But I have a question:

If you DELETE FROM the child tables first, do you get fast or slow times ?

The way you're doing it involves a lot of steps, as SQLite works its way through the parent table, deletes one row from that, and cascades through the other tables, looking for and deleting related rows from those.

SQLite has an optimization for DELETE FROM without a WHERE clause.  But I don't know if it's smart enough to realise that if you do that to a parent table, you can also do it to all child tables.  Or perhaps it's not true and there might be child rows with no parent row, and SQLite is making sure it preserves them.
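For instance, working up the FK hierarchy from the bottom:

DELETE FROM metadata;   -- references dataset, trace and item
DELETE FROM item;       -- references trace
DELETE FROM trace;      -- references dataset and quantity
DELETE FROM dataset;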

Re: DELETE extremely slow

Jean-Luc Hainaut
In reply to this post by Thomas Kurz

Four observations that may (or may not) explain the problem:
- dataset.id declared "unique": useless, since the primary key is unique
by definition, but it may create an index (not checked). A PK is not
"more unique" if an additional "unique" constraint is declared.
- no "on delete <...>" on trace.datasetid, so the default is "no action". Is
that what you want?
- no index on this foreign key, hence a potential full scan to identify
child rows in "trace" (or the absence thereof); see the sketch after this list.
- index trace_idx_01 declared twice. Most often once is sufficient, even
for large tables.
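For the second and third points, a sketch of what trace could look like (keep
or drop the ON DELETE CASCADE depending on what you actually want; the index
name is just an example):

CREATE TABLE trace (
    id         INTEGER  PRIMARY KEY AUTOINCREMENT NOT NULL,
    name       STRING   DEFAULT NULL COLLATE NOCASE,
    datasetid  INTEGER  NOT NULL REFERENCES dataset (id) ON DELETE CASCADE,
    quantityid INTEGER  NOT NULL REFERENCES quantity (id),
    stored     DATETIME DEFAULT NULL,
    created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX trace_idx_02 ON trace (datasetid);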

Jean-Luc Hainaut



Re: DELETE extremely slow

Thomas Kurz
In reply to this post by Simon Slavin-3
> Keith found the answer: you don't have the indexes required to make your FOREIGN KEYs run quickly.

Thanks, I will try that.

> If you DELETE FROM the child tables first, do you get fast or slow times ?

Yes, I already tried deleting from each table individually. It's slow everywhere.

> The way you're doing it involves a lot of steps, as SQLite works its way through the parent table, deletes one row from that, and cascades through the other tables, looking for and deleting related rows from those.

Ok, I might have errors in my declarations, but SQLite seems to have problems as well, because MariaDB (without any explicit index definition!) handles the same deletion within seconds...


Re: DELETE extremely slow

Simon Slavin-3
On 31 Oct 2019, at 11:09am, Thomas Kurz <[hidden email]> wrote:

> Yes, I already tried deleting from each table individually. It's slow everywhere.

Something is wrong.  If you did multiple commands like

DELETE FROM MyTable;

to your child tables, they should be fast.  Have you run an integrity check ?

Re: DELETE extremely slow

Warren Young
In reply to this post by Thomas Kurz
On Oct 31, 2019, at 3:51 AM, Thomas Kurz <[hidden email]> wrote:
>
> delete from dataset;

Is that command representative of actual use, or are you deleting all rows just for the purpose of benchmarking?

I ask because if you’re going to just delete all rows in a table, it’s usually faster to DROP TABLE and then CREATE TABLE.
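Roughly (a sketch only; with foreign keys enabled you'd have to drop the child tables first, and you'd also have to recreate any indexes afterwards):

DROP TABLE dataset;
CREATE TABLE dataset (
    id         INTEGER  PRIMARY KEY AUTOINCREMENT UNIQUE NOT NULL,
    name       STRING   DEFAULT NULL COLLATE NOCASE,
    is_latest  BOOLEAN  NOT NULL DEFAULT 1,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);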


Re: DELETE extremely slow

Keith Medcalf
In reply to this post by Thomas Kurz

If you are deleting all the rows of a table, then you can simply truncate the table (SQLite will do this).  However, if the table you are deleting all the rows from has dependent (child) tables *and* foreign key enforcement is turned on, then the rows have to be deleted one at a time so that the foreign key constraints can be checked, although if all the child tables have no rows you should be able to just truncate the parent.  SQLite does not perform that optimization and, in the face of enforced foreign key constraints, will always delete the rows one by each even if the dependent (child) tables have no rows.
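If the goal really is to empty every table, one workaround (a sketch; only reasonable because all the tables are being emptied anyway) is to disable enforcement for the purge:

PRAGMA foreign_keys = OFF;   -- must be set outside a transaction
BEGIN;
DELETE FROM metadata;
DELETE FROM item;
DELETE FROM trace;
DELETE FROM dataset;
COMMIT;
PRAGMA foreign_keys = ON;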

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.


Re: DELETE extremely slow

David Raymond
In reply to this post by Jean-Luc Hainaut
"Four observations that may (or may not) explain the problem:
- dataset.id declared "unique": useless since the primary key is unique
by definition but it may create an index (not checked). A PK is not
"more unique" if an additional "unique" constraint is declared."

Declaring it as both "primary key" and "unique" makes an extra (duplicate) index, yes, with the added uniqueness checking on the duplicate index as well.



SQLite version 3.29.0 2019-07-10 17:32:03
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.

sqlite> create table foo1 (pk integer primary key);

sqlite> create table foo2(pk integer primary key unique);

sqlite> select * from sqlite_master;
type|name|tbl_name|rootpage|sql
table|foo1|foo1|2|CREATE TABLE foo1 (pk integer primary key)
table|foo2|foo2|3|CREATE TABLE foo2(pk integer primary key unique)
index|sqlite_autoindex_foo2_1|foo2|4|

sqlite> explain insert into foo1 values (?);
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     13    0                    00  Start at 13
1     OpenWrite      0     2     0     1              00  root=2 iDb=0; foo1
2     Variable       1     1     0                    00  r[1]=parameter(1,)
3     NotNull        1     5     0                    00  if r[1]!=NULL goto 5
4     NewRowid       0     1     0                    00  r[1]=rowid
5     MustBeInt      1     0     0                    00
6     SoftNull       2     0     0                    00  r[2]=NULL
7     Noop           0     0     0                    00  uniqueness check for ROWID
8     NotExists      0     10    1                    00  intkey=r[1]
9     Halt           1555  2     0     foo1.pk        02
10    MakeRecord     2     1     3     D              00  r[3]=mkrec(r[2])
11    Insert         0     3     1     foo1           31  intkey=r[1] data=r[3]
12    Halt           0     0     0                    00
13    Transaction    0     1     2     0              01  usesStmtJournal=0
14    Goto           0     1     0                    00

sqlite> explain insert into foo2 values (?);
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     22    0                    00  Start at 22
1     OpenWrite      0     3     0     1              00  root=3 iDb=0; foo2
2     OpenWrite      1     4     0     k(1,)          00  root=4 iDb=0; sqlite_autoindex_foo2_1
3     Variable       1     1     0                    00  r[1]=parameter(1,)
4     NotNull        1     6     0                    00  if r[1]!=NULL goto 6
5     NewRowid       0     1     0                    00  r[1]=rowid
6     MustBeInt      1     0     0                    00
7     SoftNull       2     0     0                    00  r[2]=NULL
8     Noop           0     0     0                    00  uniqueness check for ROWID
9     NotExists      0     11    1                    00  intkey=r[1]
10    Halt           1555  2     0     foo2.pk        02
11    Affinity       2     1     0     D              00  affinity(r[2])
12    Noop           0     0     0                    00  uniqueness check for sqlite_autoindex_foo2_1
13    SCopy          1     4     0                    00  r[4]=r[1]; pk
14    IntCopy        1     5     0                    00  r[5]=r[1]; rowid
15    MakeRecord     4     2     3                    00  r[3]=mkrec(r[4..5]); for sqlite_autoindex_foo2_1
16    NoConflict     1     18    4     1              00  key=r[4]
17    Halt           2067  2     0     foo2.pk        02
18    MakeRecord     2     1     6                    00  r[6]=mkrec(r[2])
19    IdxInsert      1     3     4     1              10  key=r[3]
20    Insert         0     6     1     foo2           31  intkey=r[1] data=r[6]
21    Halt           0     0     0                    00
22    Transaction    0     1     2     0              01  usesStmtJournal=0
23    Goto           0     1     0                    00

sqlite>

Re: DELETE extremely slow

Thomas Kurz
In reply to this post by Simon Slavin-3
> Something is wrong.  If you did multiple commands like
>
> DELETE FROM MyTable;
>
> to your child tables, they should be fast.  Have you run an integrity check ?

I have now created a new database and added the missing index "trace(datasetid)", as suggested by Keith.

The result of "DELETE FROM dataset" is now 88 minutes, which of course is better than before, when it took hours, but not nearly as quick as I'd expect...

@Warren:
> Is that command representative of actual use, or are you deleting all rows just for the purpose of benchmarking?

Usually I want to delete only several datasets, but not all. I left out the WHERE clause for simplicity here.

@Keith:
> and in the face of enforced foreign key constraints will always delete the rows one by each even if dependent (child) tables have no rows.

Yes, but I'd expect that MariaDB has to do the same, but takes clearly less than 1 minute instead of 88 minutes... :confused:


Re: DELETE extremely slow

David Raymond
It's not gonna account for 88 minutes, but out of curiosity is there a way to separately report the "delete and commit" time from the "dang this WAL needs to be checkpointed now" time?
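One rough way in the shell might be to run the checkpoint explicitly and let .timer report it on its own line:

.timer on
DELETE FROM dataset;               -- delete + commit time
PRAGMA wal_checkpoint(TRUNCATE);   -- checkpoint time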



Re: DELETE extremely slow

Simon Slavin-3
In reply to this post by Thomas Kurz
On 31 Oct 2019, at 3:09pm, Thomas Kurz <[hidden email]> wrote:

> The result of "DELETE FROM dataset" is now 88 minutes

That still seems wrong.  I hope the list can help you track down what's going on.

Re: DELETE extremely slow

Keith Medcalf
In reply to this post by Thomas Kurz
>pragma foreign_keys=on;
>pragma journal_mode=wal;
>.timer on
>delete from dataset;
>--> Run Time: real 197993.218 user 53015.593750 sys 54056.546875

Wow.  That is 14 hours each of System and User time, and then an additional 24 hours of "waiting for something to happen" time.  Do you have enough memory to run this in?  Have you increased the SQLite cache size?  Because that looks (to me) an awful lot like I/O thrashing ...

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.




Re: DELETE extremely slow

Dominique Devienne
In reply to this post by Thomas Kurz
On Thu, Oct 31, 2019 at 4:10 PM Thomas Kurz <[hidden email]> wrote:

> Yes, but I'd expect that MariaDB has to do the same, but takes clearly
> less than 1 minute instead of 88 minutes... :confused:
>

Are we comparing apples to oranges here?

SQLite runs on the local machine, while MariaDB is client-server, so the
delete effectively runs on the server.
Is your MariaDB server local to the same machine you're running SQLite on?
Are both using the same local disks? (as opposed to networked and/or
different disks)

Still can't account for 88 minutes though.
You haven't shown the "explain query plan" Keith asked for, which would
help understand what's going on here.
See https://www.sqlite.org/eqp.html

Re: DELETE extremely slow

Keith Medcalf

On Thursday, 31 October, 2019 10:01, Dominique Devienne <[hidden email]> wrote:

>On Thu, Oct 31, 2019 at 4:10 PM Thomas Kurz <[hidden email]> wrote:

>> Yes, but I'd expect that MariaDB has to do the same, but takes clearly
>> less than 1 minute instead of 88 minutes... :confused:

>Are we comparing apples to oranges here?

>SQLite runs on the local machine, while MariaDB is client-server, so the
>delete effectively runs on the server.
>Is your MariaDB server local to the same machine you're running SQLite
>on?
>Are both using the same local disks? (as opposed to networked and/or
>different disks)

>Still can't account for 88 minutes though.

Well, 88 minutes is a lot better than 56 hours, of which 24 hours was "waiting" time.  How much of the 88 minutes is "waiting" time?  If the elapsed time is much greater than the sum of user+sys time then I would suspect it is still I/O thrashing (or the machine is really busy doing something else -- those original timings represent only a 57% process dispatch rate, which is pretty low).

>You haven't shown the "explain query plan" Keith asked for, which would
>help understand what's going on here.

Well, having added the missing index I doubt that there is anything else to be found there.

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.





Re: DELETE extremely slow

Simon Slavin-3
On 31 Oct 2019, at 4:29pm, Keith Medcalf <[hidden email]> wrote:

> If the elapsed time is much greater than the sum of user+sys time then I would suspect it is still I/O thrashing (or the machine is really busy doing something else -- those original timings represent only a 57% process dispatch rate, which is pretty low).

I'm betting it's a rotating hard disk with a slow spin speed, and most of the time is spent waiting for the disk to be in the right place.

As Dominique Devienne spotted, there is now some doubt about the comparative figure.  Was the MariaDB filespace hosted on this same hard disk ?

The original post had

> The largest table contains about 230'000 entries.

Five tables and eleven indexes: sixteen B-trees in all.  Even if you think that all the required indexes for FOREIGN KEY lookup are now available, I think that 88 minutes is still longer than SQLite should take for anything but a slow disk.

Re: DELETE extremely slow

Keith Medcalf

On Thursday, 31 October, 2019 10:52, Simon Slavin <[hidden email]> wrote:

>On 31 Oct 2019, at 4:29pm, Keith Medcalf <[hidden email]> wrote:

>> If the elapsed time is much greater than the sum of user+sys time then
>> I would suspect it is still I/O thrashing (or the machine is really busy
>> doing something else -- those original timings represent only a 57%
>> process dispatch rate, which is pretty low).

> I'm betting it's a rotating hard disk with a slow spin speed, and most of
> the time is spent waiting for the disk to be in the right place.

> As Dominique Devienne spotted, there is now some doubt about the
> comparative figure.  Was the MariaDB filespace hosted on this same hard
> disk ?

While nice, that is irrelevant.  MariaDB is a client/server database so it probably uses a database cache that is considerably larger by default than the SQLite default.  The SQLite default cache is 2 MB which is 500 pages at the default page size of 4K.

>The original post had

>> The largest table contains about 230'000 entries.

> Five tables and eleven indexes.  Sixteen pages and sixteen page indexes.
> Even if you think that all the required indexes for FOREIGN KEY lookup
> are now available, I think that 88 minutes is still longer than SQLite
> should take for anything but a slow disk.

That is why I asked about the cache_size.  If the cache_size has not been increased from the default, then a large operation like that will be thrashing and spilling pages.  For I/O-intensive operations, the size of SQLite's page cache makes a huge difference in reducing I/O, far more than an adequate filesystem or block cache.
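For example (the actual value is a guess; pick something that fits in RAM):

PRAGMA cache_size;             -- default is -2000, i.e. about 2 MB
PRAGMA cache_size = -1048576;  -- request roughly 1 GiB of page cache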

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.




Re: DELETE extremely slow

Keith Medcalf
In reply to this post by Thomas Kurz

On Thursday, 31 October, 2019 03:51, Thomas Kurz <[hidden email]> wrote:

>I experimentally imported the same data into a MariaDB database and tried
>the same operation there (without paying attention to creating any
>indexes, etc.). It takes only a few seconds there.

According to the MariaDB reference manual, it does not "do anything" with references clauses on columns.  They are merely for entertainment purposes.  You have to use the table constraint syntax to declare enforceable foreign key constraints, which means you cannot use the same CREATE TABLE syntax for MariaDB as for SQLite3.

From https://mariadb.com/kb/en/library/create-table/

Note: MariaDB accepts the REFERENCES clause in ALTER TABLE and CREATE TABLE column definitions, but that syntax does nothing. MariaDB simply parses it without returning any error or warning, for compatibility with other DBMS's. Before MariaDB 10.2.1 this was also true for CHECK constraints. Only the syntax for indexes described below creates foreign keys.
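So to make the comparison apples-to-apples, the MariaDB tables would need the table-constraint form, e.g. (a sketch, trimmed to the relevant columns):

CREATE TABLE trace (
    id        INTEGER PRIMARY KEY AUTO_INCREMENT,
    datasetid INTEGER NOT NULL,
    FOREIGN KEY (datasetid) REFERENCES dataset (id)
) ENGINE=InnoDB;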

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.


