Performance degradation in query planner in SQLite 3.14.2 (vs SQLite 3.10.2)

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Performance degradation in query planner in SQLite 3.14.2 (vs SQLite 3.10.2)

Paul
I've traced this issue down to the simplest test case:

CREATE TABLE IF NOT EXISTS foo
(
 id  INTEGER,
 baz INTEGER,
 PRIMARY KEY(id)
);

CREATE INDEX IF NOT EXISTS baz_foo_idx ON foo(baz, id);

CREATE TABLE IF NOT EXISTS bar
(
 foo INTEGER,
 PRIMARY KEY(foo),
 FOREIGN KEY(foo) REFERENCES foo(id) ON DELETE CASCADE
);

WITH RECURSIVE
 cnt(x, y) AS (VALUES(1, 1) UNION ALL SELECT x + 1, x + 1 FROM cnt WHERE x < 200000)
 INSERT INTO foo(id, baz) SELECT x, y FROM cnt;

WITH RECURSIVE
 cnt(x) AS (VALUES(1) UNION ALL SELECT x + 3 FROM cnt WHERE x < 50000)
 INSERT INTO bar SELECT x FROM cnt;

SELECT id FROM foo WHERE baz = 99999 AND id IN (SELECT foo FROM bar) LIMIT 0, 10;


This query takes too much time:  

 SELECT id FROM foo WHERE baz = 10000 AND id IN (SELECT foo FROM bar) LIMIT 1;


It seems like execution time is a function of baz:

sqlite> .timer on
sqlite> SELECT id FROM foo WHERE baz = 10000 AND id IN (SELECT foo FROM bar) LIMIT 1;
id        
----------
10000    
Run Time: real 14.839 user 14.836000 sys 0.000000
sqlite> SELECT id FROM foo WHERE baz = 1000 AND id IN (SELECT foo FROM bar) LIMIT 1;
id        
----------
1000      
Run Time: real 1.577 user 1.576000 sys 0.000000
sqlite> SELECT id FROM foo WHERE baz = 100 AND id IN (SELECT foo FROM bar) LIMIT 1;
id        
----------
100      
Run Time: real 0.232 user 0.232000 sys 0.000000
sqlite> SELECT id FROM foo WHERE baz = 10 AND id IN (SELECT foo FROM bar) LIMIT 1;
id        
----------
10        
Run Time: real 0.036 user 0.036000 sys 0.000000
sqlite> SELECT id FROM foo WHERE baz = 1 AND id IN (SELECT foo FROM bar) LIMIT 1;
id        
----------
1        
Run Time: real 0.001 user 0.000000 sys 0.000000
 
 
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Performance degradation in query planner in SQLite 3.14.2 (vs SQLite 3.10.2)

Paul

To add to that, EXPLAIN QUERY PLAN shows that covering index will be used:

sqlite> EXPLAIN QUERY PLAN SELECT id FROM foo WHERE baz = 10000 AND id IN (SELECT foo FROM bar) LIMIT 1;
selectid    order       from        detail                                                                        
----------  ----------  ----------  ------------------------------------------------------------------------------
0           0           0           SEARCH TABLE foo USING COVERING INDEX baz_foo_idx (baz=? AND id=? AND rowid=??)


It is not clear to me, what query algorithm is doing. It seems like it iterates through bar and for each row of bar it performs unindexed cross-search in the foo.

However, according to EXPLAIN, it should iterate over the baz_foo_idx index and perform indexed cross-searches in the bar.


 

> I've traced this issue down to the simplest test case:
>
> CREATE TABLE IF NOT EXISTS foo
> (
>  id  INTEGER,
>  baz INTEGER,
>  PRIMARY KEY(id)
> );
>
> CREATE INDEX IF NOT EXISTS baz_foo_idx ON foo(baz, id);
>
> CREATE TABLE IF NOT EXISTS bar
> (
>  foo INTEGER,
>  PRIMARY KEY(foo),
>  FOREIGN KEY(foo) REFERENCES foo(id) ON DELETE CASCADE
> );
>
> WITH RECURSIVE
>  cnt(x, y) AS (VALUES(1, 1) UNION ALL SELECT x + 1, x + 1 FROM cnt WHERE x < 200000)
>  INSERT INTO foo(id, baz) SELECT x, y FROM cnt;
>
> WITH RECURSIVE
>  cnt(x) AS (VALUES(1) UNION ALL SELECT x + 3 FROM cnt WHERE x < 50000)
>  INSERT INTO bar SELECT x FROM cnt;
>
> SELECT id FROM foo WHERE baz = 99999 AND id IN (SELECT foo FROM bar) LIMIT 0, 10;
>
>
> This query takes too much time:  
>
>  SELECT id FROM foo WHERE baz = 10000 AND id IN (SELECT foo FROM bar) LIMIT 1;
>
>
> It seems like execution time is a function of baz:
>
> sqlite> .timer on
> sqlite> SELECT id FROM foo WHERE baz = 10000 AND id IN (SELECT foo FROM bar) LIMIT 1;
> id        
> ----------
> 10000    
> Run Time: real 14.839 user 14.836000 sys 0.000000
> sqlite> SELECT id FROM foo WHERE baz = 1000 AND id IN (SELECT foo FROM bar) LIMIT 1;
> id        
> ----------
> 1000      
> Run Time: real 1.577 user 1.576000 sys 0.000000
> sqlite> SELECT id FROM foo WHERE baz = 100 AND id IN (SELECT foo FROM bar) LIMIT 1;
> id        
> ----------
> 100      
> Run Time: real 0.232 user 0.232000 sys 0.000000
> sqlite> SELECT id FROM foo WHERE baz = 10 AND id IN (SELECT foo FROM bar) LIMIT 1;
> id        
> ----------
> 10        
> Run Time: real 0.036 user 0.036000 sys 0.000000
> sqlite> SELECT id FROM foo WHERE baz = 1 AND id IN (SELECT foo FROM bar) LIMIT 1;
> id        
> ----------
> 1        
> Run Time: real 0.001 user 0.000000 sys 0.000000
>  
>  
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
 
 
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Performance degradation in query planner in SQLite 3.14.2 (vs SQLite 3.10.2)

Clemens Ladisch
In reply to this post by Paul
Paul wrote:

> I've traced this issue down to the simplest test case:
>
> CREATE TABLE IF NOT EXISTS foo
> (
>  id  INTEGER,
>  baz INTEGER,
>  PRIMARY KEY(id)
> );
>
> CREATE INDEX IF NOT EXISTS baz_foo_idx ON foo(baz, id);
>
> CREATE TABLE IF NOT EXISTS bar
> (
>  foo INTEGER,
>  PRIMARY KEY(foo),
>  FOREIGN KEY(foo) REFERENCES foo(id) ON DELETE CASCADE
> );
>
> WITH RECURSIVE
>  cnt(x, y) AS (VALUES(1, 1) UNION ALL SELECT x + 1, x + 1 FROM cnt WHERE x < 200000)
>  INSERT INTO foo(id, baz) SELECT x, y FROM cnt;
>
> WITH RECURSIVE
>  cnt(x) AS (VALUES(1) UNION ALL SELECT x + 3 FROM cnt WHERE x < 50000)
>  INSERT INTO bar SELECT x FROM cnt;
>
> This query takes too much time:
>
>  SELECT id FROM foo WHERE baz = 10000 AND id IN (SELECT foo FROM bar) LIMIT 1;

EXPLAIN SELECT id FROM foo WHERE baz = 10000 AND id IN (SELECT foo FROM bar) LIMIT 1;
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     24    0                    00  Start at 24
1     Integer        1     1     0                    00  r[1]=1; LIMIT counter
2     OpenRead       2     3     0     k(3,,,)        02  root=3 iDb=0; baz_foo_idx
3     Integer        10000  2     0                    00  r[2]=10000
4     Once           0     6     0                    00
5     OpenRead       3     4     0     1              00  root=4 iDb=0; bar
6     Rewind         3     22    0                    00
7       Rowid          3     3     0                    00  r[3]=rowid
8       IsNull         3     21    0                    00  if r[3]==NULL goto 21
9       Once           1     11    0                    00
10      OpenRead       4     4     0     1              00  root=4 iDb=0; bar
11      Rewind         4     21    0                    00
12        Rowid          4     4     0                    00  r[4]=rowid
13        IsNull         4     20    0                    00  if r[4]==NULL goto 20
14        SeekGE         2     20    2     3              00  key=r[2..4]
15          IdxGT          2     20    2     3              00  key=r[2..4]
16          IdxRowid       2     5     0                    00  r[5]=rowid
17          ResultRow      5     1     0                    00  output=r[5]
18          DecrJumpZero   1     22    0                    00  if (--r[1])==0 goto 22
19        Next           2     15    0                    00
20      NextIfOpen     4     12    0                    00
21    NextIfOpen     3     7     0                    00
22    Close          2     0     0                    00
23    Halt           0     0     0                    00
24    Transaction    0     0     3     0              01  usesStmtJournal=0
25    TableLock      0     2     0     foo            00  iDb=0 root=2 write=0
26    TableLock      0     4     0     bar            00  iDb=0 root=4 write=0
27    Goto           0     1     0                    00

If I've understood this correctly, it's the equivalent of this pseudocode:

  cursor c3 = scan bar
  for each row in c3:
      cursor c4 = scan bar
      for each row in c4:
          cursor c2 = search (10000, c3.rowid, c4.rowid) in baz_foo_idx
          for each row in c2:
              output c2.rowid
              stop

This looks like a bug.

As a workaround, drop "id" from the index (the rowid is always part of
the index anyway).


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Performance degradation in query planner in SQLite 3.14.2 (vs SQLite 3.10.2)

Paul

> Paul wrote:
> > I've traced this issue down to the simplest test case:
> >
> > CREATE TABLE IF NOT EXISTS foo
> > (
> >  id  INTEGER,
> >  baz INTEGER,
> >  PRIMARY KEY(id)
> > );
> >
> > CREATE INDEX IF NOT EXISTS baz_foo_idx ON foo(baz, id);
> >
> > CREATE TABLE IF NOT EXISTS bar
> > (
> >  foo INTEGER,
> >  PRIMARY KEY(foo),
> >  FOREIGN KEY(foo) REFERENCES foo(id) ON DELETE CASCADE
> > );
> >
> > WITH RECURSIVE
> >  cnt(x, y) AS (VALUES(1, 1) UNION ALL SELECT x + 1, x + 1 FROM cnt WHERE x < 200000)
> >  INSERT INTO foo(id, baz) SELECT x, y FROM cnt;
> >
> > WITH RECURSIVE
> >  cnt(x) AS (VALUES(1) UNION ALL SELECT x + 3 FROM cnt WHERE x < 50000)
> >  INSERT INTO bar SELECT x FROM cnt;
> >
> > This query takes too much time:
> >
> >  SELECT id FROM foo WHERE baz = 10000 AND id IN (SELECT foo FROM bar) LIMIT 1;
>
> EXPLAIN SELECT id FROM foo WHERE baz = 10000 AND id IN (SELECT foo FROM bar) LIMIT 1;
> addr  opcode         p1    p2    p3    p4             p5  comment
> ----  -------------  ----  ----  ----  -------------  --  -------------
> 0     Init           0     24    0                    00  Start at 24
> 1     Integer        1     1     0                    00  r[1]=1; LIMIT counter
> 2     OpenRead       2     3     0     k(3,,,)        02  root=3 iDb=0; baz_foo_idx
> 3     Integer        10000  2     0                    00  r[2]=10000
> 4     Once           0     6     0                    00
> 5     OpenRead       3     4     0     1              00  root=4 iDb=0; bar
> 6     Rewind         3     22    0                    00
> 7       Rowid          3     3     0                    00  r[3]=rowid
> 8       IsNull         3     21    0                    00  if r[3]==NULL goto 21
> 9       Once           1     11    0                    00
> 10      OpenRead       4     4     0     1              00  root=4 iDb=0; bar
> 11      Rewind         4     21    0                    00
> 12        Rowid          4     4     0                    00  r[4]=rowid
> 13        IsNull         4     20    0                    00  if r[4]==NULL goto 20
> 14        SeekGE         2     20    2     3              00  key=r[2..4]
> 15          IdxGT          2     20    2     3              00  key=r[2..4]
> 16          IdxRowid       2     5     0                    00  r[5]=rowid
> 17          ResultRow      5     1     0                    00  output=r[5]
> 18          DecrJumpZero   1     22    0                    00  if (--r[1])==0 goto 22
> 19        Next           2     15    0                    00
> 20      NextIfOpen     4     12    0                    00
> 21    NextIfOpen     3     7     0                    00
> 22    Close          2     0     0                    00
> 23    Halt           0     0     0                    00
> 24    Transaction    0     0     3     0              01  usesStmtJournal=0
> 25    TableLock      0     2     0     foo            00  iDb=0 root=2 write=0
> 26    TableLock      0     4     0     bar            00  iDb=0 root=4 write=0
> 27    Goto           0     1     0                    00
>
> If I've understood this correctly, it's the equivalent of this pseudocode:
>
>   cursor c3 = scan bar
>   for each row in c3:
>       cursor c4 = scan bar
>       for each row in c4:
>           cursor c2 = search (10000, c3.rowid, c4.rowid) in baz_foo_idx
>           for each row in c2:
>               output c2.rowid
>               stop
>
> This looks like a bug.
>
> As a workaround, drop "id" from the index (the rowid is always part of
> the index anyway).

Thank you for the information!

We'll stick with the old version for now, until the bug is fixed, since it's hard to change database structure since there are millions of copies. Also we use 'ORDER BY baz DESC, id DESC' and I'm not sure how will it work out in case of index on the single baz field.
 
 
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Performance degradation in query planner in SQLite 3.14.2 (vs SQLite 3.10.2)

Richard Hipp-3
In reply to this post by Clemens Ladisch
On 10/5/16, Clemens Ladisch <[hidden email]> wrote:
>               stop
>
> This looks like a bug.
>

I think it might be fixed on trunk.  I was just trying to bisect...

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Performance degradation in query planner in SQLite 3.14.2 (vs SQLite 3.10.2)

Richard Hipp-3
On 10/5/16, Richard Hipp <[hidden email]> wrote:
> On 10/5/16, Clemens Ladisch <[hidden email]> wrote:
>>               stop
>>
>> This looks like a bug.
>>
>
> I think it might be fixed on trunk.  I was just trying to bisect...

I think this may be a repeat of the problem described by ticket
https://sqlite.org/src/info/0eab1ac759 and fixed on 2016-09-16 by
check-in https://sqlite.org/src/info/a92aee5520cfaf85

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Performance degradation in query planner in SQLite 3.14.2 (vs SQLite 3.10.2)

Paul
Yes, fixed in pre-release snapshot 201610041220.


Thank you.

 

> On 10/5/16, Richard Hipp  wrote:
> > On 10/5/16, Clemens Ladisch  wrote:
> >>               stop
> >>
> >> This looks like a bug.
> >>
> >
> > I think it might be fixed on trunk.  I was just trying to bisect...
>
> I think this may be a repeat of the problem described by ticket
> https://sqlite.org/src/info/0eab1ac759 and fixed on 2016-09-16 by
> check-in https://sqlite.org/src/info/a92aee5520cfaf85
>
> --
> D. Richard Hipp
> [hidden email]
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
 
 
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users