Missing "pushDownWhereTerms" optimisation on recursive CTE

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Missing "pushDownWhereTerms" optimisation on recursive CTE

Adrián Medraño Calvo
Dear SQLite,

The following SQL script shows a query selecting data from a recursive CTE and filtering it.  I expected the optimizer to apply the filter to the recursive CTE directly, and indeed the documentation of pushDownWhereTerms (src/select.c:3833) indicates this possibility when various conditions are satisfied.  As far as I can see, the conditions are satisfied, but the query is nonetheless not optimized.  This indicates a misunderstanding on my part, or an oversight in SQLite.

-- A table containing some numbers.
CREATE TABLE t (v INT PRIMARY KEY);
INSERT INTO t
VALUES (0), (1), (2), (3), (4), (5);

-- Recursive query relating a number a sequence of numbers from "t" equal or
-- greater than it.
EXPLAIN QUERY PLAN
WITH RECURSIVE
eqgrseq(initial, next) AS (SELECT v, v
                            FROM   t
                            UNION
                            SELECT eqgrseq.initial, t.v
                            FROM   eqgrseq
                            JOIN   t
                            ON     (t.v = eqgrseq.next + 1))
SELECT eqgrseq.initial, eqgrseq.next
FROM   eqgrseq
WHERE  eqgrseq.initial = :initial;
-- selectid,order,from,detail
-- 2,0,0,"SCAN TABLE t"
-- 3,0,0,"SCAN TABLE eqgrseq"
-- 3,1,1,"SEARCH TABLE t USING COVERING INDEX sqlite_autoindex_t_1 (v=?)"
-- 1,0,0,"COMPOUND SUBQUERIES 0 AND 0 USING TEMP B-TREE (UNION)"
-- 0,0,0,"SCAN SUBQUERY 1”

-- The same query with the WHERE condition manually placed in the recursive CTE's
-- initial clause.
EXPLAIN QUERY PLAN
WITH RECURSIVE
eqgrseq(initial, next) AS (SELECT v, v
                            FROM   t
                            WHERE v = :initial
                            UNION
                            SELECT eqgrseq.initial, t.v
                            FROM   eqgrseq
                            JOIN   t
                            ON     (t.v = eqgrseq.next + 1))
SELECT eqgrseq.initial, eqgrseq.next
FROM   eqgrseq
WHERE  eqgrseq.initial = :initial;
-- selectid,order,from,detail
-- 2,0,0,"SEARCH TABLE t USING COVERING INDEX sqlite_autoindex_t_1 (v=?)"
-- 3,0,0,"SCAN TABLE eqgrseq"
-- 3,1,1,"SEARCH TABLE t USING COVERING INDEX sqlite_autoindex_t_1 (v=?)"
-- 1,0,0,"COMPOUND SUBQUERIES 0 AND 0 USING TEMP B-TREE (UNION)"
-- 0,0,0,"SCAN SUBQUERY 1”

Note the query plan difference: the first scans the “t” table, therefore recurses for every value, while the second only recurses for the filtered ones.

In our application, the recursive CTE is hidden behind a view in order to abstract over the details; manually inserting the WHERE clause would not be possible while maintaining the view, as far as I can see.

Thank you,
Adrián.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Missing "pushDownWhereTerms" optimisation on recursive CTE

petern
Some observations.  It seems the WHERE pushdown optimization you cited only
applies to subqueries with existing WHERE clause.  In your example without
WHERE, the SELECT specifies the whole table as the left hand side of the
UNION.  Scanning the whole table is likely more efficient than using an
index to retrieve every row.  Do you have a better example of the problem?


[Another suggestion in the form of a question:  Is the more efficient UNION
ALL completely ruled out because of duplicates?]

Peter





On Thu, Mar 1, 2018 at 2:37 AM, Adrián Medraño Calvo <[hidden email]> wrote:

> Dear SQLite,
>
> The following SQL script shows a query selecting data from a recursive CTE
> and filtering it.  I expected the optimizer to apply the filter to the
> recursive CTE directly, and indeed the documentation of pushDownWhereTerms
> (src/select.c:3833) indicates this possibility when various conditions are
> satisfied.  As far as I can see, the conditions are satisfied, but the
> query is nonetheless not optimized.  This indicates a misunderstanding on
> my part, or an oversight in SQLite.
>
> -- A table containing some numbers.
> CREATE TABLE t (v INT PRIMARY KEY);
> INSERT INTO t
> VALUES (0), (1), (2), (3), (4), (5);
>
> -- Recursive query relating a number a sequence of numbers from "t" equal
> or
> -- greater than it.
> EXPLAIN QUERY PLAN
> WITH RECURSIVE
> eqgrseq(initial, next) AS (SELECT v, v
>                             FROM   t
>                             UNION
>                             SELECT eqgrseq.initial, t.v
>                             FROM   eqgrseq
>                             JOIN   t
>                             ON     (t.v = eqgrseq.next + 1))
> SELECT eqgrseq.initial, eqgrseq.next
> FROM   eqgrseq
> WHERE  eqgrseq.initial = :initial;
> -- selectid,order,from,detail
> -- 2,0,0,"SCAN TABLE t"
> -- 3,0,0,"SCAN TABLE eqgrseq"
> -- 3,1,1,"SEARCH TABLE t USING COVERING INDEX sqlite_autoindex_t_1 (v=?)"
> -- 1,0,0,"COMPOUND SUBQUERIES 0 AND 0 USING TEMP B-TREE (UNION)"
> -- 0,0,0,"SCAN SUBQUERY 1”
>
> -- The same query with the WHERE condition manually placed in the
> recursive CTE's
> -- initial clause.
> EXPLAIN QUERY PLAN
> WITH RECURSIVE
> eqgrseq(initial, next) AS (SELECT v, v
>                             FROM   t
>                             WHERE v = :initial
>                             UNION
>                             SELECT eqgrseq.initial, t.v
>                             FROM   eqgrseq
>                             JOIN   t
>                             ON     (t.v = eqgrseq.next + 1))
> SELECT eqgrseq.initial, eqgrseq.next
> FROM   eqgrseq
> WHERE  eqgrseq.initial = :initial;
> -- selectid,order,from,detail
> -- 2,0,0,"SEARCH TABLE t USING COVERING INDEX sqlite_autoindex_t_1 (v=?)"
> -- 3,0,0,"SCAN TABLE eqgrseq"
> -- 3,1,1,"SEARCH TABLE t USING COVERING INDEX sqlite_autoindex_t_1 (v=?)"
> -- 1,0,0,"COMPOUND SUBQUERIES 0 AND 0 USING TEMP B-TREE (UNION)"
> -- 0,0,0,"SCAN SUBQUERY 1”
>
> Note the query plan difference: the first scans the “t” table, therefore
> recurses for every value, while the second only recurses for the filtered
> ones.
>
> In our application, the recursive CTE is hidden behind a view in order to
> abstract over the details; manually inserting the WHERE clause would not be
> possible while maintaining the view, as far as I can see.
>
> Thank you,
> Adrián.
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Missing "pushDownWhereTerms" optimisation on recursive CTE

Clemens Ladisch
In reply to this post by Adrián Medraño Calvo
Adrián Medraño Calvo wrote:
> The following SQL script shows a query selecting data from a recursive
> CTE and filtering it.  I expected the optimizer to apply the filter to
> the recursive CTE directly, and indeed the documentation of
> pushDownWhereTerms (src/select.c:3833) indicates this possibility when
> various conditions are satisfied.

Rule 22 of <http://www.sqlite.org/optoverview.html#flattening> forbids
subquery flattening in this case.  I suspect pushDownWhereTerms() is not
called at all.


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Missing "pushDownWhereTerms" optimisation on recursive CTE

E.Pasma

> Adrián Medraño Calvo wrote:
>> The following SQL script shows a query selecting data from a  
>> recursive
>> CTE and filtering it.  I expected the optimizer to apply the filter  
>> to
>> the recursive CTE directly, and indeed the documentation of
>> pushDownWhereTerms (src/select.c:3833) indicates this possibility  
>> when
>> various conditions are satisfied.
>
Clemens Ladisch wrote:

> Rule 22 of <http://www.sqlite.org/optoverview.html#flattening> forbids
> subquery flattening in this case.  I suspect pushDownWhereTerms() is  
> not
> called at all.
>

Hello, "push down where terms" into a complex view can sometimes be  
achieved by correlation. The view/CTE must then be wrapped in a new  
query that is joinable via indexes. Your example is just perfect to  
show the trick. E. Pasma.


.eqp on
WITH eqgrseq(initial, next) AS (
     SELECT push.v, pull.v
     FROM   t push, t pull
     WHERE  pull.v IN (
         WITH RECURSIVE r AS (
             SELECT push.v
             UNION ALL
             SELECT t.v
             FROM   r
             JOIN   t
             ON     t.v = r.v + 1)
         SELECT v FROM r))
SELECT initial, next
FROM   eqgrseq
WHERE  initial = 1; --:initial;

Output:
--EQP-- 0,0,0,SEARCH TABLE t AS push USING COVERING INDEX  
sqlite_autoindex_t_1 (v=?)
--EQP-- 0,1,1,SEARCH TABLE t AS pull USING COVERING INDEX  
sqlite_autoindex_t_1 (v=?)
--EQP-- 0,0,0,EXECUTE CORRELATED LIST SUBQUERY 1
--EQP-- 4,0,0,SCAN TABLE r
--EQP-- 4,1,1,SEARCH TABLE t USING COVERING INDEX sqlite_autoindex_t_1  
(v=?)
--EQP-- 2,0,0,COMPOUND SUBQUERIES 0 AND 0 (UNION ALL)
--EQP-- 1,0,0,SCAN SUBQUERY 2
1|1
1|2
1|3
1|4
1|5


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Missing "pushDownWhereTerms" optimisation on recursive CTE

Dan Kennedy-4
In reply to this post by Adrián Medraño Calvo
On 03/01/2018 05:37 PM, Adrián Medraño Calvo wrote:

> Dear SQLite,
>
> The following SQL script shows a query selecting data from a recursive CTE and filtering it.  I expected the optimizer to apply the filter to the recursive CTE directly, and indeed the documentation of pushDownWhereTerms (src/select.c:3833) indicates this possibility when various conditions are satisfied.  As far as I can see, the conditions are satisfied, but the query is nonetheless not optimized.  This indicates a misunderstanding on my part, or an oversight in SQLite.
>
> -- A table containing some numbers.
> CREATE TABLE t (v INT PRIMARY KEY);
> INSERT INTO t
> VALUES (0), (1), (2), (3), (4), (5);
>
> -- Recursive query relating a number a sequence of numbers from "t" equal or
> -- greater than it.
> EXPLAIN QUERY PLAN
> WITH RECURSIVE
> eqgrseq(initial, next) AS (SELECT v, v
>    FROM   t
>    UNION
>    SELECT eqgrseq.initial, t.v
>    FROM   eqgrseq
>    JOIN   t
>    ON     (t.v = eqgrseq.next + 1))
> SELECT eqgrseq.initial, eqgrseq.next
> FROM   eqgrseq
> WHERE  eqgrseq.initial = :initial;


It's quite a special case really. You can push the WHERE term down only
because it refers to a column that is always copied without modification
from the initial result set into any recursive results. You could not
push down a term like:

   WHERE eqgrseq.next = :next:

Dan.






> -- selectid,order,from,detail
> -- 2,0,0,"SCAN TABLE t"
> -- 3,0,0,"SCAN TABLE eqgrseq"
> -- 3,1,1,"SEARCH TABLE t USING COVERING INDEX sqlite_autoindex_t_1 (v=?)"
> -- 1,0,0,"COMPOUND SUBQUERIES 0 AND 0 USING TEMP B-TREE (UNION)"
> -- 0,0,0,"SCAN SUBQUERY 1”
>
> -- The same query with the WHERE condition manually placed in the recursive CTE's
> -- initial clause.
> EXPLAIN QUERY PLAN
> WITH RECURSIVE
> eqgrseq(initial, next) AS (SELECT v, v
>    FROM   t
>    WHERE v = :initial
>    UNION
>    SELECT eqgrseq.initial, t.v
>    FROM   eqgrseq
>    JOIN   t
>    ON     (t.v = eqgrseq.next + 1))
> SELECT eqgrseq.initial, eqgrseq.next
> FROM   eqgrseq
> WHERE  eqgrseq.initial = :initial;
> -- selectid,order,from,detail
> -- 2,0,0,"SEARCH TABLE t USING COVERING INDEX sqlite_autoindex_t_1 (v=?)"
> -- 3,0,0,"SCAN TABLE eqgrseq"
> -- 3,1,1,"SEARCH TABLE t USING COVERING INDEX sqlite_autoindex_t_1 (v=?)"
> -- 1,0,0,"COMPOUND SUBQUERIES 0 AND 0 USING TEMP B-TREE (UNION)"
> -- 0,0,0,"SCAN SUBQUERY 1”
>
> Note the query plan difference: the first scans the “t” table, therefore recurses for every value, while the second only recurses for the filtered ones.
>
> In our application, the recursive CTE is hidden behind a view in order to abstract over the details; manually inserting the WHERE clause would not be possible while maintaining the view, as far as I can see.
>
> Thank you,
> Adrián.
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Missing "pushDownWhereTerms" optimisation on recursive CTE

Adrián Medraño Calvo
In reply to this post by Clemens Ladisch
Hello Clemens,

thank you for your answer.

> On 2. Mar 2018, at 11:19, Clemens Ladisch <[hidden email]> wrote:
>
> Rule 22 of <http://www.sqlite.org/optoverview.html#flattening> forbids
> subquery flattening in this case.  I suspect pushDownWhereTerms() is not
> called at all.

Although this is definitely over my level of understanding of SQLite
optimization, I can affirm that `pushDownWhereTerms` is invoked exactly once
for the above query (checked with a debugger).  It aborts as soon as
it detects that the subquery is recursive, and rightly so (see Dan’s answer).

Thank you,
Adrián.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Missing "pushDownWhereTerms" optimisation on recursive CTE

Adrián Medraño Calvo
In reply to this post by E.Pasma
On 2. Mar 2018, at 15:55, E.Pasma <[hidden email]> wrote:

>
>
>> Adrián Medraño Calvo wrote:
>>> The following SQL script shows a query selecting data from a  
>>> recursive
>>> CTE and filtering it.  I expected the optimizer to apply the filter  
>>> to
>>> the recursive CTE directly, and indeed the documentation of
>>> pushDownWhereTerms (src/select.c:3833) indicates this possibility  
>>> when
>>> various conditions are satisfied.
>>
> Clemens Ladisch wrote:
>
>> Rule 22 of <http://www.sqlite.org/optoverview.html#flattening> forbids
>> subquery flattening in this case.  I suspect pushDownWhereTerms() is  
>> not
>> called at all.
>>
>
> Hello, "push down where terms" into a complex view can sometimes be  
> achieved by correlation. The view/CTE must then be wrapped in a new  
> query that is joinable via indexes. Your example is just perfect to  
> show the trick. E. Pasma.
>
>
> .eqp on
> WITH eqgrseq(initial, next) AS (
>     SELECT push.v, pull.v
>     FROM   t push, t pull
>     WHERE  pull.v IN (
>         WITH RECURSIVE r AS (
>             SELECT push.v
>             UNION ALL
>             SELECT t.v
>             FROM   r
>             JOIN   t
>             ON     t.v = r.v + 1)
>         SELECT v FROM r))
> SELECT initial, next
> FROM   eqgrseq
> WHERE  initial = 1; --:initial;
>
> Output:
> --EQP-- 0,0,0,SEARCH TABLE t AS push USING COVERING INDEX  
> sqlite_autoindex_t_1 (v=?)
> --EQP-- 0,1,1,SEARCH TABLE t AS pull USING COVERING INDEX  
> sqlite_autoindex_t_1 (v=?)
> --EQP-- 0,0,0,EXECUTE CORRELATED LIST SUBQUERY 1
> --EQP-- 4,0,0,SCAN TABLE r
> --EQP-- 4,1,1,SEARCH TABLE t USING COVERING INDEX sqlite_autoindex_t_1  
> (v=?)
> --EQP-- 2,0,0,COMPOUND SUBQUERIES 0 AND 0 (UNION ALL)
> --EQP-- 1,0,0,SCAN SUBQUERY 2
> 1|1
> 1|2
> 1|3
> 1|4
> 1|5

Dear E. Pasma,

wow, that is remarkable!

The way I interpret it, by wrapping it in this way we extract
the constant column out of the recursive CTE, where the WHERE
clause shall not be pushed (as I learned from Dan), into the
outer query, where it can be pushed.  The trade-off would be
that, due to it being a correlated subquery, the recursive
query would be rerun for each filtered value.  (I wonder
whether the performance is very different from what one gets
by manually inserting the WHERE clause in the base case of the
recursive CTE.)  I see that it could also lead to duplicate
results (that is, with queries different than the example);
I’d say that recursive CTEs using UNION should change to use
UNION ALL plus the DISTINCT keyword on the outer query.

Thank you for your answer.  Sincerely,
Adrián.

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Missing "pushDownWhereTerms" optimisation on recursive CTE

Adrián Medraño Calvo
In reply to this post by petern
Hello Peter,

thank you for your response.

>
> On 2. Mar 2018, at 06:30, petern <[hidden email]> wrote:
>
> Some observations.  It seems the WHERE pushdown optimization you cited only
> applies to subqueries with existing WHERE clause.  In your example without
> WHERE, the SELECT specifies the whole table as the left hand side of the
> UNION.  Scanning the whole table is likely more efficient than using an
> index to retrieve every row.  Do you have a better example of the problem?

I’m not sure that’s the case.  For example:

sqlite> .eqp on;
sqlite> WITH RECURSIVE
   ...> eqgrseq(initial, next) AS (SELECT v, v
   ...>                             FROM   t
   ...>                             WHERE 1 = 1
   ...>                             UNION
   ...>                             SELECT eqgrseq.initial, t.v
   ...>                             FROM   eqgrseq
   ...>                             JOIN   t
   ...>                             ON    (t.v = eqgrseq.next + 1))
   ...> SELECT eqgrseq.initial, eqgrseq.next
   ...> FROM   eqgrseq
   ...> WHERE  eqgrseq.initial = 1;
--EQP-- 2,0,0,SCAN TABLE t
--EQP-- 3,0,0,SCAN TABLE eqgrseq
--EQP-- 3,1,1,SEARCH TABLE t USING COVERING INDEX sqlite_autoindex_t_1 (v=?)
--EQP-- 1,0,0,COMPOUND SUBQUERIES 0 AND 0 USING TEMP B-TREE (UNION)
--EQP-- 0,0,0,SCAN SUBQUERY 1

The where expression is not pushed to the non-recursive case either.

> [Another suggestion in the form of a question:  Is the more efficient UNION
> ALL completely ruled out because of duplicates?]


You are right, it would make this query more performant without
changing its meaning.

Sincerely,
Adrián.

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Missing "pushDownWhereTerms" optimisation on recursive CTE

Adrián Medraño Calvo
In reply to this post by Dan Kennedy-4

> On 2. Mar 2018, at 21:39, Dan Kennedy <[hidden email]> wrote:
>
> On 03/01/2018 05:37 PM, Adrián Medraño Calvo wrote:
>> Dear SQLite,
>>
>> The following SQL script shows a query selecting data from a recursive CTE and filtering it.  I expected the optimizer to apply the filter to the recursive CTE directly, and indeed the documentation of pushDownWhereTerms (src/select.c:3833) indicates this possibility when various conditions are satisfied.  As far as I can see, the conditions are satisfied, but the query is nonetheless not optimized.  This indicates a misunderstanding on my part, or an oversight in SQLite.
>>
>> -- A table containing some numbers.
>> CREATE TABLE t (v INT PRIMARY KEY);
>> INSERT INTO t
>> VALUES (0), (1), (2), (3), (4), (5);
>>
>> -- Recursive query relating a number a sequence of numbers from "t" equal or
>> -- greater than it.
>> EXPLAIN QUERY PLAN
>> WITH RECURSIVE
>> eqgrseq(initial, next) AS (SELECT v, v
>>    FROM   t
>>    UNION
>>    SELECT eqgrseq.initial, t.v
>>    FROM   eqgrseq
>>    JOIN   t
>>    ON     (t.v = eqgrseq.next + 1))
>> SELECT eqgrseq.initial, eqgrseq.next
>> FROM   eqgrseq
>> WHERE  eqgrseq.initial = :initial;
>
>
> It's quite a special case really. You can push the WHERE term down only
> because it refers to a column that is always copied without modification
> from the initial result set into any recursive results. You could not
> push down a term like:
>
>  WHERE eqgrseq.next = :next:
>
> Dan.

Indeed… pushing the WHERE clause would absolutely be wrong, as it would
prevent generating part of the results.  I was blind to that, thank you
for pointing it out.

Best regards,
Adrián.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Missing "pushDownWhereTerms" optimisation on recursive CTE

E.Pasma
In reply to this post by Adrián Medraño Calvo
Hello Adrián, as you say

  (I wonder whether the performance is very different from what one  
gets by manually inserting the WHERE clause in the base case of the  
recursive CTE.)

I wonder too. Still the trick is meant to make a view (without  
manually inserted predicates inside)

Thanks for the reply. E. Pasma

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users