Query regression with virtual tables

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Query regression with virtual tables

Hick Gunter
On upgrading from 3.7.14 to 3.24 (at last) I found a very detrimental change in the query implementation for a certain type of query:

SELECT <...> FROM <virtual_table> WHERE a IN (<list a>) AND b IN (<list b>) AND c = <value c> AND timestamp between <beginning> AND <end> ORDER BY timestamp DESC LIMIT <pagesize>,<offset>;

In 3.7.14 the xBestIndex function was called with 3 constraints {(c,'='),(timestamp, '<='),(timestamp,'>=)}. The function sets argvIndex values and OMIT flags for all three constraints, an index number and the orderByConsumed flag.

The created bytecode is

- create & populate ephemeral table a
- create & populate ephemeral table b
- loop over virtual table
-- check field a value IN ephemeral table a
-- check field b value IN ehpemeral table b
-next

In 3.24 the xBestIndex fuction is called with two additional constraints {(a,'='),(b.'=')}. The function sets argvIndex values and OMIT flags for all five constraints, an index number and the orderByConsumed flag.

The created bytecode is

- create & populate ephemeral table a
- create & populate ephemeral table b
- loop over ephemeral table a
-- loop over ephemeral table b
--- loop over virtual table
- sort result set
- return LIMIT rows


The virtual table is actually a partitioned table that will search only the partitions that cover the selected range of timestamps and implements merge logic. It will therefore accept and pass on addtional constraints to the subquery against the partition members. Unfortunately, the NGQP seems to be asking about a join with ephemeral tables, which precludes using CROSS JOIN to force a performant query plan that returns correctly ordered result rows.

Of course I could rewrite this as a sequence of statements approximately like:

BEGIN;
CREATE TEMP TABLE a_values AS VALUES()...
CREATE TEMP TABLE b_values AS VALUES()...
SELECT <...> FROM <virtual_table> CROSS JOIN a_values ON (a) CROSS JOIN b_values ON (b) WHERE c = <value c> AND timestamp between <beginning> AND <end> ORDER BY timestamp DESC LIMIT <pagesize>,<offset>;
DROP TABLE a_values;
DROP TABLE b_values;
COMMIT;

But this results in two nested loops (even if an index is added on teach temp table). How can I get the ephemeral table lookups back?


___________________________________________
 Gunter Hick | Software Engineer | Scientific Games International GmbH | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] Query regression with virtual tables

Hick Gunter
In the meantime, I have found that restricting usable constraints to the fields present in the selected virtual index nearly restores the orginal query plan. Field c is now handled by SQlite at the outer level instead of in the query issued to member tables, forcing more records to be processed by the partition software.

The created bytecode is now

- create & populate ephemeral table a
- create & populate ephemeral table b
- loop over virtual table
-- check field a value IN ephemeral table a
-- check field b value IN ehpemeral table b
-- check field c value
 -next

-----Urspr√ľngliche Nachricht-----
Von: sqlite-users [mailto:[hidden email]] Im Auftrag von Hick Gunter
Gesendet: Montag, 12. November 2018 11:52
An: 'SQLite mailing list' <[hidden email]>
Betreff: [EXTERNAL] [sqlite] Query regression with virtual tables

On upgrading from 3.7.14 to 3.24 (at last) I found a very detrimental change in the query implementation for a certain type of query:

SELECT <...> FROM <virtual_table> WHERE a IN (<list a>) AND b IN (<list b>) AND c = <value c> AND timestamp between <beginning> AND <end> ORDER BY timestamp DESC LIMIT <pagesize>,<offset>;

In 3.7.14 the xBestIndex function was called with 3 constraints {(c,'='),(timestamp, '<='),(timestamp,'>=)}. The function sets argvIndex values and OMIT flags for all three constraints, an index number and the orderByConsumed flag.

The created bytecode is

- create & populate ephemeral table a
- create & populate ephemeral table b
- loop over virtual table
-- check field a value IN ephemeral table a
-- check field b value IN ehpemeral table b
 -next

In 3.24 the xBestIndex fuction is called with two additional constraints {(a,'='),(b.'=')}. The function sets argvIndex values and OMIT flags for all five constraints, an index number and the orderByConsumed flag.

The created bytecode is

- create & populate ephemeral table a
- create & populate ephemeral table b
- loop over ephemeral table a
-- loop over ephemeral table b
--- loop over virtual table
- sort result set
- return LIMIT rows


The virtual table is actually a partitioned table that will search only the partitions that cover the selected range of timestamps and implements merge logic. It will therefore accept and pass on addtional constraints to the subquery against the partition members. Unfortunately, the NGQP seems to be asking about a join with ephemeral tables, which precludes using CROSS JOIN to force a performant query plan that returns correctly ordered result rows.

Of course I could rewrite this as a sequence of statements approximately like:

BEGIN;
CREATE TEMP TABLE a_values AS VALUES()...
CREATE TEMP TABLE b_values AS VALUES()...
SELECT <...> FROM <virtual_table> CROSS JOIN a_values ON (a) CROSS JOIN b_values ON (b) WHERE c = <value c> AND timestamp between <beginning> AND <end> ORDER BY timestamp DESC LIMIT <pagesize>,<offset>; DROP TABLE a_values; DROP TABLE b_values; COMMIT;

But this results in two nested loops (even if an index is added on teach temp table). How can I get the ephemeral table lookups back?


___________________________________________
 Gunter Hick | Software Engineer | Scientific Games International GmbH | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick | Software Engineer | Scientific Games International GmbH | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users