Query Planning Knowledge

Query Planning Knowledge

Andy Bennett
Hi,

I'm having some problems understanding what the query planner is doing and
how to convince it to work in the way I want.

Sorry that this is such a long and involved email. I've tried to describe
my problem and show the steps I have taken in debugging it. I have made
some effort to present a minimal and basic illustration of the problem.
Either I'm in error and will be glad to be corrected, or there are some
opportunities that the query planner is not taking. Hopefully this detail
helps someone who knows more than me get to the bottom of the issue!


My GUI monitor is reporting SQLite 3.8.11.1 and my app is using 3.17.0. I
get the same result with both.

I've included more complete details of my schema at the bottom of this
message but only introduced the bits I'm talking about as I go through.


I'm aiming to write a query that's good for feeding an iterator and for
pagination. i.e. I don't want a query that needs to compute the whole
result up-front, before returning the first row. I want something where I
can execute the query, consume a few rows for a while, close it and then
later start a new query with a parameter based on how far I got through and
then continue where I left off.

The structure of my dataset is such that once rows are written to the
database they are not changed, so there exist a set of parameters where I
can do this.


The data is an append-only log. The log consists of entries (in the entrys
table) and they have monotonically increasing "entry-number"s. The table
can store more than one log so there's a "log-id" as well. These entries
have some data of their own and each points to zero or more items which are
stored in the "items" table. Thus there's a join table "entry-items" which
consists of 3 columns, all parts of the primary key: "log-id",
"entry-number" and "item-id".

Each entry has a "key". As entries are added to the log, the "latest" entry
for a particular key describes the current state of that key. The set of
keys and their current state is called "records" and they only exist as a
query on the tables. It is this query that I'm trying to plan well.

The idea is for the app user to be able to ask for a cursor on the records
and then consume them a page at a time. I'm not particularly bothered which
order the keys come out in but it has to be deterministic so that I can
restart the query again. It's nice if they come out in "key" order but
"entry-number" order will do too. The main property I want is to be able to
pass this app-defined cursor over my API boundary without holding open a
read transaction on SQLite.
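
For illustration, the restart pattern I have in mind looks something like
this. It's a simplified stand-in for the records query further down, and
":lastkey" is a placeholder the app binds to wherever the previous page
stopped:

-----
-- Fetch one page of records, remember the last "key" seen, then re-run
-- the query with that key bound to :lastkey to continue where we left off.
SELECT "key", MAX("entry-number") AS "entry-number"
FROM "entrys"
WHERE "log-id" = 51 AND "key" > :lastkey
GROUP BY "key"
ORDER BY "key"
LIMIT 100;
-----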

I've started with a log that isn't huge but needs some attention to make it
work efficiently. This log has 54,272 entries and 42,737 records at the
latest version. I expect to have logs with a few tens of millions of active
records but probably nothing larger than 100 million. Not exactly Big Data
;-) but worthy of indexing.


So, here's my query, annotated with a few numbered points so that I can
talk about things. The bits that are commented out get commented in (and
out again!) as I discuss my problems with the query plan.

-----
explain query plan
SELECT
    "entrys"."log-id"       AS "log-id",
    "entrys"."entry-number" AS "entry-number",
    "entrys"."region"       AS "region",
    "entrys"."key"          AS "key",
    "entrys"."timestamp"    AS "timestamp",
    "entry-items"."item-id" AS "item-id",
    "items"."blob"          AS "blob"

FROM
(SELECT -- ** 1 **
    MAX("entry-number") AS "entry-number",
    "key"
    FROM "entrys"
    WHERE
    "log-id" = 51 AND
    "region" = "user" AND
    "entry-number" <= 54272
    -- AND key > "100009" -- ** 2 **
    -- AND key > "145943"
    GROUP BY "key"
    -- order by key -- ** 3 **
) AS "specific-entrys"

LEFT OUTER JOIN "entrys"
ON
"specific-entrys"."entry-number" = "entrys"."entry-number"
AND "specific-entrys"."key" = "entrys"."key" -- ** 4 **
AND "user" = "entrys"."region"

LEFT OUTER JOIN "entry-items"
ON
51 = "entry-items"."log-id" AND -- ** 5 **
"specific-entrys"."entry-number" = "entry-items"."entry-number"

LEFT OUTER JOIN "items"
ON
"items"."item-id" = "entry-items"."item-id"

WHERE
"entrys"."log-id" = 51
-- order by key -- ** 6 **
;
-----

Here's the query plan for the query above:

-----
1|0|0|SEARCH TABLE entrys USING COVERING INDEX
entrys-log-id-region-key-entry-number-desc (log-id=? AND region=?)
0|0|0|SCAN SUBQUERY 1 AS specific-entrys
0|1|1|SEARCH TABLE entrys USING INDEX
entrys-log-id-region-key-entry-number-desc (log-id=? AND region=? AND key=?
AND entry-number=?)
0|2|2|SEARCH TABLE entry-items USING COVERING INDEX
entry-items-log-id-entry-number-item-id (log-id=? AND entry-number=?)
0|3|3|SEARCH TABLE items USING INTEGER PRIMARY KEY (rowid=?)
-----


The query uses a subselect (** 1 **) to get the latest entry number for
each and every key (i.e. to get the list of records). On my test workload
this produces 42,737 rows and takes 49ms. The efficiency of this query is
not a problem yet: if it has to precompute all the keys up-front then I'm
fine with that for now.

The query then joins the result back onto the "entrys" table to get the
rest of the per-entry metadata, onto the "entry-items" join table and then
onto the "items" table to get one row per entry per item. In my app I
coalesce these rows back together again.

The non-EXPLAIN version of the query takes 679ms to run and produces 42,737
rows because none of the entries in this particular log have more than one
item each.

Originally the two lines at (** 5 **) said

-----
"entrys"."log-id" = "entry-items"."log-id" AND
"entrys"."entry-number" = "entry-items"."entry-number"
-----

instead of

-----
51 = "entry-items"."log-id" AND -- ** 5 **
"specific-entrys"."entry-number" = "entry-items"."entry-number"
-----

As far as I can tell these are semantically equivalent but the original
version takes ~170 seconds rather than 679ms and the query plan looks like
this:

-----
1|0|0|SEARCH TABLE entrys USING COVERING INDEX
entrys-log-id-region-key-entry-number-desc (log-id=? AND region=?)
0|0|0|SCAN SUBQUERY 1 AS specific-entrys
0|1|1|SEARCH TABLE entrys USING INDEX
entrys-log-id-region-key-entry-number-desc (log-id=? AND region=? AND key=?
AND entry-number=?)
0|2|2|SCAN TABLE entry-items
0|3|3|SEARCH TABLE items USING INTEGER PRIMARY KEY (rowid=?)
-----

i.e. it does a SCAN TABLE in the loop rather than using the COVERING INDEX.

Do later versions of SQLite manage to spot the constant-propagation
opportunity?


Adding the lines at (** 4 **) allows the planner to use another index.



The query seems to produce rows ordered by "key". If I want to be able to
consume a few rows from the result set and then restart later (perhaps by
using something like the clauses at (** 2 **)) then I need to guarantee
this property.
Adding the ORDER BY clause at (** 6 **) does this but it changes the query
plan thus:

-----
1|0|0|SEARCH TABLE entrys USING COVERING INDEX
entrys-log-id-region-key-entry-number-desc (log-id=? AND region=?)
0|0|0|SCAN SUBQUERY 1 AS specific-entrys
0|1|1|SEARCH TABLE entrys USING INDEX
entrys-log-id-region-key-entry-number-desc (log-id=? AND region=? AND key=?
AND entry-number=?)
0|2|2|SEARCH TABLE entry-items USING COVERING INDEX
entry-items-log-id-entry-number-item-id (log-id=? AND entry-number=?)
0|3|3|SEARCH TABLE items USING INTEGER PRIMARY KEY (rowid=?)
0|0|0|USE TEMP B-TREE FOR ORDER BY
-----

i.e. now we have a "USE TEMP B-TREE FOR ORDER BY" which (I imagine!) causes
the whole result set to be generated up front before the first row is
returned.

Now, without the order by clause at (** 6 **), we're getting the rows out
in the correct order as a side effect of the group by clause in (** 1 **),
so let's uncomment the order by clause at (** 3 **) instead.


This gets us back to the original, iterative query plan. It's got a
subquery that we scan in (sorted) order and then some inner loops which
should preserve the ordering of the result set, even if the inner loops
produce more rows.

However, we still want to guarantee this property so I uncommented the
order by clause at (** 6 **) again. Once again, the "USE TEMP B-TREE FOR
ORDER BY" line appears in the query plan.

I think I need this order by clause so that I get the correct behaviour in
my app, even if later versions of the engine change the query plan. Perhaps
I lose my performance properties but at least I still get the correct
answer!


How do I get SQLite to recognise that it's already done enough work with
this plan to get the results out in the order I specified, and therefore
elide the "USE TEMP B-TREE FOR ORDER BY" part of its plan?


Also, is there a way to find out which order SQLite thinks it is returning
the rows in (if any) so that, if one exists, I can construct the order by
clause that will cause no change to the query plan?




Thanks for any insight and help you can offer me.
I've been a long-time user of SQLite so thanks for all the great work and
documentation over the years!





Here's my schema:

-----
sqlite> .schema
CREATE TABLE IF NOT EXISTS "entry-items" ("log-id"  NOT NULL ,
"entry-number"  NOT NULL , "item-id"  NOT NULL );
CREATE TABLE IF NOT EXISTS "entrys" ("log-id" INTEGER NOT NULL ,
"entry-number" INTEGER NOT NULL , "region" TEXT NOT NULL , "key" TEXT NOT
NULL , "timestamp" INTEGER NOT NULL , PRIMARY KEY ("log-id",
"entry-number"));
CREATE TABLE IF NOT EXISTS "item-digests" ("item-id" INTEGER NOT NULL ,
"algorithm" TEXT NOT NULL , "digest" BLOB NOT NULL , PRIMARY KEY
("item-id", "algorithm"));
CREATE TABLE IF NOT EXISTS "items" ("item-id" INTEGER PRIMARY KEY  NOT NULL
 UNIQUE , "blob" BLOB NOT NULL  UNIQUE );
CREATE TABLE IF NOT EXISTS "registers" ("log-id" INTEGER PRIMARY KEY  NOT
NULL  UNIQUE , "index-of" INTEGER NOT NULL , "name" TEXT NOT NULL );
CREATE INDEX "entry-items-log-id-entry-number-item-id" ON "entry-items"
("log-id" ASC, "entry-number" ASC, "item-id" ASC);
CREATE UNIQUE INDEX "entrys-region-key-entry-number-log-id" ON "entrys" (
"region" ASC, "key" ASC, "entry-number" ASC, "log-id" ASC);
CREATE UNIQUE INDEX "item-digests-algorithm-digest" ON "item-digests"
("algorithm" ASC, "digest" ASC);
CREATE UNIQUE INDEX "registers-index-of-name" ON "registers" ("index-of"
ASC, "name" ASC);
CREATE UNIQUE INDEX "entrys-log-id-region-key-entry-number-desc" ON
"entrys" ( "log-id" ASC, "region" ASC, "key" ASC, "entry-number" DESC)
-----



Regards,
@ndy

--
[hidden email]
http://www.ashurst.eu.org/
0x7EBA75FF

Re: Query Planning Knowledge

David Raymond
Can't go into as much detail as you. But a couple comments.

"primary key unique" is redundant, and will actually create a redundant unique index.

"Do later version of SQLite manage to spot the constant propagation opportunity?"
This requires way more algebra and proof than I think you realize. "entrys" is on the right side of a LEFT OUTER JOIN, and therefore may be null when it comes time to do the next OUTER JOIN to "entry-items", so a direct replacement of an "equals" constraint isn't possible. And after that it again starts becoming algebra as to whether that null can affect things etc.

For the ordering, I recommend seeing if you can replace one of those "entrys" indexes so that they start with "key" as the first field in the index. That would at least give it more opportunity to use that index for the ordering rather than needing to order things after they're all collected. That, and explicitly stating the order by in _both_ the sub select and the final select, might make it notice "I can use this _ordered_ sub select as the outer table in the joins and get my overall ordering that way"... or it may not. Worth a try though.
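
An untested sketch of the kind of index I mean (the name and the order of
the trailing columns are just a guess at what might help):

-----
-- Leading with "key" gives the planner a chance to satisfy ORDER BY "key"
-- straight off the index instead of sorting afterwards.
CREATE INDEX "entrys-key-log-id-region-entry-number" ON "entrys"
("key" ASC, "log-id" ASC, "region" ASC, "entry-number" DESC);
-----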




Re: Query Planning Knowledge

Keith Medcalf
In reply to this post by Andy Bennett

Your description of the schema does not match the schema that you have shown.  

(As an aside, I wonder why you are using such confusing table and column names that require the puking of quotes all over the place?  You can avoid such ugliness by searching and replacing all "-" with nothingness (do not embed unlawful characters in object names) and replacing the puking quotes with nothingness, as they will no longer be required since you will no longer be using unlawful characters in object names.  Reformatting the queries and table definitions to readable format is trivial, but you might want to do it yourself so you can see what you are doing at a cursory glance.  There are special places where "obfuscation" is treasured; production code is generally not one of them -- especially if someone other than yourself will be exposed to it or perhaps have to maintain it sometime in the future.)


Getting on with the business at hand, however:

ONE

You claim the entry-numbers are monotonically increasing.  Are they monotonically increasing ONLY WITHIN something else, or is the entire set monotonically increasing?  Your schema indicates that the entry-numbers are monotonically increasing only in combination with the logid (that is, the same entry-number can appear in more than one log) and that therefore the entrynumber is not actually an entrynumber but is something else entirely, like an overloaded timestamp or such.

Please specify whether or not entry-number is a candidate key for your entrys table -- your prose indicates that it is, yet your schema says it is not.

Which one is correct?


TWO

Mutatis mutandis for the itemid in the items and itemdigests tables, wherein the declaration of itemid is inconsistent.  Does it identify an item or does it not?

Is it a candidate key within the itemdigests table or not?

Please explain.


THREE

What is the love of LEFT OUTER JOIN and why are you using it?  I know you may have heard of an outer join somewhere along the way, but can you please explain why you think the use of it is necessary in this case?


---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.


Re: Query Planning Knowledge

Andy Bennett
In reply to this post by David Raymond
Hi David,

Thanks for your thoughtful reply.


> Can't go into as much detail as you. But a couple comments.
>
> "primary key unique" is redundant, and will actually create a
> redundant unique index.

Are you referring to the CREATE TABLE clauses for the "items" and
"registers" tables?


I appear to have indexes named "sqlite_autoindex_items_1",
"sqlite_autoindex_items_2" (for the blob column) and
"sqlite_autoindex_registers_1".

If I run

CREATE TABLE "items2" ("item-id" INTEGER PRIMARY KEY  NOT NULL   , "blob"
BLOB NOT NULL  UNIQUE )

...then I just get "sqlite_autoindex_items2_1" for the blob column.


Thanks for that: it's a good find! I'll do the same for the "registers"
table as well.



> "Do later version of SQLite manage to spot the constant
> propagation opportunity?"
> This requires way more algebra and proof than I think you
> realize. "entrys" is on the right side of a LEFT OUTER JOIN, and
> therefore may be null when it comes time to do the next OUTER
> JOIN to "entry-items", so a direct replacement of an "equals"
> constraint isn't possible. And after that it again starts
> becoming algebra as to whether that null can affect things etc.

Yes, you are right. I could use the inner join for the "entrys" join and
the "items" join but not the "entry-items" join because each entry can have
more than one item.

Changing the "entrys" join to an INNER JOIN and adding both ORDER BY
clauses gets me a much better query plan:

-----
1|0|0|SEARCH TABLE entrys USING COVERING INDEX
entrys-log-id-region-key-entry-number (log-id=? AND region=?)
0|0|1|SEARCH TABLE entrys USING INDEX entrys-log-id-region-key-entry-number
(log-id=? AND region=?)
0|1|0|SEARCH SUBQUERY 1 AS specific-entrys USING AUTOMATIC COVERING INDEX
(key=?)
0|2|2|SEARCH TABLE entry-items USING COVERING INDEX
entry-items-log-id-entry-number-item-id (log-id=? AND entry-number=?)
0|3|3|SEARCH TABLE items USING INTEGER PRIMARY KEY (rowid=?)
-----

Changing the "items" join to an INNER JOIN doesn't get me anything more but
I'll do it anyway as the OUTER JOIN is not strictly needed.


Thanks for that!

I started with an OUTER JOIN as I find it easier to show that it's doing
the correct thing because I can search the output for errant NULLs. Trying
to detect missing rows in an INNER JOIN is harder. I then failed to realise
that an INNER JOIN would suffice.

It now even does the constant propagation of the log-id at (** 5 **) but
not the "entry-number" one!



> For the ordering, I recommend seeing if you can replace one of
> those "entrys" indexes so that they start with "key" as the
> first field in the index. That would at least give it more
> opportunity to use that index for the ordering rather than
> needing to order things after they're all collected. That and
> explicitly stating the order by in _both_ the sub select and the
> final might make it notice "I can use this _ordered_ sub select
> as the outer table in the joins and get my overall ordering that
> way"... or it may not. Worth a try though.

Yeah. I've had a fiddle around and can't get it to do it. However,
converting to an INNER JOIN as above seems to get it to work without
changing the indexes, though.

In the schema I have,

CREATE UNIQUE INDEX "entrys-log-id-region-key-entry-number" ON "entrys" (
"log-id" ASC, "region" ASC, "key" ASC, "entry-number" ASC)

...which seems to get used for the sub-query. It gives it its ordering and
the "key" part also gets used when I add the extra WHERE constraint on
"key" (** 2 **) so it seems to be having the effect you mention.

If I add an ORDER BY clause to the sub-select (** 3 **) the query plan
doesn't change so I guess it's happy to use that index to get the ordering.

If I then also add an ORDER BY to the main select it still uses a temporary
b-tree to confirm the sort. This is the main source of my confusion because
the query plan shows that the joins are all executed as inner loops, so the
ordering of the outer query should be preserved.



Thanks for your help David and thanks for solving the problem!




Regards,
@ndy

--
[hidden email]
http://www.ashurst.eu.org/
0x7EBA75FF

Re: Query Planning Knowledge

Clemens Ladisch
Andy Bennett wrote:
> I could use the inner join for the "entrys" join and the "items" join
> but not the "entry-items" join because each entry can have more than
> one item.

  WITH a(id, name) AS (VALUES (1, 'A')),
       b(id, name) AS (VALUES (1, 'B1'), (1, 'B2'))
  SELECT * FROM a INNER JOIN b USING (id);

  1|A|B1
  1|A|B2

The only difference between inner and outer joins is how rows without
any match are handled.

> I started with an OUTER JOIN as I find it easier to show that it's
> doing the correct thing because I can search the output for errant
> NULLs. Trying to detect missing rows in an INNER JOIN is harder.

If the join columns have the same name, using USING is easier.

And it would be a good idea to enforce the relationships between the
tables with foreign key constraints: <https://www.sqlite.org/foreignkeys.html>
(However, constraints do not affect how you have to write your queries.)
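
For example, a sketch of what the declarations might look like for your
"entry-items" table (column types omitted, as in your schema):

-----
-- The composite foreign key can reference the primary key of "entrys";
-- "item-id" references the INTEGER PRIMARY KEY of "items".
CREATE TABLE IF NOT EXISTS "entry-items" (
  "log-id"       NOT NULL,
  "entry-number" NOT NULL,
  "item-id"      NOT NULL,
  FOREIGN KEY ("log-id", "entry-number")
    REFERENCES "entrys" ("log-id", "entry-number"),
  FOREIGN KEY ("item-id") REFERENCES "items" ("item-id")
);
-----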


Regards,
Clemens

Re: Query Planning Knowledge

Andy Bennett
Hi,

>> I could use the inner join for the "entrys" join and the "items" join
>> but not the "entry-items" join because each entry can have more than
>> one item.
>
>   WITH a(id, name) AS (VALUES (1, 'A')),
>        b(id, name) AS (VALUES (1, 'B1'), (1, 'B2'))
>   SELECT * FROM a INNER JOIN b USING (id);
>
>   1|A|B1
>   1|A|B2
>
> The only difference between inner and outer joins is how rows without
> any match are handled.

Thanks Clemens! In my case entries can have zero items as well and I still
want the entry itself to show up.


>> I started with an OUTER JOIN as I find it easier to show that it's
>> doing the correct thing because I can search the output for errant
>> NULLs. Trying to detect missing rows in an INNER JOIN is harder.
>
> If the join columns have the same name, using USING is easier.
>
> And it would be a good idea to enforce the relationships between the
> tables with foreign key constraints:
> <https://www.sqlite.org/foreignkeys.html>
> (However, constraints do not affect how you have to write your queries.)

Ah yes. It might be worth looking at this. I've always avoided it in the
past because my experience with other engines taught me that it makes
experimenting at the monitor harder. Are there any efficiency benefits or
is it just there to enforce data integrity (very important, of course ;-))?

It looks like they have to be enabled on a per connection basis. In this
case I (currently) control all the client code but is it possible for the
foreign key relationships to get out of sync if one of the connections
omits to apply the pragma?


Thanks for the tips!




Regards,
@ndy

--
[hidden email]
http://www.ashurst.eu.org/
0x7EBA75FF

Re: Query Planning Knowledge

Clemens Ladisch
Andy Bennett wrote:
>> foreign key constraints
>
> my experience with other engines taught me that it makes experimenting at the monitor harder.

Then don't use them. :)  But do you actually want 'wrong' data?

> Are there any efficiency benefits or is it just there to enforce data integrity?

Constraints are just additional checks.
(FKs require certain indexes, but you would want to have those anyway.)

> It looks like they have to be enabled on a per connection basis. In this case I (currently)
> control all the client code but is it possible for the foreign key relationships to get out
> of sync if one of the connections omits to apply the pragma?

Yes.  You could run PRAGMA foreign_key_check afterwards.
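
For example:

-----
-- Enable enforcement for this connection, then verify the existing data;
-- foreign_key_check lists any rows that violate a foreign key constraint.
PRAGMA foreign_keys = ON;
PRAGMA foreign_key_check;
-----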


Regards,
Clemens