Index help...

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Index help...

Hamesh Shah
I need a little help with some strange indexing behaviour.

I have a table called detected.

i create a index for:
id integer, confidence ASC, timestamp ASC

Then when I query with a simple select from where with integer, then
confidence, then timestamp in order, for some reason the timestamp index
isn't used ?


SEARCH TABLE detected USING COVERING INDEX detected_model_id_confidence_ts
(model_id=? AND confidence>?)


I read the website, I tried it many times around and still no joy. I can't
see why it's not using the timestamp that is already ordered for my sql ts
> and ts < statement.



Python versions:

sqlite3.version 2.6.0 / python api version.

*sqlite3.sqlite_version 3.24.0*


table standalone:

CREATE TABLE detected ( id INTEGER PRIMARY KEY, model_id integer NOT NULL,
state_id integer NOT NULL, dataset_id integer NOT NULL, class_id integer
NOT NULL, confidence REAL NOT NULL, ts DATETIME NOT NULL, x0 INTEGER NOT
NULL, y0 INTEGER NOT NULL, x1 INTEGER NOT NULL, y1 INTEGER NOT NULL,
file_id INTEGER NOT NULL )

index creation:

CREATE INDEX `detected_model_id_confidence_ts` ON `detected` (
`model_id`,
`confidence` ASC,
`ts` ASC
);


I can't see the timestamp being used:

explain query plan
select distinct ts
from detected
where
model_id = 1
and
confidence > 0.8
and
ts >  '2018-10-10 01:25:25'
and
ts < '2018-10-23 08:10:17'
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Index help...

Marc L. Allen
I’m not the expert here, but it appears that the cause is that your looking for things greater than some confidence. This forces an index scan. There’s nothing that gives a list of different confidences greater than, in this case .8, but even if it did, an index scan might be faster than individual lookups.

The DB has no way of knowing that, after it reaches your high timestamp that it should skip to the next confidence value, since it doesn’t know what that next value is. So, it has to read all the index records, making the timestamp part of the index only used for the value.

As an example, perform your lookup use confidence = .8 and you may see ta being used.

> On Oct 23, 2018, at 6:48 PM, Hamesh Shah <[hidden email]> wrote:
>
> I need a little help with some strange indexing behaviour.
>
> I have a table called detected.
>
> i create a index for:
> id integer, confidence ASC, timestamp ASC
>
> Then when I query with a simple select from where with integer, then
> confidence, then timestamp in order, for some reason the timestamp index
> isn't used ?
>
>
> SEARCH TABLE detected USING COVERING INDEX detected_model_id_confidence_ts
> (model_id=? AND confidence>?)
>
>
> I read the website, I tried it many times around and still no joy. I can't
> see why it's not using the timestamp that is already ordered for my sql ts
>> and ts < statement.
>
>
>
> Python versions:
>
> sqlite3.version 2.6.0 / python api version.
>
> *sqlite3.sqlite_version 3.24.0*
>
>
> table standalone:
>
> CREATE TABLE detected ( id INTEGER PRIMARY KEY, model_id integer NOT NULL,
> state_id integer NOT NULL, dataset_id integer NOT NULL, class_id integer
> NOT NULL, confidence REAL NOT NULL, ts DATETIME NOT NULL, x0 INTEGER NOT
> NULL, y0 INTEGER NOT NULL, x1 INTEGER NOT NULL, y1 INTEGER NOT NULL,
> file_id INTEGER NOT NULL )
>
> index creation:
>
> CREATE INDEX `detected_model_id_confidence_ts` ON `detected` (
> `model_id`,
> `confidence` ASC,
> `ts` ASC
> );
>
>
> I can't see the timestamp being used:
>
> explain query plan
> select distinct ts
> from detected
> where
> model_id = 1
> and
> confidence > 0.8
> and
> ts >  '2018-10-10 01:25:25'
> and
> ts < '2018-10-23 08:10:17'
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Confidentiality notice: This e-mail is intended solely for use of the individual or entity to which it is addressed and may contain information that is proprietary, privileged, company confidential and/or exempt from disclosure under applicable law. If the reader is not the intended recipient or agent responsible for delivering the message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify the sender by reply e-mail or collect telephone call and delete or destroy all copies of this e-mail message, any physical copies made of this e-mail message and/or any file attachment(s).
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Index help...

Simon Slavin-3
In reply to this post by Hamesh Shah
On 23 Oct 2018, at 11:45pm, Hamesh Shah <[hidden email]> wrote:

> CREATE INDEX `detected_model_id_confidence_ts` ON `detected` (
> `model_id`,
> `confidence` ASC,
> `ts` ASC
> );

Create another index with the fields in this order

> `model_id` ASC,
> `ts` ASC,
> `confidence` ASC

and try it again.  By the way, avoid using those ` quotes (andh anything else like []) in any SQL command if you did not use them when you created the fields in the first place.  In this particular case they'll work, but it's a bad habit to get into if you ever expect to use a different SQL engine.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Index help...

Keith Medcalf
In reply to this post by Hamesh Shah

EXPLAIN QUERY PLAN shows the "High Level" outline of the plan for executing your query, primarily the constraints imposed on indexes, but not the WHERE conditions that are not used to constrain an index lookup.  

It does not show the "code" that is executed.  Use EXPLAIN rather than EXPLAIN QUERY PLAN if you want to see the actual code that is executed.

In the CLI, you can use the ".eqp" command to automagically preface the SQL with EXPLAIN QUERY PLAN (.oqp on) or both EXPLAIN QUERY PLAN and EXPLAIN (.eqp full), in addition to actually executing your SQL statement.  ".eqp off" turns off the automagical explain query plan.


SQLite version 3.26.0 2018-10-23 13:48:19
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE TABLE detected ( id INTEGER PRIMARY KEY, model_id integer NOT NULL,
   ...> state_id integer NOT NULL, dataset_id integer NOT NULL, class_id integer
   ...> NOT NULL, confidence REAL NOT NULL, ts DATETIME NOT NULL, x0 INTEGER NOT
   ...> NULL, y0 INTEGER NOT NULL, x1 INTEGER NOT NULL, y1 INTEGER NOT NULL,
   ...> file_id INTEGER NOT NULL )
   ...> ;
sqlite> CREATE INDEX `detected_model_id_confidence_ts` ON `detected` (
   ...> `model_id`,
   ...> `confidence` ASC,
   ...> `ts` ASC
   ...> );
sqlite> .eqp full
sqlite> select distinct ts
   ...> from detected
   ...> where
   ...> model_id = 1
   ...> and
   ...> confidence > 0.8
   ...> and
   ...> ts >  '2018-10-10 01:25:25'
   ...> and
   ...> ts < '2018-10-23 08:10:17'
   ...> ;
QUERY PLAN
|--SEARCH TABLE detected USING COVERING INDEX detected_model_id_confidence_ts (model_id=? AND confidence>?) (~2 rows)
`--USE TEMP B-TREE FOR DISTINCT
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     24    0                    00  Start at 24
1     OpenEphemeral  1     0     0     k(1,B)         08  nColumn=0
2     OpenRead       2     3     0     k(4,,,,)       00  root=3 iDb=0; detected_model_id_confidence_ts
3     Explain        3     0     0     SEARCH TABLE detected USING COVERING INDEX detected_model_id_confidence_ts (model_id=? AND confidence>?) (~2 rows)  00
4     Noop           0     0     0                    00  Begin WHERE-loop0: detected
5     CursorHint     2     0     0     AND(AND(AND(EQ(c0,1),GT(c1,expr)),GT(c2,'2018-10-10 01:25:25')),LT(c2,'2018-10-23 08:10:17'))  00
6     Integer        1     1     0                    00  r[1]=1
7     Real           0     2     0     0.8            00  r[2]=0.8
8     SeekGT         2     22    1     2              00  key=r[1..2]
9       IdxGT          2     22    1     1              00  key=r[1]
10      Column         2     2     3                    00  r[3]=detected.ts
11      Le             4     21    3     (BINARY)       53  if r[3]<=r[4] goto 21
12      Column         2     2     3                    00  r[3]=detected.ts
13      Ge             5     21    3     (BINARY)       53  if r[3]>=r[5] goto 21
14      Noop           0     0     0                    00  Begin WHERE-core
15      Column         2     2     6                    00  r[6]=detected.ts
16      Found          1     21    6     1              00  key=r[6]
17      MakeRecord     6     1     3                    00  r[3]=mkrec(r[6])
18      IdxInsert      1     3     6     1              10  key=r[3]
19      ResultRow      6     1     0                    00  output=r[6]
20      Noop           0     0     0                    00  End WHERE-core
21    Next           2     9     0                    00
22    Noop           0     0     0                    00  End WHERE-loop0: detected
23    Halt           0     0     0                    00
24    Transaction    0     0     2     0              01  usesStmtJournal=0
25    String8        0     4     0     2018-10-10 01:25:25  00  r[4]='2018-10-10 01:25:25'
26    String8        0     5     0     2018-10-23 08:10:17  00  r[5]='2018-10-23 08:10:17'
27    Goto           0     1     0                    00
sqlite>

As you can see, the constraints on ts *are* used, they simply are not used in the index lookup.  Now, if you "changed" the ordering of the index, then ts could be used to constrain the index and the confidence > 0.8 could not (that clause simply constraining the results).

sqlite> .eqp off
sqlite> CREATE INDEX `detected_model_id_ts_confidence` ON `detected` (
   ...> `model_id`,
   ...> `ts` ASC,
   ...> `confidence` ASC
   ...> );
sqlite> .eqp full
sqlite> select distinct ts
   ...> from detected
   ...> where
   ...> model_id = 1
   ...> and
   ...> confidence > 0.8
   ...> and
   ...> ts >  '2018-10-10 01:25:25'
   ...> and
   ...> ts < '2018-10-23 08:10:17'
   ...> ;
QUERY PLAN
`--SEARCH TABLE detected USING COVERING INDEX detected_model_id_ts_confidence (model_id=? AND ts>? AND ts<?) (~1 row)
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     25    0                    00  Start at 25
1     Null           1     6     0                    08  r[6]=NULL
2     OpenRead       2     4     0     k(4,,,,)       00  root=4 iDb=0; detected_model_id_ts_confidence
3     Explain        3     0     0     SEARCH TABLE detected USING COVERING INDEX detected_model_id_ts_confidence (model_id=? AND ts>? AND ts<?) (~1 row)  00
4     Noop           0     0     0                    00  Begin WHERE-loop0: detected
5     CursorHint     2     0     0     AND(AND(AND(EQ(c0,1),GT(c2,expr)),GT(c1,'2018-10-10 01:25:25')),LT(c1,'2018-10-23 08:10:17'))  00
6     Integer        1     1     0                    00  r[1]=1
7     String8        0     2     0     2018-10-10 01:25:25  00  r[2]='2018-10-10 01:25:25'
8     Affinity       2     1     0     C              00  affinity(r[2])
9     SeekGT         2     23    1     2              00  key=r[1..2]
10    String8        0     2     0     2018-10-23 08:10:17  00  r[2]='2018-10-23 08:10:17'
11    Affinity       2     1     0     C              00  affinity(r[2])
12      IdxGE          2     23    1     2              00  key=r[1..2]
13      Column         2     2     3                    00  r[3]=detected.confidence
14      RealAffinity   3     0     0                    00
15      Le             4     22    3     (BINARY)       55  if r[3]<=r[4] goto 22
16      Noop           0     0     0                    00  Begin WHERE-core
17      Column         2     1     5                    00  r[5]=detected.ts
18      Eq             5     22    6     (BINARY)       80  if r[6]==r[5] goto 22
19      Copy           5     6     0                    00  r[6]=r[5]
20      ResultRow      5     1     0                    00  output=r[5]
21      Noop           0     0     0                    00  End WHERE-core
22    Next           2     12    0                    00
23    Noop           0     0     0                    00  End WHERE-loop0: detected
24    Halt           0     0     0                    00
25    Transaction    0     0     3     0              01  usesStmtJournal=0
26    Real           0     4     0     0.8            00  r[4]=0.8
27    Goto           0     1     0                    00
sqlite>

Of course, the plan might change if there was actual data in the table and there were distribution statistics available.

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.


>-----Original Message-----
>From: sqlite-users [mailto:sqlite-users-
>[hidden email]] On Behalf Of Hamesh Shah
>Sent: Tuesday, 23 October, 2018 16:45
>To: [hidden email]
>Subject: [sqlite] Index help...
>
>I need a little help with some strange indexing behaviour.
>
>I have a table called detected.
>
>i create a index for:
>id integer, confidence ASC, timestamp ASC
>
>Then when I query with a simple select from where with integer, then
>confidence, then timestamp in order, for some reason the timestamp
>index
>isn't used ?
>
>
>SEARCH TABLE detected USING COVERING INDEX
>detected_model_id_confidence_ts
>(model_id=? AND confidence>?)
>
>
>I read the website, I tried it many times around and still no joy. I
>can't
>see why it's not using the timestamp that is already ordered for my
>sql ts
>> and ts < statement.
>
>
>
>Python versions:
>
>sqlite3.version 2.6.0 / python api version.
>
>*sqlite3.sqlite_version 3.24.0*
>
>
>table standalone:
>
>CREATE TABLE detected ( id INTEGER PRIMARY KEY, model_id integer NOT
>NULL,
>state_id integer NOT NULL, dataset_id integer NOT NULL, class_id
>integer
>NOT NULL, confidence REAL NOT NULL, ts DATETIME NOT NULL, x0 INTEGER
>NOT
>NULL, y0 INTEGER NOT NULL, x1 INTEGER NOT NULL, y1 INTEGER NOT NULL,
>file_id INTEGER NOT NULL )
>
>index creation:
>
>CREATE INDEX `detected_model_id_confidence_ts` ON `detected` (
>`model_id`,
>`confidence` ASC,
>`ts` ASC
>);
>
>
>I can't see the timestamp being used:
>
>explain query plan
>select distinct ts
>from detected
>where
>model_id = 1
>and
>confidence > 0.8
>and
>ts >  '2018-10-10 01:25:25'
>and
>ts < '2018-10-23 08:10:17'
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] Index help...

Hick Gunter
In reply to this post by Hamesh Shah
There is no datetime type in SQLite. You are storing ISO Text representations so you should declare the column as TEXT.

An Index is only useable for a prefix of equality contstraints followed by ONE inequality constraint.

From your index (model_id, confidence, ts)  and your query constraints ( '=', '>', 'BETWEEN') this means that model_id is useable (equality constraint) and confidence (inequality constraint>), which allows the Engine to locate the first record with model_id == 1 and confidence > 0.8 and partial scan the table until the model_id changes. Note that the ts field is not guaranteed to be ascending within this interval (a record with higher confidence but smaller timestamp may follow any given record in the scann portionof the table. Thus the ts constraint needs to be handled without the index.

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:[hidden email]] Im Auftrag von Hamesh Shah
Gesendet: Mittwoch, 24. Oktober 2018 00:45
An: [hidden email]
Betreff: [EXTERNAL] [sqlite] Index help...

I need a little help with some strange indexing behaviour.

I have a table called detected.

i create a index for:
id integer, confidence ASC, timestamp ASC

Then when I query with a simple select from where with integer, then confidence, then timestamp in order, for some reason the timestamp index isn't used ?


SEARCH TABLE detected USING COVERING INDEX detected_model_id_confidence_ts (model_id=? AND confidence>?)


I read the website, I tried it many times around and still no joy. I can't see why it's not using the timestamp that is already ordered for my sql ts
> and ts < statement.



Python versions:

sqlite3.version 2.6.0 / python api version.

*sqlite3.sqlite_version 3.24.0*


table standalone:

CREATE TABLE detected ( id INTEGER PRIMARY KEY, model_id integer NOT NULL, state_id integer NOT NULL, dataset_id integer NOT NULL, class_id integer NOT NULL, confidence REAL NOT NULL, ts DATETIME NOT NULL, x0 INTEGER NOT NULL, y0 INTEGER NOT NULL, x1 INTEGER NOT NULL, y1 INTEGER NOT NULL, file_id INTEGER NOT NULL )

index creation:

CREATE INDEX `detected_model_id_confidence_ts` ON `detected` ( `model_id`, `confidence` ASC, `ts` ASC );


I can't see the timestamp being used:

explain query plan
select distinct ts
from detected
where
model_id = 1
and
confidence > 0.8
and
ts >  '2018-10-10 01:25:25'
and
ts < '2018-10-23 08:10:17'
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick | Software Engineer | Scientific Games International GmbH | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users