How to optimise a somewhat-recursive query?


How to optimise a somewhat-recursive query?

Jonathan Moules-3
Hi List,
I have a query that's quite slow on a small sample of data and I'd like
to optimise it before larger datasets get in there. With just 1000 rows
in each table I'm currently seeing query times of ~0.2 seconds. My
attempts to optimise it so far by putting in carefully thought out
indexes have _increased_ the time to 0.4s! Hmmmm, so much for "carefully
thought out". So I've removed all the indexes again and am asking the
experts here for some guidance.

The reduced table structure is as below. It's for some web-crawling -
basically each URL will be stored once in the "urls" table. It also
keeps a reference to the original source url that led to this url
(that's source_seed_id - it's used in this query), and to the direct
parent (parent_id).
Then any time the application scrapes one of the URLs, an entry is
written into the lookups table.

The idea behind the query is to see whether during the last crawl the
application got to the last page in the paginated "chain" (it will have
a specific error code which the application is looking for). This means
it's somewhat recursive.
SQLite version in the IDE (where the timings are coming from): 3.15.0
SQLite version behind the application itself (not timed the query here,
but it's still slow): 3.22.0


CREATE TABLE urls (
     url_id         INTEGER  PRIMARY KEY AUTOINCREMENT,
     url            TEXT     UNIQUE,
     parent_id        INTEGER  REFERENCES urls (url_id),
     source_seed_id INTEGER  REFERENCES urls (url_id)
);


CREATE TABLE lookups (
     url_id               INTEGER  REFERENCES urls (url_id),
     retrieval_datetime   DATETIME,
     error_code           INTEGER,
     is_generic_flag      BOOLEAN  -- about one in 100 will have this flagged
);


-- This view is used as a convenience, including in the problem query below

CREATE VIEW v_most_recent_lookup_per_url AS
     SELECT url_id,
            MAX(retrieval_datetime) AS retrieval_datetime
       FROM lookups
      GROUP BY url_id
      ORDER BY url_id;

------------
The query itself:
------------

     SELECT
         u.url, l.error_code
     FROM
         urls u
     LEFT JOIN
         lookups l
         USING(url_id)
     LEFT JOIN
         v_most_recent_lookup_per_url recent
         -- This is the important bit
         -- Here we use the most recent lookup url_id to link to the source_seed_id, so we only find its children
         -- Or alternatively itself
         ON u.source_seed_id = recent.url_id
             OR u.url_id = recent.url_id
     WHERE
         l.is_generic_flag = 1
         AND
         -- Must be "or equal to" so we can get the lookup of the very
base url.
         l.retrieval_datetime >= recent.retrieval_datetime
         AND
         DATETIME(recent.retrieval_datetime) > DATETIME('now', '-5 days')
         AND
         recent.url_id in (
             select url_id from urls where url = 'example.com'
             )
     -- By definition URLs can only be inserted in pagination order, so we can order by url_id.
     ORDER BY
         u.url_id DESC
     LIMIT 1;
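For anyone who wants to reproduce this locally, here's a minimal harness (using Python's built-in sqlite3 module; the two sample rows are invented for illustration, not from my data) that runs the schema, view and query end-to-end:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE urls (
    url_id         INTEGER  PRIMARY KEY AUTOINCREMENT,
    url            TEXT     UNIQUE,
    parent_id      INTEGER  REFERENCES urls (url_id),
    source_seed_id INTEGER  REFERENCES urls (url_id)
);
CREATE TABLE lookups (
    url_id             INTEGER  REFERENCES urls (url_id),
    retrieval_datetime DATETIME,
    error_code         INTEGER,
    is_generic_flag    BOOLEAN
);
CREATE VIEW v_most_recent_lookup_per_url AS
    SELECT url_id, MAX(retrieval_datetime) AS retrieval_datetime
      FROM lookups GROUP BY url_id;
""")

# A seed URL and one paginated child discovered from it.
conn.execute("INSERT INTO urls (url, parent_id, source_seed_id) "
             "VALUES ('example.com', NULL, NULL)")
conn.execute("INSERT INTO urls (url, parent_id, source_seed_id) "
             "VALUES ('example.com?page=2', 1, 1)")
conn.execute("INSERT INTO lookups VALUES (1, DATETIME('now'), 200, 1)")
conn.execute("INSERT INTO lookups VALUES (2, DATETIME('now'), 200, 1)")

rows = conn.execute("""
    SELECT u.url, l.error_code
      FROM urls u
      LEFT JOIN lookups l USING (url_id)
      LEFT JOIN v_most_recent_lookup_per_url recent
             ON u.source_seed_id = recent.url_id
             OR u.url_id = recent.url_id
     WHERE l.is_generic_flag = 1
       AND l.retrieval_datetime >= recent.retrieval_datetime
       AND DATETIME(recent.retrieval_datetime) > DATETIME('now', '-5 days')
       AND recent.url_id IN (SELECT url_id FROM urls WHERE url = 'example.com')
     ORDER BY u.url_id DESC
     LIMIT 1
""").fetchall()
print(rows)  # → [('example.com?page=2', 200)]
```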



The current EXPLAIN QUERY PLAN:

1    0    0    SCAN TABLE lookups
1    0    0    USE TEMP B-TREE FOR GROUP BY
0    0    0    SCAN TABLE urls AS u
0    1    1    SEARCH TABLE lookups AS l USING AUTOMATIC COVERING INDEX (is_generic_flag=? AND url_id=?)
0    2    2    SCAN SUBQUERY 1 AS recent
0    0    0    EXECUTE LIST SUBQUERY 2
2    0    0    SEARCH TABLE urls USING COVERING INDEX sqlite_autoindex_urls_1 (url=?)


And the EXPLAIN:


0    Init    0    118    0        00
1    Integer    50    1    0        00
2    Once    0    50    0        00
3    Noop    5    4    0        00
4    OpenEphemeral    2    2    0        00
5    SorterOpen    6    2    0    k(1,B)    00
6    Integer    0    6    0        00
7    Integer    0    5    0        00
8    Null    0    9    9        00
9    Gosub    8    48    0        00
10    OpenRead    3    7    0    3    00
11    Rewind    3    17    0        00
12    Column    3    0    11        00
13    Column    3    2    12        00
14    MakeRecord    11    2    13        00
15    SorterInsert    6    13    0        00
16    Next    3    12    0        01
17    Close    3    0    0        00
18    OpenPseudo    7    13    2        00
19    SorterSort    6    50    0        00
20    SorterData    6    13    7        00
21    Column    7    0    10        00
22    Compare    9    10    1    k(1,B)    00
23    Jump    24    28    24        00
24    Move    10    9    1        00
25    Gosub    7    39    0        00
26    IfPos    6    50    0        00
27    Gosub    8    48    0        00
28    Column    7    1    14        00
29    CollSeq    15    0    0    (BINARY)    00
30    AggStep0    0    14    3    max(1)    01
31    If    15    33    0        00
32    Column    7    0    2        00
33    Integer    1    5    0        00
34    SorterNext    6    20    0        00
35    Gosub    7    39    0        00
36    Goto    0    50    0        00
37    Integer    1    6    0        00
38    Return    7    0    0        00
39    IfPos    5    41    0        00
40    Return    7    0    0        00
41    AggFinal    3    1    0    max(1)    00
42    SCopy    2    16    0        00
43    SCopy    3    17    0        00
44    MakeRecord    16    2    14        00
45    NewRowid    2    18    0        00
46    Insert    2    14    18        08
47    Return    7    0    0        00
48    Null    0    2    4        00
49    Return    8    0    0        00
50    Return    1    0    0        00
51    Noop    8    4    0        00
52    Integer    1    19    0        00
53    OpenRead    0    8    0    5    00
54    OpenRead    1    7    0    7    00
55    Last    0    116    0        00
56    Once    0    67    0        00
57    OpenAutoindex    9    5    0    k(5,B,B,,,)    00
58    Rewind    1    67    0        00
59    Column    1    6    21        00
60    Column    1    0    22        00
61    Column    1    2    23        00
62    Column    1    5    24        00
63    Rowid    1    25    0        00
64    MakeRecord    21    5    20        00
65    IdxInsert    9    20    0        10
66    Next    1    59    0        03
67    Integer    0    26    0        00
68    Integer    1    27    0        00
69    Rowid    0    28    0        00
70    SeekGE    9    112    27    2    00
71    IdxGT    9    112    27    2    00
72    Integer    1    26    0        00
73    Column    9    0    20        00
74    Ne    29    111    20    (BINARY)    53
75    Integer    0    30    0        00
76    Rewind    2    108    0        00
77    Column    0    4    31        00
78    Column    2    0    32        00
79    Eq    32    82    31    (BINARY)    43
80    Rowid    0    33    0        00
81    Ne    32    107    33    (BINARY)    53
82    Integer    1    30    0        00
83    Column    9    2    31        00
84    Column    2    1    20        00
85    Lt    20    107    31    (BINARY)    53
86    Copy    20    34    0        00
87    Function0    0    34    33    datetime(-1)    01
88    Le    35    107    33        51
89    Once    0    99    0        00
90    OpenEphemeral    11    1    0    k(1,B)    00
91    OpenRead    12    9    0    k(2,,)    02
92    String8    0    36    0    example.com    00
93    SeekGE    12    98    36    1    00
94    IdxGT    12    98    36    1    00
95    IdxRowid    12    37    0        00
96    MakeRecord    37    1    31    C    00
97    IdxInsert    11    31    0        00
98    Close    12    0    0        00
99    Column    2    0    31        00
100    IsNull    31    107    0        00
101    Affinity    31    1    0    C    00
102    NotFound    11    107    31    1    00
103    Column    0    1    38        00
104    Column    9    3    39        00
105    ResultRow    38    2    0        00
106    DecrJumpZero    19    116    0        00
107    Next    2    77    0        01
108    IfPos    30    111    0        00
109    NullRow    2    0    0        00
110    Goto    0    82    0        00
111    Next    9    71    0        00
112    IfPos    26    115    0        00
113    NullRow    9    0    0        00
114    Goto    0    72    0        00
115    Prev    0    56    0        01
116    Close    0    0    0        00
117    Halt    0    0    0        00
118    Transaction    0    0    338    1    01
119    TableLock    0    7    0    lookups    00
120    TableLock    0    8    0    urls    00
121    Integer    1    29    0        00
122    String8    0    40    0    now    00
123    String8    0    41    0    -5 days    00
124    Function0    3    40    35    datetime(-1)    02
125    Goto    0    1    0        00


Any thoughts/suggestions/feedback welcome.
Many thanks,
Jonathan

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: How to optimise a somewhat-recursive query?

Clemens Ladisch
I have not looked at the schema and queries in detail.
But at a first glance:

> CREATE VIEW v_most_recent_lookup_per_url AS
>      ...
>      ORDER BY url_id;

Drop the ORDER BY; it is useless in a view used in another query,
and just might slow things down.

> 0    1    1    SEARCH TABLE lookups AS l USING AUTOMATIC COVERING INDEX (is_generic_flag=? AND url_id=?)

If SQLite thinks that an AUTOMATIC index is useful, you should consider
creating it explicitly:
  CREATE INDEX lookup_generid_id_idx ON lookups(is_generic_flag, url_id);
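You can confirm that the explicit index replaces the automatic one by comparing EXPLAIN QUERY PLAN output before and after creating it. A quick standalone sketch (a cut-down lookups table, not the full schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lookups (url_id INTEGER, retrieval_datetime TEXT,"
    " error_code INTEGER, is_generic_flag BOOLEAN)"
)
conn.execute(
    "CREATE INDEX lookup_generid_id_idx ON lookups(is_generic_flag, url_id)"
)

# With the explicit index present, the planner reports a SEARCH using it
# instead of building an AUTOMATIC COVERING INDEX on every statement.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT error_code FROM lookups WHERE is_generic_flag = 1 AND url_id = 42"
).fetchall()
assert any("lookup_generid_id_idx" in row[-1] for row in plan)
```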


Regards,
Clemens

Re: How to optimise a somewhat-recursive query?

Tim Streater-3
In reply to this post by Jonathan Moules-3
On 18 Mar 2018, at 21:48, Jonathan Moules <[hidden email]> wrote:

> CREATE TABLE lookups (
>      url_id               INTEGER  REFERENCES urls (url_id),
>      retrieval_datetime   DATETIME,
>      error_code           INTEGER,
>      is_generic_flag      BOOLEAN  -- about one in 100 will have this
> flagged
> );

SQLite has no DATETIME or BOOLEAN datatype, see:

   <http://www.sqlite.org/datatype3.html>

Given this, I don't know what your attempts to compare date/times will do. Personally I store all date/times as seconds since the epoch; it simplifies life a lot.
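To illustrate the epoch approach (a sketch with a hypothetical "retrieval_epoch" column, not the poster's schema): SQLite's strftime('%s', ...) converts any of its time formats to seconds since the epoch, after which date arithmetic is plain integer comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "retrieval_epoch" is a hypothetical column name for this sketch.
conn.execute("CREATE TABLE lookups (url_id INTEGER, retrieval_epoch INTEGER)")
# strftime('%s', ...) converts any SQLite time value to epoch seconds.
conn.execute(
    "INSERT INTO lookups VALUES (1, CAST(strftime('%s', 'now') AS INTEGER))"
)
# "Within the last 5 days" is then an integer comparison - no string
# parsing at query time.
n = conn.execute(
    "SELECT COUNT(*) FROM lookups"
    " WHERE retrieval_epoch > CAST(strftime('%s', 'now', '-5 days') AS INTEGER)"
).fetchone()[0]
print(n)  # → 1
```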



--
Cheers  --  Tim

Re: How to optimise a somewhat-recursive query?

Quan Yong Zhai
In reply to this post by Clemens Ladisch
Drop the view, add a new column "last_retrieval_datetime" to the table urls, and whenever you fetch a page, update this column.
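One way to keep such a column current automatically is a trigger on lookups. A sketch under that assumption (the trigger name and column name are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE urls (
    url_id                  INTEGER PRIMARY KEY,
    url                     TEXT UNIQUE,
    last_retrieval_datetime TEXT    -- the suggested denormalised column
);
CREATE TABLE lookups (
    url_id             INTEGER REFERENCES urls (url_id),
    retrieval_datetime TEXT
);
-- Keep urls.last_retrieval_datetime current on every recorded fetch.
CREATE TRIGGER lookups_touch_url AFTER INSERT ON lookups
BEGIN
    UPDATE urls
       SET last_retrieval_datetime = NEW.retrieval_datetime
     WHERE url_id = NEW.url_id
       AND (last_retrieval_datetime IS NULL
            OR NEW.retrieval_datetime > last_retrieval_datetime);
END;
""")

conn.execute("INSERT INTO urls (url_id, url) VALUES (1, 'example.com')")
conn.execute("INSERT INTO lookups VALUES (1, '2018-03-18 10:00:00')")
conn.execute("INSERT INTO lookups VALUES (1, '2018-03-17 09:00:00')")  # older entry, late

latest = conn.execute(
    "SELECT last_retrieval_datetime FROM urls WHERE url_id = 1"
).fetchone()[0]
print(latest)  # → 2018-03-18 10:00:00
```

The guard in the trigger's WHERE clause means an out-of-order (older) lookup never winds the column backwards.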

Sent from my iPhone

> On 19 Mar 2018, at 06:15, Clemens Ladisch <[hidden email]> wrote:
>
> I have not looked at the schema and queries in detail.
> But at a first glance:
>
>> CREATE VIEW v_most_recent_lookup_per_url AS
>>     ...
>>     ORDER BY url_id;
>
> Drop the ORDER BY; it is useless in a view used in another query,
> and just might slow things down.
>
>> 0    1    1    SEARCH TABLE lookups AS l USING AUTOMATIC COVERING INDEX (is_generic_flag=? AND url_id=?)
>
> If SQLite thinks that an AUTOMATIC index is useful, you should consider
> creating it explicitly:
>  CREATE INDEX lookup_generid_id_idx ON lookups(is_generic_flag, url_id);
>
>
> Regards,
> Clemens

Re: How to optimise a somewhat-recursive query?

Jonathan Moules-3
In reply to this post by Tim Streater-3
Hi Both,
Thanks for your thoughts.

 > SQLite has no DATETIME or BOOLEAN datatype

I'm aware that DATETIME (and BOOLEAN) isn't an actual datatype in
SQLite, but it functions exactly as you'd expect - I've used them
extensively in the past and never had a problem. I'm actually storing
ISO8601 strings, so here DATETIME is simply a synonym for TEXT - I find
having human-readable datetimes infinitely more practical for
development purposes. It's one of the things I most like about SQLite -
datetimes are ridiculously easy compared to "proper" databases.

 > Drop the ORDER BY; it is useless in a view used in another query,
I figured sqlite would optimise it out if it didn't think it needed it
(it's needed by some of the other things that use that view). A quick
test seems to confirm it. It seems that the ORDER BY becomes the
"3    Noop    5    4    0        00" line in the EXPLAIN report -
otherwise the explain is identical.

 > If SQLite thinks that an AUTOMATIC index is useful, you should
 > consider creating it explicitly: CREATE INDEX lookup_generid_id_idx
 > ON lookups(is_generic_flag, url_id);

It does make the EXPLAIN plan 6 lines shorter, but no apparent speed
difference unfortunately.

Given there's such a small amount of data at this point, I suspect the
issue is more related to the recursion. I've tried creating these two
indexes to facilitate that:

CREATE INDEX url_id_source_id_idx ON urls (url_id, source_seed_id);

CREATE INDEX source_id_url_id_idx ON urls (source_seed_id, url_id);


SQLite will use the former (url_id, then source_seed_id), but it makes
absolutely no difference to the speed.


So I'm still stumped.



On 2018-03-18 22:30, Tim Streater wrote:

> On 18 Mar 2018, at 21:48, Jonathan Moules <[hidden email]> wrote:
>
>> CREATE TABLE lookups (
>>       url_id               INTEGER  REFERENCES urls (url_id),
>>       retrieval_datetime   DATETIME,
>>       error_code           INTEGER,
>>       is_generic_flag      BOOLEAN  -- about one in 100 will have this
>> flagged
>> );
> SQLite has no DATETIME or BOOLEAN datatype, see:
>
>     <http://www.sqlite.org/datatype3.html>
>
> Given this, I don't know what your attempts to compare date/times will do. Personally I store all date/times as seconds since the epoch; it simplifies life a lot.
>
>
>



Re: How to optimise a somewhat-recursive query?

Simon Slavin-3
On 18 Mar 2018, at 11:13pm, Jonathan Moules <[hidden email]> wrote:

> Given there's such a small amount of data at this point, I suspect the issue is more related to the recursion. I've tried creating these two indexes to facilicate that

Nice idea but I can see why it's not working.

You have an underlying problem: the format you're using to store your data makes it extremely difficult to extract the figures you want.  Quan Yong Zhai has the best idea I've seen: get rid of almost ¾ of the work you're doing by storing the last retrieval date in your "urls" table.

As an alternative, start your retrieval process by JOINing the two tables together.  Consider the result of this query

SELECT url_id, lookups.error_code
    FROM urls
    JOIN lookups ON lookups.url_id = urls.url_id AND lookups.retrieval_datetime = urls.retrieval_datetime
    WHERE DATETIME(urls.retrieval_datetime) > DATETIME('now', '-5 days')

and figure out what you would add to that to get your desired result.

Simon.

Re: How to optimise a somewhat-recursive query?

Jonathan Moules-3
Thanks Simon and Quan.
I'm not sure it's the view itself per se - it takes 0.000s (time too
small to measure) for just the full View to be run on this dataset.

It turns out the problem is simpler than that and no data changes are
needed. I did consider Quan Yong Zhai's option and gave it a try, but
couldn't fathom out the necessary SQL to get what I wanted (it's getting
late).

Instead prompted by the replies here, I've changed the query very
slightly to the below which solves the problem:

SELECT
         u.url, l.error_code
     FROM
         urls u
     LEFT JOIN
         lookups l
         USING(url_id)
     LEFT JOIN
         (select * from v_most_recent_lookup_per_url where url_id in (
             select url_id from urls where url = 'example.com'
             )) recent
         -- This is the important bit
         -- Here we use the most recent lookup url_id to link to the source_seed_id, so we only find its children
         -- Or alternatively itself
         ON u.source_seed_id = recent.url_id
             OR u.url_id = recent.url_id
     WHERE
         -- For JSON-spider at least, Generics are guaranteed to be generic pages.
         l.is_generic_flag = 1
         AND
         -- Must be "or equal to" so we can get the lookup of the very base url.
         l.retrieval_datetime >= recent.retrieval_datetime
         AND
         DATETIME(recent.retrieval_datetime) > DATETIME('now', '-5 days')
     ORDER BY
         u.url_id DESC
     LIMIT 1;


To save readers having to compare manually, the difference is this: I
turned the "recent" alias item from the View into a subquery (still
using the view), and then moved the "where url = 'example.com'" part in
there.
The query is now literally two orders of magnitude faster, from 0.2s to
0.004s. No new indexes or anything, just that change.

Hopefully this will scale to full datasets; if it doesn't I may have to
consider the other suggestions, but for now this is a minimum-effort
solution.

I'm not actually sure what SQLite was doing in the previous query to
make it take so long, so I imagine there was some hideous recursing
going on or something.

Scope for optimisation?

Thanks again,
Jonathan

On 2018-03-18 23:37, Simon Slavin wrote:

> On 18 Mar 2018, at 11:13pm, Jonathan Moules <[hidden email]> wrote:
>
>> Given there's such a small amount of data at this point, I suspect the issue is more related to the recursion. I've tried creating these two indexes to facilicate that
> Nice idea but I can see why it's not working.
>
> You have an underlying problem: the format you're using to store your data makes it extremely difficult to extract the figures you want.  Quan Yong Zhai has the best idea I've seen: get rid of almost ¾ of the work you're doing by storing the last retrieval date in your "urls" table.
>
> As an alternative, start your retrieval process by JOINing the two tables together.  Consider the result of this query
>
> SELECT url_id, lookups.error_code
>      FROM urls
>      JOIN lookups ON lookups.url_id = urls.url_id AND lookups.retrieval_datetime = urls.retrieval_datetime
>      WHERE DATETIME(urls.retrieval_datetime) > DATETIME('now', '-5 days')
>
> and figure out what you would add to that to get your desired result.
>
> Simon.

Query speed Regression: 3.15 much faster than 3.22 (Was: How to optimise a somewhat-recursive query? )

Jonathan Moules-3
Hi List,
So, I thought I'd solved my little problem, but upon testing it in my
application it subjectively didn't seem any faster.

I upgraded the SQLite in my IDE to 3.22, and it is confirming my
suspicions. It turns out that when I run the below in 3.22, it takes
about 0.150s. But in 3.15 it was taking 0.004s!

The original query also takes 0.15s in 3.22 - so that has been mildly
optimised (from ~0.2s). My general thinking-aloud notion is that my
"fix" is getting optimised away in 3.22.

I can provide a small replication database if desired.

Thanks,
Jonathan

On 2018-03-19 00:24, Jonathan Moules wrote:

> Thanks Simon and Quan.
> I'm not sure it's the view itself per-se - It takes 0.000s (time too
> small to measure) for just the full View to be run on this dataset.
>
> It turns out the problem is simpler than that and no data changes are
> needed. I did consider Quan Yong Zhai's option and gave it a try, but
> couldn't fathom out the necessary SQL to get what I wanted (it's
> getting late).
>
> Instead prompted by the replies here, I've changed the query very
> slightly to the below which solves the problem:
>
> SELECT
>         u.url, l.error_code
>     FROM
>         urls u
>     LEFT JOIN
>         lookups l
>         USING(url_id)
>     LEFT JOIN
>         (select * from v_most_recent_lookup_per_url where url_id in (
>             select url_id from urls where url = 'example.com'
>             )) recent
>         -- This is the important bit
>         -- Here we use the most recent lookup url_id to link to the
> source_seed_id, so we only find its children
>         -- Or alternatively itself
>         ON u.source_seed_id = recent.url_id
>             OR u.url_id = recent.url_id
>     WHERE
>         -- For JSON-spider at least, Generic's are guaranteed to be
> generic pages.
>         l.is_generic_flag = 1
>         AND
>         -- Must be "or equal to" so we can get the lookup of the very
> base url.
>         l.retrieval_datetime >= recent.retrieval_datetime
>         AND
>         DATETIME(recent.retrieval_datetime) > DATETIME('now', '-5 days')
>     ORDER BY
>         u.url_id DESC
>     LIMIT 1;
>
>
> To save readers having to compare manually, the difference is this: I
> turned the "recent" alias item from the View into a subquery (still
> using the view), and then moved the "where url = example.com" part in
> to there.
> The query is now literally two orders of magnitude faster, from 0.2s
> to 0.004s. No new indexes or anything, just that change.
>
> Hopefully this will scale to full datasets; if it doesn't I may have
> to consider the other suggestions, but for now this is a
> minimum-effort solution.
>
> I'm not actually sure what SQLite was doing in the previous query to
> make it take so long, so I imagine there was some hideous recursing
> going on or something.
>
> Scope for optimisation?
>
> Thanks again,
> Jonathan
>
> On 2018-03-18 23:37, Simon Slavin wrote:
>> On 18 Mar 2018, at 11:13pm, Jonathan Moules
>> <[hidden email]> wrote:
>>
>>> Given there's such a small amount of data at this point, I suspect
>>> the issue is more related to the recursion. I've tried creating
>>> these two indexes to facilicate that
>> Nice idea but I can see why it's not working.
>>
>> You have an underlying problem: the format you're using to store your
>> data makes it extremely difficult to extract the figures you want.  
>> Quan Yong Zhai has the best idea I've seen: get rid of almost ¾ of
>> the work you're doing by storing the last retrieval date in your
>> "urls" table.
>>
>> As an alternative, start your retrieval process by JOINing
>> the two tables together.  Consider the result of this query
>>
>> SELECT url_id, lookups.error_code
>>      FROM urls
>>      JOIN lookups ON lookups.url_id = urls.url_id AND
>> lookups.retrieval_datetime = urls.retrieval_datetime
>>      WHERE DATETIME(urls.retrieval_datetime) > DATETIME('now', '-5
>> days')
>>
>> and figure out what you would add to that to get your desired result.
>>
>> Simon.
>
>




Re: Query speed Regression: 3.15 much faster than 3.22 (Was: How to optimise a somewhat-recursive query? )

Richard Hipp-3
On 3/18/18, Jonathan Moules <[hidden email]> wrote:
>
> I can provide a small replication database if desired.

Please do.  Send it as an attachment directly to me.  Also please send
the exact text of the query that is running slowly.


--
D. Richard Hipp
[hidden email]

Re: How to optimise a somewhat-recursive query?

Jonathan Moules-3
In reply to this post by Jonathan Moules-3
So, I'm back to being stuck on this.
I have inserted 500,000 random urls (no extra lookups - still just
1000), and now the query (as per the below reply) is back to being
somewhat slow (I'm using sqlite 3.15) at about 0.6s per request.

After a couple of hours of investigation, it's only slow when there is
either no ORDER BY, or if I use DESC (which is what I need). If I use
ORDER BY u.url ASC - it's near instantaneous.

I've tried every possible combination of indexes I can think up,
including of course with url_id DESC. I've also removed the ORDER BY in
the view (and set it to DESC as well), but that made absolutely no
difference.

I'm a little confused as to why I'm seeing this behaviour - my limited
understanding of the query-planner and reading
https://sqlite.org/queryplanner.html - suggests that at least when using
indexes, it'll simply scan an index backwards if that looks like it'll
help. I appreciate this is a result-set not an index, but in this case
could not the query planner realise that reading through the results
backwards would be faster than whatever it's doing?

And for that matter, shouldn't no ORDER BY be at least the same speed as
ORDER BY u.url_id ASC?
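As a sanity check on the backwards-scan point: ordering by the rowid alias on its own never needs a sorter in either direction, which suggests the slowdown is in how the join is planned rather than the scan direction itself. A standalone sketch (not the problem query):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (url_id INTEGER PRIMARY KEY, url TEXT)")

# url_id is an alias for the rowid, so SQLite can walk the table itself
# forwards or backwards; neither ordering needs a temporary sort pass.
for order in ("ASC", "DESC"):
    plan = conn.execute(
        "EXPLAIN QUERY PLAN SELECT url FROM urls ORDER BY url_id " + order
    ).fetchall()
    assert not any("TEMP B-TREE" in row[-1] for row in plan)
print("no sorter needed in either direction")
```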

Thoughts welcome; Thanks!
Jonathan


On 2018-03-19 00:24, Jonathan Moules wrote:

> Thanks Simon and Quan.
> I'm not sure it's the view itself per-se - It takes 0.000s (time too
> small to measure) for just the full View to be run on this dataset.
>
> It turns out the problem is simpler than that and no data changes are
> needed. I did consider Quan Yong Zhai's option and gave it a try, but
> couldn't fathom out the necessary SQL to get what I wanted (it's
> getting late).
>
> Instead prompted by the replies here, I've changed the query very
> slightly to the below which solves the problem:
>
> SELECT
>         u.url, l.error_code
>     FROM
>         urls u
>     LEFT JOIN
>         lookups l
>         USING(url_id)
>     LEFT JOIN
>         (select * from v_most_recent_lookup_per_url where url_id in (
>             select url_id from urls where url = 'example.com'
>             )) recent
>         -- This is the important bit
>         -- Here we use the most recent lookup url_id to link to the
> source_seed_id, so we only find its children
>         -- Or alternatively itself
>         ON u.source_seed_id = recent.url_id
>             OR u.url_id = recent.url_id
>     WHERE
>         -- For JSON-spider at least, Generic's are guaranteed to be
> generic pages.
>         l.is_generic_flag = 1
>         AND
>         -- Must be "or equal to" so we can get the lookup of the very
> base url.
>         l.retrieval_datetime >= recent.retrieval_datetime
>         AND
>         DATETIME(recent.retrieval_datetime) > DATETIME('now', '-5 days')
>     ORDER BY
>         u.url_id DESC
>     LIMIT 1;
>
>
> To save readers having to compare manually, the difference is this: I
> turned the "recent" alias item from the View into a subquery (still
> using the view), and then moved the "where url = example.com" part in
> to there.
> The query is now literally two orders of magnitude faster, from 0.2s
> to 0.004s. No new indexes or anything, just that change.
>
> Hopefully this will scale to full datasets; if it doesn't I may have
> to consider the other suggestions, but for now this is a
> minimum-effort solution.
>
> I'm not actually sure what SQLite was doing in the previous query to
> make it take so long, so I imagine there was some hideous recursing
> going on or something.
>
> Scope for optimisation?
>
> Thanks again,
> Jonathan



Re: How to optimise a somewhat-recursive query?

Richard Hipp-3
On 3/21/18, Jonathan Moules <[hidden email]> wrote:
> So, I'm back to being stuck on this.
> I have inserted 500,000 random urls (no extra lookups - still just
> 1000), and now the query (as per the below reply) is back to being
> somewhat slow (I'm using sqlite 3.15) at about 0.6s per request.

Do you have the ability to compile SQLite from canonical sources?  If
so, please try again with the tip of the join-strength-reduction
branch (https://www.sqlite.org/src/timeline?r=join-strength-reduction).

To compile on Windows:

(1) Download a tarball or ZIP archive (or SQLite Archive) and unpack it.
(2) Run "nmake /f makefile.msc sqlite3.c"

On unix:

(1) Download and unpack as before
(2) ./configure; make sqlite3.c

The only dependency for the above is having a "tclsh" somewhere on your $PATH.

--
D. Richard Hipp
[hidden email]

Re: How to optimise a somewhat-recursive query?

Simon Slavin-3
In reply to this post by Jonathan Moules-3
On 21 Mar 2018, at 5:39pm, Jonathan Moules <[hidden email]> wrote:

> After a couple of hours of investigation, it's only slow when there is either no ORDER BY, or if I use DESC (which is what I need). If I use ORDER BY u.url ASC - it's near instantaneous.

You have a query which gets faster when you add an ORDER BY clause ?  That's not meant to happen.

Are you using an up-to-date version of SQLite ?

Are you executing the query using the sqlite shell tool ?

Simon.

Re: How to optimise a somewhat-recursive query?

Eduardo
In reply to this post by Jonathan Moules-3
On Wed, 21 Mar 2018 17:39:45 +0000
Jonathan Moules <[hidden email]> wrote:

> So, I'm back to being stuck on this.
> I have inserted 500,000 random urls (no extra lookups - still just
> 1000), and now the query (as per the below reply) is back to being
> somewhat slow (I'm using sqlite 3.15) at about 0.6s per request.

After the huge insert, did you run ANALYZE?
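(For context: ANALYZE gathers per-index row-count statistics into sqlite_stat1, which the planner consults when costing plans, so a 500,000-row insert without a fresh ANALYZE can leave it working from stale estimates. A quick standalone check:)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (url_id INTEGER PRIMARY KEY, url TEXT UNIQUE)")
conn.executemany(
    "INSERT INTO urls (url) VALUES (?)",
    [("example.com/%d" % i,) for i in range(1000)],
)
conn.execute("ANALYZE")

# ANALYZE writes one row per index into sqlite_stat1; the UNIQUE
# constraint above gives urls an implicit index, so it gets an entry.
stats = conn.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()
assert any(row[0] == "urls" for row in stats)
```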
 
> After a couple of hours of investigation, it's only slow when there is
> either no ORDER BY, or if I use DESC (which is what I need). If I use
> ORDER BY u.url ASC - it's near instantaneous.
>
> I've tried every possible combination of indexes I can think up,
> including of course with url_id DESC. I've also removed the ORDER BY in
> the view (and set it to DESC as well), but that made absolutely no
> difference.

Do your indexes have the primary key or rowid as their last column?
 

> I'm a little confused as to why I'm seeing this behaviour - my limited
> understanding of the query-planner and reading
> https://sqlite.org/queryplanner.html - suggests that at least when using
> indexes, it'll simply scan an index backwards if that looks like it'll
> help. I appreciate this is a result-set not an index, but in this case
> could not the query planner realise that reading through the results
> backwards would be faster than whatever it's doing?
>
> And for that matter, shouldn't no ORDER BY be at least the same speed as
> ORDER BY u.url_id ASC?

What does an integrity check return?

Try a REINDEX and ANALYZE; all cases should then be similarly fast (or slow).
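For reference, those maintenance checks can be scripted with Python's built-in sqlite3 module (the "crawler.db" filename is a stand-in for the real database file):

```python
import sqlite3

# "crawler.db" is a stand-in path for the real database file.
conn = sqlite3.connect("crawler.db")

# integrity_check walks the whole file and returns a single "ok" row
# when nothing is corrupt.
result = conn.execute("PRAGMA integrity_check").fetchone()[0]
print(result)

# REINDEX rebuilds every index from the table data; ANALYZE refreshes
# the statistics the query planner uses to choose between indexes --
# worth re-running after a bulk insert.
conn.execute("REINDEX")
conn.execute("ANALYZE")
conn.commit()
conn.close()
```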

> Thoughts welcome; Thanks!
> Jonathan
>
>
> On 2018-03-19 00:24, Jonathan Moules wrote:
> > Thanks Simon and Quan.
> > I'm not sure it's the view itself per-se - It takes 0.000s (time too
> > small to measure) for just the full View to be run on this dataset.
> >
> > It turns out the problem is simpler than that and no data changes are
> > needed. I did consider Quan Yong Zhai's option and gave it a try, but
> > couldn't fathom out the necessary SQL to get what I wanted (it's
> > getting late).
> >
> > Instead prompted by the replies here, I've changed the query very
> > slightly to the below which solves the problem:
> >
> > SELECT
> >         u.url, l.error_code
> >     FROM
> >         urls u
> >     LEFT JOIN
> >         lookups l
> >         USING(url_id)
> >     LEFT JOIN
> >         (select * from v_most_recent_lookup_per_url where url_id in (
> >             select url_id from urls where url = 'example.com'
> >             )) recent
> >         -- This is the important bit
> >         -- Here we use the most recent lookup url_id to link to the
> > source_seed_id, so we only find its children
> >         -- Or alternatively itself
> >         ON u.source_seed_id = recent.url_id
> >             OR u.url_id = recent.url_id
> >     WHERE
> >         -- For JSON-spider at least, Generic's are guaranteed to be
> > generic pages.
> >         l.is_generic_flag = 1
> >         AND
> >         -- Must be "or equal to" so we can get the lookup of the very
> > base url.
> >         l.retrieval_datetime >= recent.retrieval_datetime
> >         AND
> >         DATETIME(recent.retrieval_datetime) > DATETIME('now', '-5 days')
> >     ORDER BY
> >         u.url_id DESC
> >     LIMIT 1;
> >
> >
> > To save readers having to compare manually, the difference is this: I
> > turned the "recent" alias item from the View into a subquery (still
> > using the view), and then moved the "where url = example.com" part in
> > to there.
> > The query is now literally two orders of magnitude faster, from 0.2s
> > to 0.004s. No new indexes or anything, just that change.
> >
> > Hopefully this will scale to full datasets; if it doesn't I may have
> > to consider the other suggestions, but for now this is a
> > minimum-effort solution.
> >
> > I'm not actually sure what SQLite was doing in the previous query to
> > make it take so long, so I imagine there was some hideous recursing
> > going on or something.
> >
> > Scope for optimisation?
> >
> > Thanks again,
> > Jonathan
>
>


--
Eduardo <[hidden email]>

Re: How to optimise a somewhat-recursive query?

Jonathan Moules-3
In reply to this post by Richard Hipp-3
Hi Richard, Simon
Re: Compiling - I'm afraid I wouldn't really know where to begin. A
quick google finds
https://superuser.com/questions/146577/where-do-i-find-nmake-for-windows-7-x64 
- but the answers/links there don't seem to work. I've got to go out now
but can take another look later and see if I can find a copy (Microsoft
(I'm on Windows) never makes it easy to find stuff).

Simon - I suspected the ORDER BY thing was wrong but wanted to check
first rather than simply come out with "SQLite is broken!". This may be
related to the 3.22 regression I brought up a couple of days ago (and
why I'm using 3.15) - probably why Dr H is suggesting I try his branch.
I'm executing the query using SQLiteStudio (Or Python).
Thanks,
Jonathan


On 2018-03-21 17:58, Richard Hipp wrote:

> On 3/21/18, Jonathan Moules <[hidden email]> wrote:
>> So, I'm back to being stuck on this.
>> I have inserted 500,000 random urls (no extra lookups - still just
>> 1000), and now the query (as per the below reply) is back to being
>> somewhat slow (I'm using sqlite 3.15) at about 0.6s per request.
> Do you have the ability to compile SQLite from canonical sources?  If
> so, please try again with the tip of the join-strength-reduction
> branch (https://www.sqlite.org/src/timeline?r=join-strength-reduction).
>
> To compile on Windows:
>
> (1) Download a tarball or ZIP archive (or SQLite Archive) and unpack it.
> (2) Run "nmake /f makefile.msc sqlite3.c"
>
> On unix:
>
> (1) Download and unpack as before
> (2) ./configure; make sqlite3.c
>
> The only dependence for the above is having a "tclsh" somewhere on your $PATH.
>



Re: How to optimise a somewhat-recursive query?

David Raymond
To at least try increasing your version from 3.15 to 3.22, you can get pre-compiled binaries from the http://www.sqlite.org/download.html page. Go to the "Precompiled Binaries for Windows" section: there's a sqlite-tools bundle which includes the command-line interface, and DLLs which you can put in your ...\PythonXX\DLLs directory (replacing the sqlite3.dll which is in there at the moment).

Then give it a whirl with either the CLI or the new dll in Python.

In addition you can run "explain query plan blah" or the more detailed "explain blah" to get a view of what it's doing when it executes the statement "blah".
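A minimal sketch of reading a plan from Python (the schema is cut down from the one earlier in the thread; the query is a stand-in for the real one):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE urls (url_id INTEGER PRIMARY KEY, url TEXT UNIQUE);
    CREATE TABLE lookups (url_id INTEGER REFERENCES urls (url_id),
                          retrieval_datetime DATETIME);
""")

# Prefixing a statement with EXPLAIN QUERY PLAN makes SQLite return its
# strategy (scans vs. searches, which indexes it picked) instead of rows.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT u.url
    FROM urls u JOIN lookups l USING (url_id)
    ORDER BY u.url_id DESC
""").fetchall()
for row in plan:
    print(row)
```

Comparing the plan rows for the fast and slow variants of a query is usually enough to see which index (or full-table scan) each one ends up using.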



Re: How to optimise a somewhat-recursive query?

Simon Slavin-3
In reply to this post by Jonathan Moules-3
On 21 Mar 2018, at 6:17pm, Jonathan Moules <[hidden email]> wrote:

> Simon - I suspected the ORDER BY thing was wrong but wanted to check first rather than simply come out with "SQLite is broken!". This may be related to the 3.22 regression I brought up a couple of days ago (and why I'm using 3.15) - probably why Dr H is suggesting I try his branch.
> I'm executing the query using SQLiteStudio (Or Python).

It is not supposed to be possible to speed up a query by adding an ORDER BY clause.  The very worst that SQLite is meant to do is pick a fast order for the results.  If the only difference between the two queries really is an ORDER BY clause, it looks like you've found something unexpected in SQLite.

I don't know what operating system you're using, but I hope Dr H will pilot you through a way to provide the developers with a way to reproduce this problem.

You can provide some useful information by using the precompiled binary of the sqlite shell tool to give the output from putting "EXPLAIN QUERY PLAN" before your SELECT commands:

EXPLAIN QUERY PLAN SELECT
       u.url, l.error_code <etc.>

The difference between the two might provide some clues about what's going on.

Simon.

Re: How to optimise a somewhat-recursive query?

Keith Medcalf
In reply to this post by Jonathan Moules-3

Or just try it with the superfluous outer-join keyword (LEFT) removed, since you are really just doing an inner (equi) join: the outer-join rows are discarded (by your WHERE clause constraints) after they are generated anyway ...
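Keith's point can be seen in miniature with toy tables (names only loosely modelled on the schema in this thread): a WHERE test on a right-hand column can never pass for the NULL-padded rows an outer join generates, so the LEFT JOIN and plain JOIN return identical results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE urls (url_id INTEGER PRIMARY KEY, url TEXT);
    CREATE TABLE lookups (url_id INTEGER, is_generic_flag BOOLEAN);
    INSERT INTO urls VALUES (1, 'a.example'), (2, 'b.example');
    INSERT INTO lookups VALUES (1, 1);   -- url 2 has no lookup row
""")

# LEFT JOIN pads url 2 with NULLs, but "is_generic_flag = 1" is never
# true for NULL, so that padded row is filtered straight back out ...
left_rows = conn.execute("""
    SELECT u.url FROM urls u
    LEFT JOIN lookups l USING (url_id)
    WHERE l.is_generic_flag = 1
""").fetchall()

# ... which is exactly what a plain (inner) join produces directly.
inner_rows = conn.execute("""
    SELECT u.url FROM urls u
    JOIN lookups l USING (url_id)
    WHERE l.is_generic_flag = 1
""").fetchall()

print(left_rows == inner_rows)  # True: the outer join was superfluous
```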

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.


Reply | Threaded
Open this post in threaded view
|

Re: How to optimise a somewhat-recursive query?

Keith Medcalf

And Richard's patch merely attempts to detect such errors of query phrasing and convert the join type for you, so that there is no need to generate the possibly millions of unnecessary intermediate results ...

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.



Re: How to optimise a somewhat-recursive query?

Jonathan Moules-3
In reply to this post by Richard Hipp-3
Hi Richard,
I've spent the last ~90 minutes trying to build this but to no avail I'm
afraid. I'm unable to find a version of nmake for Windows 7 (It seems to
be a VS thing and that in turn is Windows 8/10 only). Then I tried
inside a VM of Mint, managed to build it, and whilst I was trying to
find some sort of sqlite front-end (I know SQLite has a CLI tool, but
I'm happier with front-ends), the VM crashed (for the second time)!

So I'm afraid self-helping by trying that branch isn't happening. If you
have the means to readily build one and send it to me off-list, feel
free to and I'll happily test it. If you can't trust the creator of
SQLite who can you trust? :-)

---

The good news is that Keith's suggestion of removing the "LEFT" from the
JOIN fixed the ORDER BY DESC issue. The database and query are the same
as the ones I provided you, just with a bunch of semi-random data in
the "urls" table.

---

However, - I've now inserted 100,000 semi-random entries into the
"lookups" table. If I run the same query again (which is unchanged
except removing the LEFT's from in front of the JOINs), it's going slow
again taking about 0.5s.
If I change the ORDER BY to ASC, it's also about 0.5s - so they're
consistent.

That's with this index added, which seems to be its preference out of the
numerous variants I created:

CREATE INDEX url_id_datetime_idx ON lookups (
    url_id DESC,
    retrieval_datetime
);


The things you're likely interested in though, and they may or may not
be addressed by your branch:
a) If I remove the ORDER BY and LIMIT, the query takes 15 (fifteen) seconds!

b) And if I add the LEFT back in front of the JOINs, the ORDER BY ASC
query is back to being modestly speedy - 0.07s - and with no ORDER BY
it's the same as well.

c) But with the LEFT JOINs the query takes about 1.1s for ORDER BY DESC.

I can provide another copy of the database with the new data in if you
wish. Or test the fix if you have a dll you want to send me off list.
Thanks,
Jonathan








Re: How to optimise a somewhat-recursive query?

Simon Slavin-3
On 21 Mar 2018, at 10:58pm, Jonathan Moules <[hidden email]> wrote:

> I know SQLite has a CLI tool, but I'm happier with front-ends

You can use the CLI tool (which does still work under Windows 7) to open the database you prepared in your preferred environment and execute just the statement you're interested in.  You don't have to do the whole task in the CLI.

The advantage of using the CLI is that it doesn't involve any conversions to or from the SQLite API.  The SQLite development team wrote the whole thing and can completely understand the behaviour of everything it does.

You should work with a copy of the database, not the original, of course.

Simon.