Query performance slower in 3.25

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Query performance slower in 3.25

Stephen F. Booth
A query that ran fine under SQLite 3.24 is substantially slower in 3.25:

SQLite version 3.24.0 2018-06-04 19:24:41
> .timer on
> select a.id from a join c on a.id = case when c.b_a_name is not null then
c.b_a_id else c.a_id end where a.id in (select a_fts.rowid from a_fts where
a_fts match 'r*' order by rank) group by a.id;
Run Time: real 0.037 user 0.019868 sys 0.016376

SQLite version 3.25.0 2018-09-15 04:01:47
> .timer on
> select a.id from a join c on a.id = case when c.b_a_name is not null then
c.b_a_id else c.a_id end where a.id in (select a_fts.rowid from a_fts where
a_fts match 'r*' order by rank) group by a.id;
Run Time: real 4.525 user 2.055779 sys 2.466143

Performance in 3.25.1 and 3.25.2 is similar to 3.25.0.

The bottleneck seems to be in the CASE portion of the query. I am not sure
why.

To try and make a valid comparison I compiled the versions identically
using the following flags:

% gcc shell.c sqlite3.c -DSQLITE_THREADSAFE=0 -DSQLITE_DEFAULT_MEMSTATUS=0
-DSQLITE_DEFAULT_WAL_SYNCHRONOUS=1 -DSQLITE_LIKE_DOESNT_MATCH_BLOBS
-DSQLITE_MAX_EXPR_DEPTH=0 -DSQLITE_OMIT_DECLTYPE -DSQLITE_OMIT_DEPRECATED
-DSQLITE_OMIT_PROGRESS_CALLBACK -DSQLITE_OMIT_SHARED_CACHE
-DSQLITE_USE_ALLOCA=1 -DSQLITE_ENABLE_FTS5 -DSQLITE_ENABLE_RTREE
-DSQLITE_ENABLE_STAT4 -DHAVE_READLINE -DHAVE_USLEEP -lreadline -Os

Here is the compiler info:
Apple LLVM version 10.0.0 (clang-1000.11.45.2)
Target: x86_64-apple-darwin18.2.0
Thread model: posix

What could be causing this performance hit in 3.25?

Thanks,
Stephen
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Query performance slower in 3.25

Richard Hipp-3
Could you please provide us with the database schema, or perhaps even
a short script that demonstrates your problem, so that we can try to
debug it?

On 9/29/18, Stephen F. Booth <[hidden email]> wrote:

> A query that ran fine under SQLite 3.24 is substantially slower in 3.25:
>
> SQLite version 3.24.0 2018-06-04 19:24:41
>> .timer on
>> select a.id from a join c on a.id = case when c.b_a_name is not null then
> c.b_a_id else c.a_id end where a.id in (select a_fts.rowid from a_fts where
> a_fts match 'r*' order by rank) group by a.id;
> Run Time: real 0.037 user 0.019868 sys 0.016376
>
> SQLite version 3.25.0 2018-09-15 04:01:47
>> .timer on
>> select a.id from a join c on a.id = case when c.b_a_name is not null then
> c.b_a_id else c.a_id end where a.id in (select a_fts.rowid from a_fts where
> a_fts match 'r*' order by rank) group by a.id;
> Run Time: real 4.525 user 2.055779 sys 2.466143
>
> Performance in 3.25.1 and 3.25.2 is similar to 3.25.0.
>
> The bottleneck seems to be in the CASE portion of the query. I am not sure
> why.
>
> To try and make a valid comparison I compiled the versions identically
> using the following flags:
>
> % gcc shell.c sqlite3.c -DSQLITE_THREADSAFE=0 -DSQLITE_DEFAULT_MEMSTATUS=0
> -DSQLITE_DEFAULT_WAL_SYNCHRONOUS=1 -DSQLITE_LIKE_DOESNT_MATCH_BLOBS
> -DSQLITE_MAX_EXPR_DEPTH=0 -DSQLITE_OMIT_DECLTYPE -DSQLITE_OMIT_DEPRECATED
> -DSQLITE_OMIT_PROGRESS_CALLBACK -DSQLITE_OMIT_SHARED_CACHE
> -DSQLITE_USE_ALLOCA=1 -DSQLITE_ENABLE_FTS5 -DSQLITE_ENABLE_RTREE
> -DSQLITE_ENABLE_STAT4 -DHAVE_READLINE -DHAVE_USLEEP -lreadline -Os
>
> Here is the compiler info:
> Apple LLVM version 10.0.0 (clang-1000.11.45.2)
> Target: x86_64-apple-darwin18.2.0
> Thread model: posix
>
> What could be causing this performance hit in 3.25?
>
> Thanks,
> Stephen
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Query performance slower in 3.25

Richard Hipp-3
In reply to this post by Stephen F. Booth
On 9/29/18, Stephen F. Booth <[hidden email]> wrote:
> A query that ran fine under SQLite 3.24 is substantially slower in 3.25:

Thanks for the data sent off-list....

Your work-around is to add a plus sign "+" before the "a.id" in the
GROUP BY clause.  (And, BTW, shouldn't that really be an ORDER BY
clause instead of a GROUP BY?)

select a.id from a join c on a.id = case when c.b_a_name is not null then
c.b_a_id else c.a_id end where a.id in (select a_fts.rowid from a_fts where
a_fts match 'r*' order by rank) group by +a.id;

The change is the single "+" near the end.  This should be work for
you on all releases.

This is an interesting query planner problem.  Recall that a query
planner is really a kind of AI that has to infer or guess the best
query algorithm based on incomplete information.  In this particular
case, the AI is making a bad choice.  It will take some time for us to
figure out why and perhaps come up with an improvement.

Even in prior releases (such as 3.24.0) the AI was very very close to
making a bad choice.  A single minor tweak in one of the weights
pushed the decision threshold over a limit and caused that bad choice
to be taken.  So the problem has been lurking just under the surface
for a long time, apparently.  The minor tweak in 3.25.0
(https://sqlite.org/src/info/85b9be) merely brought the problem to the
surface.

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Query performance slower in 3.25

Stephen F. Booth
On Sat, Sep 29, 2018 at 5:33 PM Richard Hipp <[hidden email]> wrote:

> On 9/29/18, Stephen F. Booth <[hidden email]> wrote:
> > A query that ran fine under SQLite 3.24 is substantially slower in 3.25:
>
> Thanks for the data sent off-list....
>
> Your work-around is to add a plus sign "+" before the "a.id" in the
> GROUP BY clause.  (And, BTW, shouldn't that really be an ORDER BY
> clause instead of a GROUP BY?)
>
> select a.id from a join c on a.id = case when c.b_a_name is not null then
> c.b_a_id else c.a_id end where a.id in (select a_fts.rowid from a_fts
> where
> a_fts match 'r*' order by rank) group by +a.id;
>
> The change is the single "+" near the end.  This should be work for
> you on all releases.
>

This does fix the issue. Thanks!

I'm sure there is a smarter way to write this query- you're probably right
about ORDER BY instead of GROUP BY.

Stephen


> This is an interesting query planner problem.  Recall that a query
> planner is really a kind of AI that has to infer or guess the best
> query algorithm based on incomplete information.  In this particular
> case, the AI is making a bad choice.  It will take some time for us to
> figure out why and perhaps come up with an improvement.
>
> Even in prior releases (such as 3.24.0) the AI was very very close to
> making a bad choice.  A single minor tweak in one of the weights
> pushed the decision threshold over a limit and caused that bad choice
> to be taken.  So the problem has been lurking just under the surface
> for a long time, apparently.  The minor tweak in 3.25.0
> (https://sqlite.org/src/info/85b9be) merely brought the problem to the
> surface.
>
> --
> D. Richard Hipp
> [hidden email]
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users