count optimisation

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

count optimisation

curmudgeon
select count(*) from TblA cross join TblB

is over 200 times slower than

select (select count(*) from TblA) * (select count(*) from TblB);

Not that it matters to me. I came across it by accident and mention it only in case it’s a missed optimisation.

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] count optimisation

Hick Gunter
The statement "select count() from <table>" is optimized to retrieve the count without visiting each row. This is well documented behaviour.

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:[hidden email]] Im Auftrag von x
Gesendet: Donnerstag, 15. März 2018 11:20
An: [hidden email]
Betreff: [EXTERNAL] [sqlite] count optimisation

select count(*) from TblA cross join TblB

is over 200 times slower than

select (select count(*) from TblA) * (select count(*) from TblB);

Not that it matters to me. I came across it by accident and mention it only in case it’s a missed optimisation.

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick | Software Engineer | Scientific Games International GmbH | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] count optimisation

curmudgeon
You’re right Gunter. I think it’s been pointed out to me before as well. Doh.



________________________________
From: sqlite-users <[hidden email]> on behalf of Hick Gunter <[hidden email]>
Sent: Thursday, March 15, 2018 10:32:20 AM
To: 'SQLite mailing list'
Subject: Re: [sqlite] [EXTERNAL] count optimisation

The statement "select count() from <table>" is optimized to retrieve the count without visiting each row. This is well documented behaviour.

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:[hidden email]] Im Auftrag von x
Gesendet: Donnerstag, 15. März 2018 11:20
An: [hidden email]
Betreff: [EXTERNAL] [sqlite] count optimisation

select count(*) from TblA cross join TblB

is over 200 times slower than

select (select count(*) from TblA) * (select count(*) from TblB);

Not that it matters to me. I came across it by accident and mention it only in case it’s a missed optimisation.

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick | Software Engineer | Scientific Games International GmbH | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: count optimisation

R Smith-2
In reply to this post by curmudgeon
On 2018/03/15 12:20 PM, x wrote:
> select count(*) from TblA cross join TblB
>
> is over 200 times slower than
>
> select (select count(*) from TblA) * (select count(*) from TblB);

This is a human-level optimisation, it's not efficient for the database
engine to do the optimisation.

By human-level I mean it is in the same category as knowing that
[((x + 1) / 50) * 100] - 2x] / 2
always evaluates to exactly 1  for all values of x.

Replacing all that formula with *1* when compiling as an optimisation
WILL WORK most definitely, but the number of such formulas we can come
up with is infinite, there is no point having to ask the compiler to
handle any one of those infinite variations as "special".

Another silly example is a Query of the form:

select American_President from [Any Table];

Which we can currently simply optimise with:

select 'Donald Trump';

but you don't want the query engine to be doing that. Ever.


To apply all of this to your specific case, how about if the query was
in stead:

select count(*), avg(col1) from TblA cross join TblB

or indeed

select count(*) from TblA, TblB CROSS JOIN TblC JOIN TblD ON 1=1


All these /can/ be optimised, but should be optimised by the programmer,
not the Query engine.


Cheers,
Ryan


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: count optimisation

curmudgeon
While I take your point Ryan I think some of the examples are far removed from the case I highlighted. I mean if count(*) is the only query column and the RHS of the joins is ‘cross join t’ the optimisation would be simple. If the remaining joins end ‘cross join t2’ it could be optimised further and so on.



Is it worth sqlite checking for such specific cases? Very probably not. Like I said it doesn’t matter to me, I pointed it out just in case.



________________________________
From: sqlite-users <[hidden email]> on behalf of R Smith <[hidden email]>
Sent: Thursday, March 15, 2018 11:25:18 AM
To: [hidden email]
Subject: Re: [sqlite] count optimisation

On 2018/03/15 12:20 PM, x wrote:
> select count(*) from TblA cross join TblB
>
> is over 200 times slower than
>
> select (select count(*) from TblA) * (select count(*) from TblB);

This is a human-level optimisation, it's not efficient for the database
engine to do the optimisation.

By human-level I mean it is in the same category as knowing that
[((x + 1) / 50) * 100] - 2x] / 2
always evaluates to exactly 1  for all values of x.

Replacing all that formula with *1* when compiling as an optimisation
WILL WORK most definitely, but the number of such formulas we can come
up with is infinite, there is no point having to ask the compiler to
handle any one of those infinite variations as "special".

Another silly example is a Query of the form:

select American_President from [Any Table];

Which we can currently simply optimise with:

select 'Donald Trump';

but you don't want the query engine to be doing that. Ever.


To apply all of this to your specific case, how about if the query was
in stead:

select count(*), avg(col1) from TblA cross join TblB

or indeed

select count(*) from TblA, TblB CROSS JOIN TblC JOIN TblD ON 1=1


All these /can/ be optimised, but should be optimised by the programmer,
not the Query engine.


Cheers,
Ryan


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users