RTrees and query speed

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

RTrees and query speed

mailing lists
Assume I have the following tables:

CREATE TABLE A (ID INTEGER PRIMARY KEY, Latitude, Longitude, Altitude);
CREATE VIRTUAL TABLE B USING RTree (ID, FromLatitude, TillLatitude, FromLongitude,TillLongitude);

and there is an index on A for Latitude,Longitude. B is filled using

INSERT INTO B SELECT ID,Latitude,Latitude,Longitude,Longitude FROM A;


According to the RTree documentation this query should be fast (demo_data / demo_index example):

SELECT COUNT(*) FROM A,B WHERE (A.ID=B.ID) AND (FromLongitude > 5) AND (TillLongitude < 10);

Actually, the query is pretty slow. It is slower than a direct (complete) search on A (SELECT COUNT(*) FROM A WHERE (Longitude > 5) AND (Longitude < 10);).

Though this query is fast:

SELECT COUNT(*) FROM A WHERE (ID IN (SELECT ID FROM B WHERE (FromLongitude > 5) AND (TillLongitude < 10)));


Am I doing anything wrong or is the documentation not correct?

Regards,
Hartwig

PS: I used for testing only 70 000 records but I do not think that it really matters.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: RTrees and query speed

Simon Slavin-3

On 8 Dec 2013, at 11:03pm, skywind mailing lists <[hidden email]> wrote:

> SELECT COUNT(*) FROM A,B

Is "A,B" a way of expressing "A JOIN B" ?  I've never seen it before.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: RTrees and query speed

Keith Medcalf
>> SELECT COUNT(*) FROM A,B
>
>Is "A,B" a way of expressing "A JOIN B" ?  I've never seen it before.

Yes.  It is part of the SQL language definition since before there was an SQL language definition.

"," means "JOIN" or more pedantically "INNER JOIN".  "ON" means "WHERE".  The only circumstance in which you need to replace the "," with "JOIN" is if you are using a join predicate such as "NATURAL" or "USING" (or in the case of SQLite, "CROSS", to indicate that you do not want visitation order changed).  In all cases except outer joins "ON" is mere syntactic sugar for putting join conditions in the WHERE clause (including NATURAL and USING forms where the appropriate join condition is merely a clause in the WHERE clause and otherwise has no special significance).  

OUTER joins must use the verbose join syntax because the conditionals in the ON condition of an OUTER join cannot be re-ordered without changing the semantics of the query, although there are other more-or-less standard ways of describing outer joins without using the OUTER JOIN ... ON syntax, such as designating the OUTER join conditions that "would otherwise follow ON" using *= or =* where the * designates the "outer" side of the equijoin condition.




_______________________________________________
sqlite-users mailing list
[hidden email]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: RTrees and query speed

Clemens Ladisch
In reply to this post by mailing lists
skywind mailing lists wrote:

> Assume I have the following tables:
>
> CREATE TABLE A (ID INTEGER PRIMARY KEY, Latitude, Longitude, Altitude);
> CREATE VIRTUAL TABLE B USING RTree (ID, FromLatitude, TillLatitude, FromLongitude,TillLongitude);
>
> According to the RTree documentation this query should be fast (demo_data / demo_index example):
>
> SELECT COUNT(*) FROM A,B WHERE (A.ID=B.ID) AND (FromLongitude > 5) AND (TillLongitude < 10);
>
> Actually, the query is pretty slow.

According to the EXPLAIN QUERY PLAN output (<http://www.sqlite.org/eqp.html>),
it is fast:

0|0|1|SCAN TABLE B VIRTUAL TABLE INDEX 2:EcCd
0|1|0|SEARCH TABLE A USING INTEGER PRIMARY KEY (rowid=?)

To you get the same output for EXPLAIN QUERY PLAN?
If not, what SQLite version are you using?


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: RTrees and query speed

mailing lists

Am 09.12.2013 um 10:06 schrieb Clemens Ladisch <[hidden email]>:

> skywind mailing lists wrote:
>> Assume I have the following tables:
>>
>> CREATE TABLE A (ID INTEGER PRIMARY KEY, Latitude, Longitude, Altitude);
>> CREATE VIRTUAL TABLE B USING RTree (ID, FromLatitude, TillLatitude, FromLongitude,TillLongitude);
>>
>> According to the RTree documentation this query should be fast (demo_data / demo_index example):
>>
>> SELECT COUNT(*) FROM A,B WHERE (A.ID=B.ID) AND (FromLongitude > 5) AND (TillLongitude < 10);
>>
>> Actually, the query is pretty slow.
>
> According to the EXPLAIN QUERY PLAN output (<http://www.sqlite.org/eqp.html>),
> it is fast:
>
> 0|0|1|SCAN TABLE B VIRTUAL TABLE INDEX 2:EcCd
> 0|1|0|SEARCH TABLE A USING INTEGER PRIMARY KEY (rowid=?)
>
> To you get the same output for EXPLAIN QUERY PLAN?
> If not, what SQLite version are you using?
>
>
> Regards,
> Clemens
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Hi Clemens,

I am using SQLite 3.7.13.

If I do not run ANALYZE my query plan is the same (I rebuild the tables). BUT my query plan is different after ANALYZE:

0|0|0|SCAN TABLE A (~74067 rows)
0|1|1|SCAN TABLE B VIRTUAL TABLE INDEX 1: (~0 rows)

The problem seems to be (~0 rows) for TABLE B. There are definitely 74067 rows in table B. Therefore, I conclude that ANALYZE is not able to analyze RTree tables correctly and therefore the wrong query plan is chosen.

Regards,
Hartwig

_______________________________________________
sqlite-users mailing list
[hidden email]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users