Bug? SQLite command-line result is different from Perl DBI::Sqlite result

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Bug? SQLite command-line result is different from Perl DBI::Sqlite result

Keith Medcalf

Hrm.  Inconsistent/incorrect results.  Consider:

create table a(id integer primary key, a);
insert into a values (1,1), (2,1), (3,1);
create table b(id integer primary key, b);
insert into b values (1,2), (3,2), (4,2);
create table c(id integer primary key, c);
insert into c values (1,3), (4,3), (5,3);

select * from a, b, c using (id); -- very strange result

id          a           id          b           c
----------  ----------  ----------  ----------  ----------
1           1           1           2           3
1           1           3           2           3
1           1           4           2           3

select * from a, b using (id), c using (id); -- correct result

id          a           b           c
----------  ----------  ----------  ----------
1           1           2           3

The first query should be processed as:

select * from a, b, c where b.id == c.id;

id          a           id          b           id          c
----------  ----------  ----------  ----------  ----------  ----------
1           1           1           2           1           3
2           1           1           2           1           3
3           1           1           2           1           3
1           1           4           2           4           3
2           1           4           2           4           3
3           1           4           2           4           3

but with the c.id (third id column omitted).

Or it should be processed as the second query if the "using (id)" constraint applies to everywhere an "id" field is found, not just the LHS and RHS tables of the immediately proceeding join.

also
select * from a natural join b natural join c;
-- returns no rows despite the column "id" existing commonly in all tables

This is with the current development release (and as far as I can tell, all prior versions).

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.

>-----Original Message-----
>From: sqlite-users <[hidden email]> On
>Behalf Of Keith Medcalf
>Sent: Saturday, 4 January, 2020 19:32
>To: SQLite mailing list <[hidden email]>
>Subject: Re: [sqlite] SQLite command-line result is different from Perl
>DBI::Sqlite result
>
>
>
>
>
>--
>The fact that there's a Highway to Hell but only a Stairway to Heaven
>says a lot about anticipated traffic volume.
>
>On Saturday, 4 January, 2020 18:31, Amer Neely <[hidden email]>
>wrote:
>
>>I'm fairly new to SQLite, but have been using MySQL / mariadb in a local
>>and web-based environment for several years. So far I'm happy and
>>impressed with SQLite, but I recently noticed some odd behaviour with
>>one of my queries.
>>Using the command-line in a shell (Mac High Sierra) I get a particular
>>result from a query. The exact same query in a Perl script gives me a
>>different result. To my mind it is a simple query, getting the 5 latest
>>additions to my music library.
>>Command-line:
>><code>select artists.artist, artists.artistid, cds.title, cds.artistid,
>>cds.cdid, genres.genre, genres.artistid from artists, genres inner join
>>cds using (artistid) group by artists.artistid order by cds.id desc
>>limit 5;</code>
>>gives me the correct result. However, in a Perl script it gives me a
>>different result. How is that possible? Could it be a Perl::DBI issue?
>>Many thanks for anyone able to shed some light on this.
>
>Your select does not constrain artists so the result is non-deterministic
>in that the result will depend on how the query planner decides to
>execute the query.  That is, you have not specified any join constraints
>on artists.
>
>SELECT * FROM A, B JOIN C USING (D);
>
>means
>
>SELECT *
>  FROM A, B, C
> WHERE B.D == C.D;
>
>if you thought it meant
>
>SELECT *
>  FROM A, B, C
> WHERE A.D == B.D
>   AND B.D == C.D;
>
>then that is likely the reason for the discrepancy.
>
>
>
>
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug? SQLite command-line result is different from Perl DBI::Sqlite result

Richard Hipp-3
On 1/5/20, Keith Medcalf <[hidden email]> wrote:

>
> Hrm.  Inconsistent/incorrect results.  Consider:
>
> create table a(id integer primary key, a);
> insert into a values (1,1), (2,1), (3,1);
> create table b(id integer primary key, b);
> insert into b values (1,2), (3,2), (4,2);
> create table c(id integer primary key, c);
> insert into c values (1,3), (4,3), (5,3);
>
> select * from a, b, c using (id); -- very strange result
>
> id          a           id          b           c
> ----------  ----------  ----------  ----------  ----------
> 1           1           1           2           3
> 1           1           3           2           3
> 1           1           4           2           3

PostgreSQL and MySQL process the query as follows:

   SELECT * FROM a, (b JOIN c USING(id));

SQLite processes the query like this:

   SELECT * FROM (a,b) JOIN c USING (id);

I don't know which is correct.  Perhaps the result is undefined.

Note that both MySQL and SQLite do allow you to use parentheses, as
shown in my examples, to define the order of evaluation.  PostgreSQL
does not, sadly.

MS-SQL does not (as far as I can tell) support the USING syntax on a join.

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug? SQLite command-line result is different from Perl DBI::Sqlite result

Richard Hipp-3
On 1/5/20, Richard Hipp <[hidden email]> wrote:
>
> Note that both MySQL and SQLite do allow you to use parentheses, as
> shown in my examples, to define the order of evaluation.  PostgreSQL
> does not, sadly.
>

Apparently, in PostgreSQL you have to say:

   SELECT * FROM (SELECT * FROM a, b) AS x JOIN c USING(id);

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug? SQLite command-line result is different from Perl DBI::Sqlite result

Amer Neely
In reply to this post by Keith Medcalf
> On 1/5/20, Keith Medcalf <[hidden email]> wrote:
>>
>> Hrm.  Inconsistent/incorrect results.  Consider:
>>
>> create table a(id integer primary key, a);
>> insert into a values (1,1), (2,1), (3,1);
>> create table b(id integer primary key, b);
>> insert into b values (1,2), (3,2), (4,2);
>> create table c(id integer primary key, c);
>> insert into c values (1,3), (4,3), (5,3);
>>
>> select * from a, b, c using (id); -- very strange result
>>
>> id          a           id          b           c
>> ----------  ----------  ----------  ----------  ----------
>> 1           1           1           2           3
>> 1           1           3           2           3
>> 1           1           4           2           3
>
> PostgreSQL and MySQL process the query as follows:
>
>    SELECT * FROM a, (b JOIN c USING(id));
>
> SQLite processes the query like this:
>
>    SELECT * FROM (a,b) JOIN c USING (id);
>
> I don't know which is correct.  Perhaps the result is undefined.
>
> Note that both MySQL and SQLite do allow you to use parentheses, as
> shown in my examples, to define the order of evaluation.  PostgreSQL
> does not, sadly.
>
> MS-SQL does not (as far as I can tell) support the USING syntax on a join.
>
> --
> D. Richard Hipp
> [hidden email]

Ahh. More to learn. Thank you for the use of parentheses, I will have to
check my queries for that.

i did manage to get the query working by grouping on artists.artistid.
--
Amer Neely
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug? SQLite command-line result is different from Perl DBI::Sqlite result

Keith Medcalf
In reply to this post by Richard Hipp-3

On Sunday, 5 January, 2020 04:42, Richard Hipp <[hidden email]> wrote:

>On 1/5/20, Keith Medcalf <[hidden email]> wrote:

>> Hrm.  Inconsistent/incorrect results.  Consider:

>> create table a(id integer primary key, a);
>> insert into a values (1,1), (2,1), (3,1);
>> create table b(id integer primary key, b);
>> insert into b values (1,2), (3,2), (4,2);
>> create table c(id integer primary key, c);
>> insert into c values (1,3), (4,3), (5,3);

>> select * from a, b, c using (id); -- very strange result

>> id          a           id          b           c
>> ----------  ----------  ----------  ----------  ----------
>> 1           1           1           2           3
>> 1           1           3           2           3
>> 1           1           4           2           3

>PostgreSQL and MySQL process the query as follows:

>   SELECT * FROM a, (b JOIN c USING(id));

>SQLite processes the query like this:

>   SELECT * FROM (a,b) JOIN c USING (id);

>I don't know which is correct.  Perhaps the result is undefined.

>Note that both MySQL and SQLite do allow you to use parentheses, as
>shown in my examples, to define the order of evaluation.  PostgreSQL
>does not, sadly.

>MS-SQL does not (as far as I can tell) support the USING syntax on a
>join.

Aha!  So as far as SQLite is concerned the syntax "... JOIN <x> USING (<y>)" is effectively binding the using expression for the nested nested loop descent into into table <y> only and does not bind against the immediately preceeding LHS JOIN table.  

By adding some indexes and order by that cause the nesting order to change it appear that "a, b join c using (id)" is always processed as selecting the first lexically named id column irrepective of nesting order (that is "a, b join c using (id)" always becomes "a, b, c where a.id == c.id" and that "b, a join c using (id)" always becomes "a, b, c where b.id == c.id" even when the optimizer chooses to re-arrange the nesting order (such as by additional indexes and order by's)).

Since the "id" column to use is ambiguous for descent into "c" should not an "ambiguous column name" error be thrown?  The ambiguity only does not exist if ALL columns named "id" (for all tables that could be in an outer loop respective to "c") are constrained equal -- that is a,b,c using (id) -> a, b, c where a.id == b.id and a.id == c.id and b.id == c.id.

Since "natural join" devolves into a using, does not the same problem exist there as well?

Since any change is likely to have an effect on already existing and functional applications, could the behaviour be documented somewhere perhaps?

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug? SQLite command-line result is different from Perl DBI::Sqlite result

Clemens Ladisch
In reply to this post by Richard Hipp-3
Richard Hipp wrote:

> On 1/5/20, Keith Medcalf <[hidden email]> wrote:
>> select * from a, b, c using (id); -- very strange result
>
> PostgreSQL and MySQL process the query as follows:
>
>    SELECT * FROM a, (b JOIN c USING(id));
>
> SQLite processes the query like this:
>
>    SELECT * FROM (a,b) JOIN c USING (id);
>
> I don't know which is correct.  Perhaps the result is undefined.

Assuming the following query:

  SELECT * FROM a, b JOIN c USING (id);

SQL-92 says:
|        7.4  <from clause>
|
|        <from clause> ::= FROM <table reference> [ { <comma> <table reference> }... ]
|
|        6.3  <table reference>
|
|        <table reference> ::=
|               <table name> [ [ AS ] <correlation name>
|                   [ <left paren> <derived column list> <right paren> ] ]
|             | <derived table> [ AS ] <correlation name>
|                   [ <left paren> <derived column list> <right paren> ]
|             | <joined table>
|
|        7.5  <joined table>
|
|        <joined table> ::=
|               <cross join>
|             | <qualified join>
|             | <left paren> <joined table> <right paren>
|
|        <cross join> ::=
|             <table reference> CROSS JOIN <table reference>
|
|        <qualified join> ::=
|             <table reference> [ NATURAL ] [ <join type> ] JOIN
|               <table reference> [ <join specification> ]

It is not possible to have such a <comma> inside a <table reference>, so
b and c must be joined first.

SQLite actually parses the comma as a join:

  SELECT * FROM a CROSS JOIN b JOIN c USING (id);

If the query were written like this, joining a and b first would be
correct.  (As far as I can see, the standard does not say how to handle
ambiguous parts of the grammar, so it would also be allowed to produce
"b JOIN c" first.)


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug? SQLite command-line result is different from Perl DBI::Sqlite result

Igor Korot
Hi,

On Tue, Jan 7, 2020 at 2:01 AM Clemens Ladisch <[hidden email]> wrote:

>
> Richard Hipp wrote:
> > On 1/5/20, Keith Medcalf <[hidden email]> wrote:
> >> select * from a, b, c using (id); -- very strange result
> >
> > PostgreSQL and MySQL process the query as follows:
> >
> >    SELECT * FROM a, (b JOIN c USING(id));
> >
> > SQLite processes the query like this:
> >
> >    SELECT * FROM (a,b) JOIN c USING (id);
> >
> > I don't know which is correct.  Perhaps the result is undefined.
>
> Assuming the following query:
>
>   SELECT * FROM a, b JOIN c USING (id);
>
> SQL-92 says:
> |        7.4  <from clause>
> |
> |        <from clause> ::= FROM <table reference> [ { <comma> <table reference> }... ]
> |
> |        6.3  <table reference>
> |
> |        <table reference> ::=
> |               <table name> [ [ AS ] <correlation name>
> |                   [ <left paren> <derived column list> <right paren> ] ]
> |             | <derived table> [ AS ] <correlation name>
> |                   [ <left paren> <derived column list> <right paren> ]
> |             | <joined table>
> |
> |        7.5  <joined table>
> |
> |        <joined table> ::=
> |               <cross join>
> |             | <qualified join>
> |             | <left paren> <joined table> <right paren>
> |
> |        <cross join> ::=
> |             <table reference> CROSS JOIN <table reference>
> |
> |        <qualified join> ::=
> |             <table reference> [ NATURAL ] [ <join type> ] JOIN
> |               <table reference> [ <join specification> ]
>
> It is not possible to have such a <comma> inside a <table reference>, so
> b and c must be joined first.
>
> SQLite actually parses the comma as a join:
>
>   SELECT * FROM a CROSS JOIN b JOIN c USING (id);
>
> If the query were written like this, joining a and b first would be
> correct.  (As far as I can see, the standard does not say how to handle
> ambiguous parts of the grammar, so it would also be allowed to produce
> "b JOIN c" first.)

That's why one should never use that "MS JOIN extension" and should
simply write:

SELECT ... FROM a,b,c WHERE a.x = b.x AND b.x = c.y AND...;

Thank you.

>
>
> Regards,
> Clemens
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users