LEFT JOIN: ON vs. WHERE

classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

LEFT JOIN: ON vs. WHERE

Ben Asher
Hi there! My colleagues and I are trying to understand the role of ON constraints vs WHERE clauses in JOINs. It seems both of the following work, but I’m not really sure why:

Query A:

SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=‘constant’;


Query B:

SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x=‘constant’;

Is there a difference between the two (function and performance)? Is there an advantage to putting WHERE-type filtering in the ON constraint vs leaving it in the WHERE clause for LEFT JOINs?

Thanks!

Ben
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LEFT JOIN: ON vs. WHERE

Richard Hipp-3
On 10/27/19, Benjamin Asher <[hidden email]> wrote:
> It seems both of the following work,
> but I’m not really sure why:

I get different answers for the two queries when I try them:

CREATE TABLE tab1(x,y);
CREATE TABLE tab2(x,y);
INSERT INTO tab1 VALUES('constant','string'),('abc','def'),(97,15);
INSERT INTO tab2 SELECT * FROM tab1;

SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x='constant';

SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x='constant';


--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LEFT JOIN: ON vs. WHERE

Simon Slavin-3
In reply to this post by Ben Asher
On 27 Oct 2019, at 10:09pm, Benjamin Asher <[hidden email]> wrote:

> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=‘constant’;

You're doing a JOIN here.  How should the engine know which row of tab2 corresponds to which row of tab1 ?

Your query is syntactically correct, but it doesn't appear to do anything useful.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LEFT JOIN: ON vs. WHERE

Ben Asher
In reply to this post by Richard Hipp-3
Oh you’re right. I realize now I messed up the example. Here are the updated queries:

Query A

SELECT * FROM tab1 LEFT JOIN tab2 ON tab2.x='constant' WHERE tab1.x='constant’;

Query B

SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x=‘constant

These return the same results in your sample setup. I think Simon pointed out what I’m also thinking is the problem: could Query A return rows that aren’t matched up properly because of a lack of a constraint joining the 2 tables?

Thanks for putting together that more complete example :).

Ben

> On Oct 27, 2019, at 3:35 PM, Richard Hipp <[hidden email]> wrote:
>
> On 10/27/19, Benjamin Asher <[hidden email]> wrote:
>> It seems both of the following work,
>> but I’m not really sure why:
>
> I get different answers for the two queries when I try them:
>
> CREATE TABLE tab1(x,y);
> CREATE TABLE tab2(x,y);
> INSERT INTO tab1 VALUES('constant','string'),('abc','def'),(97,15);
> INSERT INTO tab2 SELECT * FROM tab1;
>
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x='constant';
>
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x='constant';
>
>
> --
> D. Richard Hipp
> [hidden email]
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LEFT JOIN: ON vs. WHERE

Ben Asher
Okay playing with this some more: it seems like since everything is equal to the same constant, it doesn’t really matter? That said, it feels like poor form / not good practice (join can become not useful if you don’t have a constraint joining particular columns). If I want to make it easier on myself to adjust this query in the future, it’s better practice to move constants to the WHERE clause and ensure there are column-based join constraints.

Does that seem right? My colleague presented me with a related query to debug, and now I’m questioning everything.

Ben

> On Oct 27, 2019, at 3:45 PM, Benjamin Asher <[hidden email]> wrote:
>
> Oh you’re right. I realize now I messed up the example. Here are the updated queries:
>
> Query A
>
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab2.x='constant' WHERE tab1.x='constant’;
>
> Query B
>
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x=‘constant
>
> These return the same results in your sample setup. I think Simon pointed out what I’m also thinking is the problem: could Query A return rows that aren’t matched up properly because of a lack of a constraint joining the 2 tables?
>
> Thanks for putting together that more complete example :).
>
> Ben
>
>> On Oct 27, 2019, at 3:35 PM, Richard Hipp <[hidden email]> wrote:
>>
>> On 10/27/19, Benjamin Asher <[hidden email]> wrote:
>>> It seems both of the following work,
>>> but I’m not really sure why:
>>
>> I get different answers for the two queries when I try them:
>>
>> CREATE TABLE tab1(x,y);
>> CREATE TABLE tab2(x,y);
>> INSERT INTO tab1 VALUES('constant','string'),('abc','def'),(97,15);
>> INSERT INTO tab2 SELECT * FROM tab1;
>>
>> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x='constant';
>>
>> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x='constant';
>>
>>
>> --
>> D. Richard Hipp
>> [hidden email]
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LEFT JOIN: ON vs. WHERE

Simon Slavin-3
In reply to this post by Ben Asher
On 27 Oct 2019, at 10:45pm, Benjamin Asher <[hidden email]> wrote:

> Query A
>
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab2.x='constant' WHERE tab1.x='constant’;
>
> Query B
>
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x=‘constant'

Your use of JOIN here is extremely usual.  Depending on your data, you either don't need it, or it does nothing.

Suppose you have many rows in tab1 where x='constant' and also many rows in tab2 where x='constant'.  What are you expecting the engine to do with them ?  Should it match one tab1 row with one tab2 row ?  Or should it generate a set of every possible combination ?  This tells you what your ON clause should be.

Generally, a constant belongs in a WHERE clause.  To answer your original question

> Is there an advantage to putting WHERE-type filtering in the ON constraint vs leaving it in the WHERE clause for LEFT JOINs?

SQLite reads the entire statement and does things to figure out the best way to execute it given the indexes available.  The kind of deconstruction you did to the query in your head, happens inside the SQL engine.  So we cannot give you an answer "always put it in ON" or "always put it in WHERE".  Instead we say "Don't make your query look weird just to save a millisecond.  Have your query make sense, so someone who reads it understands what you are doing.".

If you want to worry about speed or efficiency, worry about making useful indexes for your tables instead of exactly how to phrase your query.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LEFT JOIN: ON vs. WHERE

Ben Asher
This is exactly the kind of advice I was looking for.

Thanks again!

Ben

> On Oct 27, 2019, at 4:04 PM, Simon Slavin <[hidden email]> wrote:
>
> On 27 Oct 2019, at 10:45pm, Benjamin Asher <[hidden email]> wrote:
>
>> Query A
>>
>> SELECT * FROM tab1 LEFT JOIN tab2 ON tab2.x='constant' WHERE tab1.x='constant’;
>>
>> Query B
>>
>> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x=‘constant'
>
> Your use of JOIN here is extremely usual.  Depending on your data, you either don't need it, or it does nothing.
>
> Suppose you have many rows in tab1 where x='constant' and also many rows in tab2 where x='constant'.  What are you expecting the engine to do with them ?  Should it match one tab1 row with one tab2 row ?  Or should it generate a set of every possible combination ?  This tells you what your ON clause should be.
>
> Generally, a constant belongs in a WHERE clause.  To answer your original question
>
>> Is there an advantage to putting WHERE-type filtering in the ON constraint vs leaving it in the WHERE clause for LEFT JOINs?
>
> SQLite reads the entire statement and does things to figure out the best way to execute it given the indexes available.  The kind of deconstruction you did to the query in your head, happens inside the SQL engine.  So we cannot give you an answer "always put it in ON" or "always put it in WHERE".  Instead we say "Don't make your query look weird just to save a millisecond.  Have your query make sense, so someone who reads it understands what you are doing.".
>
> If you want to worry about speed or efficiency, worry about making useful indexes for your tables instead of exactly how to phrase your query.
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LEFT JOIN: ON vs. WHERE

Keith Medcalf
In reply to this post by Ben Asher

On Sunday, 27 October, 2019 16:09, Benjamin Asher <[hidden email]> wrote:

> Is there an advantage to putting WHERE-type filtering in the
> ON constraint vs leaving it in the WHERE clause for LEFT JOINs?

The ON clause is merely syntactic sugar for a parenthesized AND clause appended to the parenthesized WHERE clause EXCEPT in the case of an OUTER JOIN.

That is when you emit:

SELECT <columns>
  FROM <table1>
  JOIN <table2>
    ON <conditions1>
 WHERE <whereconditions>

This is really nothing more than:

SELECT <columns>
  FROM <table1>, <table2>
 WHERE (<conditions1>)
   AND (<wherecondition>)

This applies no matter how many <tablen> there are, nor how may ON <conditionsn> there are.  The ON clause does not even have to mention tables that have already appeared so long as they eventually appear so the WHERE clause can be processed.  JOIN or INNER JOIN is merely an alternate spelling of a comma, and ON merely puts its argument conditional expression in parenthesis and tacks it onto the end of the WHERE conditions with an AND (the where condition clause itself being parenthesized).

In the case of an OUTER JOIN the ON condition specifies the selection constraint for descent into the immediately preceding table (that is, the ON clause binds to the table that is the target of the join).  If the selection criteria in that descent condition cannot be met, that table tuple is replaced with a tuple of all NULL.

Once upon a time there was no ON clause, and one would specify outer constraints with a *.  dibble *= dabble means that you want all the values of dibble even if there is no matching dabble, and the dabble row is therefore comprised of all nulls (this is a left outer join).  Similarly dibble =* dabble meant that one wants all the dabble even if there is no matching dibble, in which case the tuple from which the dibble came will be all nulls (right outer join).  There was also a dibble *=* dabble which meant that you wanted all the results where dibble matched dabble, plus the ones where dibble had no match and the tuple from when dabble came was therefore all nulls, and the ones where dabble had no match in which case the tuple from which dibble came was all nulls (full outer join).

The JOIN .. ON semantics were invented because some people had difficulty composing appropriate WHERE clauses without it.

So for OUTER JOINS (of which a LEFT JOIN is but an instance of the entire class) whether a condition appears in the ON clause or the WHERE clause is material to how the query is performed and the results obtained.  For an INNER JOIN, it does not matter whether the condition is specified in an ON clause or in the WHERE clause.

In fact, in the case of INNER JOIN you do not need either keyword at all:

SELECT a, b
  FROM x, y
    ON x.a == y.c
 WHERE y.q == 5

is the same as:

SELECT a, b
  FROM x, y
    ON y.q == 5
 WHERE x.a == y.c

is the same as

SELECT a, b
  FROM X JOIN Y
 WHERE x.a == y.c
   AND y.q == 5

which is really just:

SELECT a, b
  FROM x, y
 WHERE x.a == y.c
   AND y.q == 5

however, for outer joins:

SELECT a, b
  FROM x LEFT JOIN y
    ON x.a == y.c
 WHERE y.q == 5

is the same as:

SELECT a, b
  FROM x, y
 WHERE x.a == y.c
   AND y.q == 5

(that is, because the WHERE clause requires that y.q not be NULL, the OUTER JOIN is meaningless and merely results is more processing CPU and memory usage than is necessary since the OUTER JOIN is really just an inner join).

SELECT a, b
  FROM x LEFT JOIN y
    ON x.a == x.c and y.q == 5

gives a completely different set of results.

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LEFT JOIN: ON vs. WHERE

Ben Asher
Ah! I see. Thanks for that walk through of OUTER JOIN.

Ben

> On Oct 27, 2019, at 4:18 PM, Keith Medcalf <[hidden email]> wrote:
>
>
> On Sunday, 27 October, 2019 16:09, Benjamin Asher <[hidden email]> wrote:
>
>> Is there an advantage to putting WHERE-type filtering in the
>> ON constraint vs leaving it in the WHERE clause for LEFT JOINs?
>
> The ON clause is merely syntactic sugar for a parenthesized AND clause appended to the parenthesized WHERE clause EXCEPT in the case of an OUTER JOIN.
>
> That is when you emit:
>
> SELECT <columns>
>  FROM <table1>
>  JOIN <table2>
>    ON <conditions1>
> WHERE <whereconditions>
>
> This is really nothing more than:
>
> SELECT <columns>
>  FROM <table1>, <table2>
> WHERE (<conditions1>)
>   AND (<wherecondition>)
>
> This applies no matter how many <tablen> there are, nor how may ON <conditionsn> there are.  The ON clause does not even have to mention tables that have already appeared so long as they eventually appear so the WHERE clause can be processed.  JOIN or INNER JOIN is merely an alternate spelling of a comma, and ON merely puts its argument conditional expression in parenthesis and tacks it onto the end of the WHERE conditions with an AND (the where condition clause itself being parenthesized).
>
> In the case of an OUTER JOIN the ON condition specifies the selection constraint for descent into the immediately preceding table (that is, the ON clause binds to the table that is the target of the join).  If the selection criteria in that descent condition cannot be met, that table tuple is replaced with a tuple of all NULL.
>
> Once upon a time there was no ON clause, and one would specify outer constraints with a *.  dibble *= dabble means that you want all the values of dibble even if there is no matching dabble, and the dabble row is therefore comprised of all nulls (this is a left outer join).  Similarly dibble =* dabble meant that one wants all the dabble even if there is no matching dibble, in which case the tuple from which the dibble came will be all nulls (right outer join).  There was also a dibble *=* dabble which meant that you wanted all the results where dibble matched dabble, plus the ones where dibble had no match and the tuple from when dabble came was therefore all nulls, and the ones where dabble had no match in which case the tuple from which dibble came was all nulls (full outer join).
>
> The JOIN .. ON semantics were invented because some people had difficulty composing appropriate WHERE clauses without it.
>
> So for OUTER JOINS (of which a LEFT JOIN is but an instance of the entire class) whether a condition appears in the ON clause or the WHERE clause is material to how the query is performed and the results obtained.  For an INNER JOIN, it does not matter whether the condition is specified in an ON clause or in the WHERE clause.
>
> In fact, in the case of INNER JOIN you do not need either keyword at all:
>
> SELECT a, b
>  FROM x, y
>    ON x.a == y.c
> WHERE y.q == 5
>
> is the same as:
>
> SELECT a, b
>  FROM x, y
>    ON y.q == 5
> WHERE x.a == y.c
>
> is the same as
>
> SELECT a, b
>  FROM X JOIN Y
> WHERE x.a == y.c
>   AND y.q == 5
>
> which is really just:
>
> SELECT a, b
>  FROM x, y
> WHERE x.a == y.c
>   AND y.q == 5
>
> however, for outer joins:
>
> SELECT a, b
>  FROM x LEFT JOIN y
>    ON x.a == y.c
> WHERE y.q == 5
>
> is the same as:
>
> SELECT a, b
>  FROM x, y
> WHERE x.a == y.c
>   AND y.q == 5
>
> (that is, because the WHERE clause requires that y.q not be NULL, the OUTER JOIN is meaningless and merely results is more processing CPU and memory usage than is necessary since the OUTER JOIN is really just an inner join).
>
> SELECT a, b
>  FROM x LEFT JOIN y
>    ON x.a == x.c and y.q == 5
>
> gives a completely different set of results.
>
> --
> The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.
>
>
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LEFT JOIN: ON vs. WHERE

Gabor Grothendieck
In reply to this post by Ben Asher
The difference between these two is what happens when a row
of A has no matches in B.

select * from A left join B on A.Time = B.Time
select * from A left join B where A.Time = B.Time

In the first one the condition is carried out during the join
so if a row of A has no matches in B then the B part of the result
is filled out with NULL values.

In the second one A is joined with B and then the "where" is applied
afterwards so if there is no match to a row of A then that row of
A  will not appear at all in the result.

Here is an example in R,  Note that there was no match to the
third row of A in B so in the first case that row of A appears in the
result with
NULLs in the B column positions.  In the second case the third row of A
is dropped from the result.

> library(sqldf)
>
> # BOD comes with R
> A <- BOD[1:3, ]
> B <- BOD[1:2, ]
> A
  Time demand
1    1    8.3
2    2   10.3
3    3   19.0
> B
  Time demand
1    1    8.3
2    2   10.3
>
> sqldf("select * from A left join B on A.Time = B.Time")
  Time demand Time..3 demand..4
1    1    8.3       1       8.3
2    2   10.3       2      10.3
3    3   19.0      NA        NA
> sqldf("select * from A left join B where A.Time = B.Time")
  Time demand Time..3 demand..4
1    1    8.3       1       8.3
2    2   10.3       2      10.3


On Sun, Oct 27, 2019 at 6:09 PM Benjamin Asher <[hidden email]> wrote:

>
> Hi there! My colleagues and I are trying to understand the role of ON constraints vs WHERE clauses in JOINs. It seems both of the following work, but I’m not really sure why:
>
> Query A:
>
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=‘constant’;
>
>
> Query B:
>
> SELECT * FROM tab1 LEFT JOIN tab2 ON tab1.x=tab2.x WHERE tab1.x=‘constant’;
>
> Is there a difference between the two (function and performance)? Is there an advantage to putting WHERE-type filtering in the ON constraint vs leaving it in the WHERE clause for LEFT JOINs?
>
> Thanks!
>
> Ben
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LEFT JOIN: ON vs. WHERE

James K. Lowden
In reply to this post by Ben Asher
On Sun, 27 Oct 2019 15:09:03 -0700
Benjamin Asher <[hidden email]> wrote:

> My colleagues and I are trying to understand the role of ON
> constraints vs WHERE clauses in JOINs.

ON applies before JOIN.  WHERE applies after.  

That's a loose interpretation, but IMO it's nevertheless a useful way
to think about it.  

It *mostly* doesn't matter.  It was invented for outer joins.  

In SQL-89, there was only the inner join. There was no JOIN keyword.
You listed all the tables in FROM:

        FROM R, S
        WHERE R.key = S.key

and all was good with the world.  

To support outer joins, vendors invented various syntaxes.  To unifiy
them, SQL-92 defined JOIN and ON:

        FROM R JOIN S
        ON R.key = S.key

with that syntax, JOIN could be modifed to be LEFT or RIGHT outer, and
the inner table could be further restricted:  

        FROM R LEFT OUTER JOIN S
        ON R.key = S.key AND S.col = 'foo'

That means: all rows from R joined to rows in S for which col = 'foo'.
If no such S rows exist, every result row will be NULL for the S
columns.  Compare with

        FROM R LEFT OUTER JOIN S
        ON R.key = S.key
        WHERE S.col = 'foo'

Here, R and S are joined, and the WHERE test is applied to the joined
result.  If no such S rows exist, there will be no result rows at all.

When wrestling with this topic, it's useful to remember that ON doesn't
constrain the outer table:

        FROM R LEFT OUTER JOIN S
        ON R.key = S.key AND R.col = 'foo'

Here,

        AND R.col = 'foo'

while valid syntax (unfortunately), has no effect.  R is the outer
table, and all rows of the outer table are always produced, no matter
what ON says about it.  Perhaps the above should mean:

        FROM (select * from R WHERE R.col = 'foo') as R
        LEFT OUTER JOIN S
        ON R.key = S.key

but it does not.  Perfection remains, as ever, elusive.  

--jkl


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LEFT JOIN: ON vs. WHERE

Allen, Marc
Sorry for top posting.. I'm way too lazy to figure out how to make Outlook play nicely.

You say:

        FROM R LEFT OUTER JOIN S
    ON R.key = S.key AND R.col = 'foo'
   
        Here,
   
    AND R.col = 'foo'

    while valid syntax (unfortunately), has no effect.  

However, this isn't true.  While it doesn't constrain R, it does constrain the join.  This will still return all rows of R, but all S columns will be NULL for any row where R.col is not 'foo'.

This is actually useful, especially for efficiency.  This allows the database to not even bother doing the lookup on S if R.col != 'foo' and still return that R record.

Marc


Marc L. Allen | Software Engineer IV, PDI Marketing Cloud Solutions
2551 Eltham Ave. Suite N, Norfolk, VA 23513
Office  757.853.3000 x6015
Direct  757.605.6015
[hidden email]
www.pdisoftware.com <http://www.pdisoftware.com>
 

On 10/28/19, 1:20 PM, "sqlite-users on behalf of James K. Lowden" <[hidden email] on behalf of [hidden email]> wrote:

    On Sun, 27 Oct 2019 15:09:03 -0700
    Benjamin Asher <[hidden email]> wrote:
   
    > My colleagues and I are trying to understand the role of ON
    > constraints vs WHERE clauses in JOINs.
   
    ON applies before JOIN.  WHERE applies after.  
   
    That's a loose interpretation, but IMO it's nevertheless a useful way
    to think about it.  
   
    It *mostly* doesn't matter.  It was invented for outer joins.  
   
    In SQL-89, there was only the inner join. There was no JOIN keyword.
    You listed all the tables in FROM:
   
    FROM R, S
    WHERE R.key = S.key
   
    and all was good with the world.  
   
    To support outer joins, vendors invented various syntaxes.  To unifiy
    them, SQL-92 defined JOIN and ON:
   
    FROM R JOIN S
    ON R.key = S.key
   
    with that syntax, JOIN could be modifed to be LEFT or RIGHT outer, and
    the inner table could be further restricted:  
   
    FROM R LEFT OUTER JOIN S
    ON R.key = S.key AND S.col = 'foo'
   
    That means: all rows from R joined to rows in S for which col = 'foo'.
    If no such S rows exist, every result row will be NULL for the S
    columns.  Compare with
   
    FROM R LEFT OUTER JOIN S
    ON R.key = S.key
    WHERE S.col = 'foo'
   
    Here, R and S are joined, and the WHERE test is applied to the joined
    result.  If no such S rows exist, there will be no result rows at all.
   
    When wrestling with this topic, it's useful to remember that ON doesn't
    constrain the outer table:
   
    FROM R LEFT OUTER JOIN S
    ON R.key = S.key AND R.col = 'foo'
   
    Here,
   
    AND R.col = 'foo'
   
    while valid syntax (unfortunately), has no effect.  R is the outer
    table, and all rows of the outer table are always produced, no matter
    what ON says about it.  Perhaps the above should mean:
   
    FROM (select * from R WHERE R.col = 'foo') as R
    LEFT OUTER JOIN S
    ON R.key = S.key
   
    but it does not.  Perfection remains, as ever, elusive.  
   
    --jkl
   
   
    _______________________________________________
    sqlite-users mailing list
    [hidden email]
    http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
   

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LEFT JOIN: ON vs. WHERE

Keith Medcalf
In reply to this post by James K. Lowden

On Monday, 28 October, 2019 11:19, James K. Lowden <[hidden email]> wrote:

>When wrestling with this topic, it's useful to remember that ON doesn't
>constrain the outer table:
>
> FROM R LEFT OUTER JOIN S
> ON R.key = S.key AND R.col = 'foo'

>Here,

> AND R.col = 'foo'

>while valid syntax (unfortunately), has no effect.  R is the outer
>table, and all rows of the outer table are always produced, no matter
>what ON says about it.

Saying that it has no effect is incorrect.  It does constrain the inner loop candidates.  It has no effect on the outer loop candidate selection because we are already inside the inner loop before that condition is evaluated.  In other words, there can be no matching t2 tuple if the t1 tuple does not have t1.a == 1.

CREATE TABLE t1 (a, b);
INSERT INTO t1 VALUES(1,2);
INSERT INTO t1 VALUES(2,3);
INSERT INTO t1 VALUES(1,3);
CREATE TABLE t2 (b, c);
INSERT INTO t2 VALUES(2,3);
INSERT INTO t2 VALUES(3,4);
select * from t1 left join t2 on t1.b == t2.b and t1.a == 1;
1|2|2|3
2|3||
1|3|3|4

This sort of thing is useful, for example, if t2 represents data about the neck length of giraffes, and t1.a tells the animal type (1 == giraffe).  Clearly things which are not giraffe's do not have a giraffe neck length so this constraint is properly part of the inner candidate selection.

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LEFT JOIN: ON vs. WHERE

James K. Lowden
On Mon, 28 Oct 2019 12:10:38 -0600
"Keith Medcalf" <[hidden email]> wrote:

> CREATE TABLE t1 (a, b);
> INSERT INTO t1 VALUES(1,2);
> INSERT INTO t1 VALUES(2,3);
> INSERT INTO t1 VALUES(1,3);
> CREATE TABLE t2 (b, c);
> INSERT INTO t2 VALUES(2,3);
> INSERT INTO t2 VALUES(3,4);
> select * from t1 left join t2 on t1.b == t2.b and t1.a == 1;
> 1|2|2|3
> 2|3||
> 1|3|3|4

Thank you, Keith.  What you're saying is that when ON applies to the
outer table, it still constrains the JOIN.  The tuple

        2 3 3 4

would have been produced, but for the ON restriction

        t.a = 1

> This sort of thing is useful

I don't doubt it.  I can't remember ever having written a query like
that (not knowing that's what it would do).  I would probably have
expressed the giraffe-neck problem as

        select * from t1
        left join ( select * from t2 where b <> 1 ) as t2
        on t1.b = t2.b

because usually, in domains I worked with, the constraint on the inner
table wasn't a function of the join criteria, but of some type or group
or catagory column, or date.  

--jkl

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LEFT JOIN: ON vs. WHERE

curmudgeon
If I remember correctly I found out by mistake that the ON doesn’t even have to relate to the JOIN it follows as long as it fits in with the WHERE. E.g.



SELECT * FROM Tbl1 INNER JOIN Tbl2 ON Tbl3.Col=Tbl1.Col INNER JOIN Tbl3 ON Tbl2.Col=Tbl1.Col;



Something like that.





________________________________
From: sqlite-users <[hidden email]> on behalf of James K. Lowden <[hidden email]>
Sent: Monday, October 28, 2019 10:32:21 PM
To: [hidden email] <[hidden email]>
Subject: Re: [sqlite] LEFT JOIN: ON vs. WHERE

On Mon, 28 Oct 2019 12:10:38 -0600
"Keith Medcalf" <[hidden email]> wrote:

> CREATE TABLE t1 (a, b);
> INSERT INTO t1 VALUES(1,2);
> INSERT INTO t1 VALUES(2,3);
> INSERT INTO t1 VALUES(1,3);
> CREATE TABLE t2 (b, c);
> INSERT INTO t2 VALUES(2,3);
> INSERT INTO t2 VALUES(3,4);
> select * from t1 left join t2 on t1.b == t2.b and t1.a == 1;
> 1|2|2|3
> 2|3||
> 1|3|3|4

Thank you, Keith.  What you're saying is that when ON applies to the
outer table, it still constrains the JOIN.  The tuple

        2 3 3 4

would have been produced, but for the ON restriction

        t.a = 1

> This sort of thing is useful

I don't doubt it.  I can't remember ever having written a query like
that (not knowing that's what it would do).  I would probably have
expressed the giraffe-neck problem as

        select * from t1
        left join ( select * from t2 where b <> 1 ) as t2
        on t1.b = t2.b

because usually, in domains I worked with, the constraint on the inner
table wasn't a function of the join criteria, but of some type or group
or catagory column, or date.

--jkl

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LEFT JOIN: ON vs. WHERE

curmudgeon
In reply to this post by Allen, Marc
I don’t think I did say that Marc. All I was saying was the ON didn’t have to refer to the closest join.



select * from

tblA inner join tblB ON tblA.key = tblC key – no mention of tblB in ON

inner join tblC ON tblA.key = tblB.key; – no mention of tblC in ON



Normally that would be written as



select * from

tblA inner join tblB ON tblA.key = tblB key

inner join tblC ON tblA.key = tblC.key;



I was just pointing out that they’re the same thing as the ONs are moved to the where



select * from tblA inner join tblB inner join tblC

where tblA.key = tblB.key AND tblA.key=tblC.key



which I thought illustrated Keith’s point.







________________________________
From: sqlite-users <[hidden email]> on behalf of Allen, Marc <[hidden email]>
Sent: Monday, October 28, 2019 5:27:57 PM
To: SQLite mailing list <[hidden email]>
Subject: Re: [sqlite] LEFT JOIN: ON vs. WHERE

Sorry for top posting.. I'm way too lazy to figure out how to make Outlook play nicely.

You say:

        FROM R LEFT OUTER JOIN S
         ON R.key = S.key AND R.col = 'foo'

        Here,

         AND R.col = 'foo'

         while valid syntax (unfortunately), has no effect.

However, this isn't true.  While it doesn't constrain R, it does constrain the join.  This will still return all rows of R, but all S columns will be NULL for any row where R.col is not 'foo'.

This is actually useful, especially for efficiency.  This allows the database to not even bother doing the lookup on S if R.col != 'foo' and still return that R record.

Marc


Marc L. Allen | Software Engineer IV, PDI Marketing Cloud Solutions
2551 Eltham Ave. Suite N, Norfolk, VA 23513
Office  757.853.3000 x6015
Direct  757.605.6015
[hidden email]
www.pdisoftware.com<http://www.pdisoftware.com> <http://www.pdisoftware.com>


On 10/28/19, 1:20 PM, "sqlite-users on behalf of James K. Lowden" <[hidden email] on behalf of [hidden email]> wrote:

    On Sun, 27 Oct 2019 15:09:03 -0700
    Benjamin Asher <[hidden email]> wrote:

    > My colleagues and I are trying to understand the role of ON
    > constraints vs WHERE clauses in JOINs.

    ON applies before JOIN.  WHERE applies after.

    That's a loose interpretation, but IMO it's nevertheless a useful way
    to think about it.

    It *mostly* doesn't matter.  It was invented for outer joins.

    In SQL-89, there was only the inner join. There was no JOIN keyword.
    You listed all the tables in FROM:

         FROM R, S
         WHERE R.key = S.key

    and all was good with the world.

    To support outer joins, vendors invented various syntaxes.  To unifiy
    them, SQL-92 defined JOIN and ON:

         FROM R JOIN S
         ON R.key = S.key

    with that syntax, JOIN could be modifed to be LEFT or RIGHT outer, and
    the inner table could be further restricted:

         FROM R LEFT OUTER JOIN S
         ON R.key = S.key AND S.col = 'foo'

    That means: all rows from R joined to rows in S for which col = 'foo'.
    If no such S rows exist, every result row will be NULL for the S
    columns.  Compare with

         FROM R LEFT OUTER JOIN S
         ON R.key = S.key
         WHERE S.col = 'foo'

    Here, R and S are joined, and the WHERE test is applied to the joined
    result.  If no such S rows exist, there will be no result rows at all.

    When wrestling with this topic, it's useful to remember that ON doesn't
    constrain the outer table:

         FROM R LEFT OUTER JOIN S
         ON R.key = S.key AND R.col = 'foo'

    Here,

         AND R.col = 'foo'

    while valid syntax (unfortunately), has no effect.  R is the outer
    table, and all rows of the outer table are always produced, no matter
    what ON says about it.  Perhaps the above should mean:

         FROM (select * from R WHERE R.col = 'foo') as R
         LEFT OUTER JOIN S
         ON R.key = S.key

    but it does not.  Perfection remains, as ever, elusive.

    --jkl


    _______________________________________________
    sqlite-users mailing list
    [hidden email]
    http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users