Filtering groups by non-grouped field.

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

Filtering groups by non-grouped field.

John Found
The following code does not work, but gives an idea what I want to do:

    create table t (a, b);

    select
      group_concat(b) as list
    from t
    group by a
    having ?1 in (list);

i.e. how to select only the groups that contain
some value in the set of values in a column not
specified in group by clause.

The only way I was able to do it is by subquery.
Something like this:

    select
      (select group_concat(b) from t t1 where t1.a = t2.a) as list
    from t t2
    where b = ?1;


--
John Found <[hidden email]>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Filtering groups by non-grouped field.

Clemens Ladisch
John Found wrote:
> i.e. how to select only the groups that contain
> some value in the set of values in a column not
> specified in group by clause.
>
>     select
>       (select group_concat(b) from t t1 where t1.a = t2.a) as list
>     from t t2
>     where b = ?1;

Similarly:

  select
    group_concat(b) as list
  from t
  where a in (select a
              from t
              where b = ?1)
  group by a;

But you will not be able to avoid the subquery: the filter has to
include all other rows of the group with matching b, and after the
group_concat(), the result is no longer in a form useful for lookups.
(And doing the filter after the grouping is probably not efficient.)


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Filtering groups by non-grouped field.

John Found
On Fri, 12 Oct 2018 08:27:10 +0200
Clemens Ladisch <[hidden email]> wrote:

> John Found wrote:
> > i.e. how to select only the groups that contain
> > some value in the set of values in a column not
> > specified in group by clause.
> >
> >     select
> >       (select group_concat(b) from t t1 where t1.a = t2.a) as list
> >     from t t2
> >     where b = ?1;
>
> Similarly:
>
>   select
>     group_concat(b) as list
>   from t
>   where a in (select a
>               from t
>               where b = ?1)
>   group by a;
>
> But you will not be able to avoid the subquery: the filter has to
> include all other rows of the group with matching b, and after the
> group_concat(), the result is no longer in a form useful for lookups.
> (And doing the filter after the grouping is probably not efficient.)

Hm, is sounds strange because when HAVING clause is processed,
the aggregate functions should not be processed yet (for a performance
reasons) i.e. the query still has access to all values from the field b
and theoretically should be able to search these values the same way
it searches them on executing min() or max() aggregate functions.

>
>
> Regards,
> Clemens
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


--
John Found <[hidden email]>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Filtering groups by non-grouped field.

Keith Medcalf

On Friday, 12 October, 2018 01:02, John Found <[hidden email]> wrote:

>Hm, is sounds strange because when HAVING clause is processed,
>the aggregate functions should not be processed yet (for a
>performance reasons) i.e. the query still has access to all
>values from the field
>b
>and theoretically should be able to search these values the same way
>it searches them on executing min() or max() aggregate functions.

The WHERE clause filters the table rows going into the GROUP-BY and the HAVING filters results coming out of the GROUP-BY processing, which is why the HAVING clause may only refer to attributes of the grouping ... which is then passed to the ORDER-BY sorter to determine the presentement order.

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.






_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Filtering groups by non-grouped field.

Clemens Ladisch
In reply to this post by John Found
John Found wrote:

> Clemens Ladisch <[hidden email]> wrote:
>>   select
>>     group_concat(b) as list
>>   from t
>>   where a in (select a
>>               from t
>>               where b = ?1)
>>   group by a;
>>
>> But you will not be able to avoid the subquery: the filter has to
>> include all other rows of the group with matching b, and after the
>> group_concat(), the result is no longer in a form useful for lookups.
>
> Hm, is sounds strange because when HAVING clause is processed,
> the aggregate functions should not be processed yet (for a performance
> reasons) i.e. the query still has access to all values from the field b
> and theoretically should be able to search these values the same way
> it searches them on executing min() or max() aggregate functions.

Yes, the HAVING clause can run any aggregate function.  But you need to
find some aggregate function that can do the filtering.  PostgreSQL
would have "HAVING array_position(array_agg(b), $1)", but I have not
been able to get SQLite's json_group_array() to work with this.

And "',' || group_concat(b) || ',' LIKE '%,' || ?1 || ',%'" would be
horrible.


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Filtering groups by non-grouped field.

Clemens Ladisch
I wrote:
> But you need to find some aggregate function that can do the filtering.

HAVING SUM(b = ?1)

(In a database with proper booleans: "HAVING SUM(CASE WHEN b = ? THEN 1 END) > 0")


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Filtering groups by non-grouped field.

E.Pasma
In reply to this post by John Found

> John Found <[hidden email]>:
>
> The following code does not work, but gives an idea what I want to do:
>
>    create table t (a, b);
>
>    select
>      group_concat(b) as list
>    from t
>    group by a
>    having ?1 in (list);


>
> Clemens Ladisch <[hidden email]> :
>
>  select
>    group_concat(b) as list
>  from t
>  where a in (select a
>              from t
>              where b = ?1)
>  group by a;


I think the original query is the best, here the having clause is be written as valid sql.

select group_concat(b) as list
from t
group by a
having count(b=?1)
;


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Filtering groups by non-grouped field.

Clemens Ladisch
E.Pasma wrote:
> select group_concat(b) as list
> from t
> group by a
> having count(b=?1)
> ;

In SQLite, a boolean expression returns 0 when false, and count(0) is 1.
You have to generate a NULL for failed matches, or use another function
like sum() or max() that can filter out zeros.


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Filtering groups by non-grouped field.

E.Pasma

> Op 12 okt. 2018, om 11:23 heeft Clemens Ladisch <[hidden email]> het volgende geschreven:
>
> E.Pasma wrote:
>> select group_concat(b) as list
>> from t
>> group by a
>> having count(b=?1)
>> ;
>
> In SQLite, a boolean expression returns 0 when false, and count(0) is 1.
> You have to generate a NULL for failed matches, or use another function
> like sum() or max() that can filter out zeros.
>
Ok, I withdraw my suggestion.
Yours:

 select
   group_concat(b) as list
 from t
 where a in (select a
             from t
             where b = ?1)
 group by a;

looks more meanibgful then.
Also, with large amounts of data, and if column b is indexed, it likely becomes more efficient.

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Filtering groups by non-grouped field.

John Found
In reply to this post by Clemens Ladisch
On Fri, 12 Oct 2018 11:12:17 +0200
Clemens Ladisch <[hidden email]> wrote:

> I wrote:
> > But you need to find some aggregate function that can do the filtering.
>
> HAVING SUM(b = ?1)
>
> (In a database with proper booleans: "HAVING SUM(CASE WHEN b = ? THEN 1 END) > 0")

Hey, this really looks great. Thanks. Also, it seems max(b = ?1) will do the trick as well as count(b = ?1)

And here another question appears. What is more efficient?
At first glance, max() looks better, because it does not need to scan all values from the group.
But is it the case in SQLite implementation?

>
>
> Regards,
> Clemens
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


--
John Found <[hidden email]>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Filtering groups by non-grouped field.

R Smith-2
In reply to this post by John Found
On 2018/10/12 7:38 AM, John Found wrote:

> The following code does not work, but gives an idea what I want to do:
>
>      create table t (a, b);
>
>      select
>        group_concat(b) as list
>      from t
>      group by a
>      having ?1 in (list);
>
> i.e. how to select only the groups that contain
> some value in the set of values in a column not
> specified in group by clause.

My favourite choice is always to try and state the query as close to
what you need it to do.

In English:
Give me all a's where there is a b value equal to ?1 among it's members
, and then group those lines together (by a value) and concatenate the b
values in the output.

(I hope I understood)

Said the same in SQL:

SELECT a, group_concat(b)
   FROM t AS t1
  WHERE EXISTS(SELECT 1 FROM t AS t2 WHERE t2.a=t1.a AND t2.b = ?1)
  GROUP BY a


Now if you were trying to gain absolute fastest speed, the best would be
to first see which a's has a b value of ?1, then simply join to that
list. Two ways of doing that:

Subquery:
SELECT a, group_concat(b)
   FROM (SELECT DISTINCT a FROM t WHERE b=?1) AS X
   JOIN t ON t.a = X.a
  GROUP BY a

CTE:
WITH X(a) AS (SELECT DISTINCT a FROM t WHERE b=?1)
SELECT a, group_concat(b)
   FROM X
JOIN t ON t.a = X.a
  GROUP BY a

-- Another option... --

SELECT a, group_concat(b)
   FROM t
  WHERE t.a IN (SELECT DISTINCT a FROM t WHERE b=?1)
  GROUP BY a


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Filtering groups by non-grouped field.

Clemens Ladisch
In reply to this post by John Found
John Found wrote:
> Also, it seems max(b = ?1) will do the trick as well as count(b = ?1)
>
> And here another question appears. What is more efficient?

In SQLite, both are equally efficient.

Use whatever makes the query easier to understand.


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Filtering groups by non-grouped field.

E.Pasma
Clemens Ladisch wrote:
>
> John Found wrote:
>> Also, it seems max(b = ?1) will do the trick as well as count(b = ?1)
>>
>> And here another question appears. What is more efficient?
>
> In SQLite, both are equally efficient.
>
> Use whatever makes the query easier to understand.
Clemens, I just learned from you that count(b=?1) is not an option as it also counts zero or false. Thanks, E. Pasma
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users