Find key,value duplicates but with differing values

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Find key,value duplicates but with differing values

Dominique Devienne
I can find duplicates fine:

select xmd.partId, parts.title, xmd.name,
       count(*) "#dupplicates",
       group_concat(xmd.value) "values",
       group_concat(xmd.idx) "indexes"
  from extra_meta_data xmd
  join parts on parts.id = xmd.partId
  group by partId, name
 having "#dupplicates" > 1;

but most actual duplicates have the same value, so are harmless.
so I'd like to select only the xmd.name rows which have differing values.

Knowing that there can be more than 2 duplicates, i.e. a slef-join is not
enough I think.
Any hints on how to go about this problem please? Thanks, --DD
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] Find key, value duplicates but with differing values

Hick Gunter
Two nested selects
The inner select groups by partId, name, value
The outer select groups by partId, name

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:[hidden email]] Im Auftrag von Dominique Devienne
Gesendet: Donnerstag, 11. Oktober 2018 18:00
An: General Discussion of SQLite Database <[hidden email]>
Betreff: [EXTERNAL] [sqlite] Find key,value duplicates but with differing values

I can find duplicates fine:

select xmd.partId, parts.title, xmd.name,
       count(*) "#dupplicates",
       group_concat(xmd.value) "values",
       group_concat(xmd.idx) "indexes"
  from extra_meta_data xmd
  join parts on parts.id = xmd.partId
  group by partId, name
 having "#dupplicates" > 1;

but most actual duplicates have the same value, so are harmless.
so I'd like to select only the xmd.name rows which have differing values.

Knowing that there can be more than 2 duplicates, i.e. a slef-join is not enough I think.
Any hints on how to go about this problem please? Thanks, --DD _______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick | Software Engineer | Scientific Games International GmbH | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Find key,value duplicates but with differing values

David Raymond
In reply to this post by Dominique Devienne
Maybe

...
group by partId, name
having count(distinct xmd.value) > 1;

?


-----Original Message-----
From: sqlite-users [mailto:[hidden email]] On Behalf Of Dominique Devienne
Sent: Thursday, October 11, 2018 12:00 PM
To: General Discussion of SQLite Database
Subject: [sqlite] Find key,value duplicates but with differing values

I can find duplicates fine:

select xmd.partId, parts.title, xmd.name,
       count(*) "#dupplicates",
       group_concat(xmd.value) "values",
       group_concat(xmd.idx) "indexes"
  from extra_meta_data xmd
  join parts on parts.id = xmd.partId
  group by partId, name
 having "#dupplicates" > 1;

but most actual duplicates have the same value, so are harmless.
so I'd like to select only the xmd.name rows which have differing values.

Knowing that there can be more than 2 duplicates, i.e. a slef-join is not
enough I think.
Any hints on how to go about this problem please? Thanks, --DD
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Find key,value duplicates but with differing values

R Smith-2
In reply to this post by Dominique Devienne
On 2018/10/11 5:59 PM, Dominique Devienne wrote:

> I can find duplicates fine:
>
> select xmd.partId, parts.title, xmd.name,
>         count(*) "#dupplicates",
>         group_concat(xmd.value) "values",
>         group_concat(xmd.idx) "indexes"
>    from extra_meta_data xmd
>    join parts on parts.id = xmd.partId
>    group by partId, name
>   having "#dupplicates" > 1;
>
> but most actual duplicates have the same value, so are harmless.
> so I'd like to select only the xmd.name rows which have differing values.
>
> Knowing that there can be more than 2 duplicates, i.e. a slef-join is not
> enough I think.
> Any hints on how to go about this problem please? Thanks, --DD

2 Ways to solve this that comes to mind initially...

Option 1 - Self join on the Index field and dissimilar second field, and
group,
Option 2 - Use the fancy new Window functions to distill the real
duplicates from the don't-matter duplicates and then count the
non-distinct items.
These two methods shown below:


   -- SQLite version 3.25.1  [ Release: 2018-09-18 ]  on SQLitespeed
version 2.1.1.37.
   --
================================================================================================

CREATE TABLE t(a,b);

INSERT INTO t(a,b) VALUES
(1,300),
(1,100),  -- This is a real duplicate
(1,300),
(1,300),
(2,500),
(2,500),  -- This duplicate does not matter
(3,400),
(3,500),  -- This is again a real duplicate
(3,400)
;
   -- This means we need to be told about Index 1 and 3 which contain
real duplicates.


-- Option 1:
SELECT DISTINCT t1.a FROM t AS t1 JOIN t AS t2 ON t2.a=t1.a AND t2.b<>t1.b
;

   --       a
   -- ------------
   --       1
   --       3

-- Option 2:
SELECT a FROM (SELECT DISTINCT a,MAX(b) OVER (PARTITION BY a,b) FROM t)
GROUP BY a HAVING COUNT(*)>1
;

   --       a
   -- ------------
   --       1
   --       3


HTH,
Ryan

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Find key,value duplicates but with differing values

Roman Fleysher
In reply to this post by David Raymond
It is hard for me to tell which is index, which is value and so forth in your example, but how about this single select:

SELECT DISTINCT key, value FROM theTable;

This lists all distinct key-value possibilities. Or,

SELECT key, value FROM (SELECT DISTINCT key, value FROM theTable)
GROUP BY key, value HAVING count() > 1;

This lists all key-value pairs with more than one value for the key.

Roman

________________________________________
From: sqlite-users [[hidden email]] on behalf of David Raymond [[hidden email]]
Sent: Thursday, October 11, 2018 12:23 PM
To: SQLite mailing list
Subject: Re: [sqlite] Find key,value duplicates but with differing values

Maybe

...
group by partId, name
having count(distinct xmd.value) > 1;

?


-----Original Message-----
From: sqlite-users [mailto:[hidden email]] On Behalf Of Dominique Devienne
Sent: Thursday, October 11, 2018 12:00 PM
To: General Discussion of SQLite Database
Subject: [sqlite] Find key,value duplicates but with differing values

I can find duplicates fine:

select xmd.partId, parts.title, xmd.name,
       count(*) "#dupplicates",
       group_concat(xmd.value) "values",
       group_concat(xmd.idx) "indexes"
  from extra_meta_data xmd
  join parts on parts.id = xmd.partId
  group by partId, name
 having "#dupplicates" > 1;

but most actual duplicates have the same value, so are harmless.
so I'd like to select only the xmd.name rows which have differing values.

Knowing that there can be more than 2 duplicates, i.e. a slef-join is not
enough I think.
Any hints on how to go about this problem please? Thanks, --DD
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Find key,value duplicates but with differing values

R Smith-2
On 2018/10/11 9:53 PM, Roman Fleysher wrote:

> It is hard for me to tell which is index, which is value and so forth in your example, but how about this single select:
>
> SELECT DISTINCT key, value FROM theTable;
>
> This lists all distinct key-value possibilities. Or,
>
> SELECT key, value FROM (SELECT DISTINCT key, value FROM theTable)
> GROUP BY key, value HAVING count() > 1;
>
> This lists all key-value pairs with more than one value for the key.

If I may - this won't work directly as-is since the query:

SELECT key, value FROM (SELECT DISTINCT key, value FROM theTable)
GROUP BY key, value HAVING count() > 1;

will group by Key,Value and then output which Key-value pairs exhibit
more than one instance (count) - which is physically impossible since
counting the duplicate values from a distinct set is like asking the
number of genders among American presidents before 2018....

The answer is always 1.


If, in the outer query, you Select for (and group by) Key only, then it
works.


>
> Roman
>
> ________________________________________
> From: sqlite-users [[hidden email]] on behalf of David Raymond [[hidden email]]
> Sent: Thursday, October 11, 2018 12:23 PM
> To: SQLite mailing list
> Subject: Re: [sqlite] Find key,value duplicates but with differing values
>
> Maybe
>
> ...
> group by partId, name
> having count(distinct xmd.value) > 1;
>
> ?
>
>
> -----Original Message-----
> From: sqlite-users [mailto:[hidden email]] On Behalf Of Dominique Devienne
> Sent: Thursday, October 11, 2018 12:00 PM
> To: General Discussion of SQLite Database
> Subject: [sqlite] Find key,value duplicates but with differing values
>
> I can find duplicates fine:
>
> select xmd.partId, parts.title, xmd.name,
>         count(*) "#dupplicates",
>         group_concat(xmd.value) "values",
>         group_concat(xmd.idx) "indexes"
>    from extra_meta_data xmd
>    join parts on parts.id = xmd.partId
>    group by partId, name
>   having "#dupplicates" > 1;
>
> but most actual duplicates have the same value, so are harmless.
> so I'd like to select only the xmd.name rows which have differing values.
>
> Knowing that there can be more than 2 duplicates, i.e. a slef-join is not
> enough I think.
> Any hints on how to go about this problem please? Thanks, --DD
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Find key,value duplicates but with differing values

Roman Fleysher
 You are correct. Value should/ could be inside count(), but not in group.




Sent from my T-Mobile 4G LTE Device


-------- Original message --------
From: R Smith <[hidden email]>
Date: 10/11/18 4:29 PM (GMT-05:00)
To: [hidden email]
Subject: Re: [sqlite] Find key,value duplicates but with differing values

On 2018/10/11 9:53 PM, Roman Fleysher wrote:

> It is hard for me to tell which is index, which is value and so forth in your example, but how about this single select:
>
> SELECT DISTINCT key, value FROM theTable;
>
> This lists all distinct key-value possibilities. Or,
>
> SELECT key, value FROM (SELECT DISTINCT key, value FROM theTable)
> GROUP BY key, value HAVING count() > 1;
>
> This lists all key-value pairs with more than one value for the key.

If I may - this won't work directly as-is since the query:

SELECT key, value FROM (SELECT DISTINCT key, value FROM theTable)
GROUP BY key, value HAVING count() > 1;

will group by Key,Value and then output which Key-value pairs exhibit
more than one instance (count) - which is physically impossible since
counting the duplicate values from a distinct set is like asking the
number of genders among American presidents before 2018....

The answer is always 1.


If, in the outer query, you Select for (and group by) Key only, then it
works.


>
> Roman
>
> ________________________________________
> From: sqlite-users [[hidden email]] on behalf of David Raymond [[hidden email]]
> Sent: Thursday, October 11, 2018 12:23 PM
> To: SQLite mailing list
> Subject: Re: [sqlite] Find key,value duplicates but with differing values
>
> Maybe
>
> ...
> group by partId, name
> having count(distinct xmd.value) > 1;
>
> ?
>
>
> -----Original Message-----
> From: sqlite-users [mailto:[hidden email]] On Behalf Of Dominique Devienne
> Sent: Thursday, October 11, 2018 12:00 PM
> To: General Discussion of SQLite Database
> Subject: [sqlite] Find key,value duplicates but with differing values
>
> I can find duplicates fine:
>
> select xmd.partId, parts.title, xmd.name,
>         count(*) "#dupplicates",
>         group_concat(xmd.value) "values",
>         group_concat(xmd.idx) "indexes"
>    from extra_meta_data xmd
>    join parts on parts.id = xmd.partId
>    group by partId, name
>   having "#dupplicates" > 1;
>
> but most actual duplicates have the same value, so are harmless.
> so I'd like to select only the xmd.name rows which have differing values.
>
> Knowing that there can be more than 2 duplicates, i.e. a slef-join is not
> enough I think.
> Any hints on how to go about this problem please? Thanks, --DD
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] Find key, value duplicates but with differing values

Dominique Devienne
In reply to this post by Hick Gunter
On Thu, Oct 11, 2018 at 6:21 PM Hick Gunter <[hidden email]> wrote:

> Two nested selects
> The inner select groups by partId, name, value
> The outer select groups by partId, name
>

Thank you who replied, Gunter, Ryan, Roman, David.
This was simpler than I thought. I should have reflected a bit more myself
:) --DD

create view qc$xdata_duplicate_names as

>
> select xdata.partId, parts.title, xdata.name,
>
>        count(*) "#duplicates",
>
>        group_concat(xdata.value) "values",
>
>        group_concat(xdata.idx) "indexes"
>
>   from xdata
>
>   join parts on parts.id = xdata.partId
>
>  group by partId, name
>
> having "#duplicates" > 1
>
>
>> create view qc$xdata_dups_diff_value as
>>
> select xdata.partId, parts.title, xdata.name,
>
>        count(*) "#duplicates",
>
>        group_concat(xdata.value) "values",
>
>        group_concat(xdata.idx) "indexes"
>
>   from (
>
>          select *
>
>            from xdata
>
>           group by partId, name, value
>
>        ) xdata
>
>   join parts on parts.id = xdata.partId
>
>  group by xdata.partId, xdata.name
>
> having "#duplicates" > 1
>
>
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users