How to group this?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

How to group this?

Bart Smissaert
ID ENTRY_DATE TERM NUMERIC_VALUE ROWID
----------------------------------------------------------------
1308 15/Mar/2013 Systolic 127 701559
1308 15/Mar/2013 Diastolic 81 701568
1308 27/Jun/2013 Systolic 132 701562
1308 27/Jun/2013 Systolic 141 701563
1308 27/Jun/2013 Systolic 143 701564
1308 27/Jun/2013 Diastolic 82 701571
1308 27/Jun/2013 Diastolic 85 701572
1308 27/Jun/2013 Diastolic 94 701573
278975701 08/Mar/2018 Systolic 136 1583551
278975701 08/Mar/2018 Diastolic 99 1583591
278975701 04/Apr/2018 Systolic 119 1583552
278975701 04/Apr/2018 Systolic 124 1583553
278975701 04/Apr/2018 Systolic 130 1583554
278975701 04/Apr/2018 Diastolic 74 1583592
278975701 04/Apr/2018 Diastolic 75 1583593
278975701 04/Apr/2018 Diastolic 85 1583594

These are systolic and diastolic blood pressures for 2 people with the ID's
1308 and 278975701, ordered by ID asc, ENTRY_DATE asc, ROWID asc.
Systolic and diastolic values are a pair and should be grouped in one row.
This is no problem if there is only one pair for one date, but sometimes
there multiple pairs per date.
The pairing should be based on the rowed if there are multiple pairs by
date, so for ID 1308
I should get:

127/81
132/82
141/85
143/94

What should be the SQL to group like this?

RBS
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: How to group this?

nomad
On Mon Feb 10, 2020 at 12:24:33AM +0000, Bart Smissaert wrote:
> I should get:
>
> 127/81
> 132/82
> 141/85
> 143/94
>
> What should be the SQL to group like this?

Here is one way that appears to generate the correct result.

    CREATE TABLE pressure(
        id INTEGER PRIMARY KEY,
        userid INTEGER NOT NULL,
        date TEXT NOT NULL,
        term text NOT NULL,
        reading INTEGER NOT NULL
    );

    INSERT INTO
        pressure(userid,date,term,reading,id)
    VALUES
        (1308,'15/Mar/2013','Systolic',127,701559),
        (1308,'15/Mar/2013','Diastolic',81,701568),
        (1308,'27/Jun/2013','Systolic',132,701562),
        (1308,'27/Jun/2013','Systolic',141,701563),
        (1308,'27/Jun/2013','Systolic',143,701564),
        (1308,'27/Jun/2013','Diastolic',82,701571),
        (1308,'27/Jun/2013','Diastolic',85,701572),
        (1308,'27/Jun/2013','Diastolic',94,701573),
        (278975701,'08/Mar/2018','Systolic',136,1583551),
        (278975701,'08/Mar/2018','Diastolic',99,1583591),
        (278975701,'04/Apr/2018','Systolic',119,1583552),
        (278975701,'04/Apr/2018','Systolic',124,1583553),
        (278975701,'04/Apr/2018','Systolic',130,1583554),
        (278975701,'04/Apr/2018','Diastolic',74,1583592),
        (278975701,'04/Apr/2018','Diastolic',75,1583593),
        (278975701,'04/Apr/2018','Diastolic',85,1583594);

    WITH
        systolic AS (
            SELECT
                row_number() OVER (order by id) AS row,
                *
            FROM
                pressure
            WHERE
                term='Systolic'
        ),
        diastolic AS (
            SELECT
                row_number() OVER (order by id) AS row,
                *
            FROM
                pressure
            WHERE
                term='Diastolic'
        )
    SELECT
        d.userid AS User,
        d.date AS Date,
        s.reading || '/' || d.reading AS Reading
    FROM
        systolic s
    LEFT JOIN
        diastolic d
    ON
        d.row=s.row
    ORDER BY
        d.userid,
        d.date,
        d.id
    ;

I'm sure others will produce more elegant solutions - I was just using
your question as an exercise to learn about window functions. The query
plan of the above looks a little scary though:

     id   parent  notused  detail
     2    0       0        CO-ROUTINE 1
     9    2       0        CO-ROUTINE 4
     12   9       0        SCAN TABLE pressure
     26   2       0        SCAN SUBQUERY 4
     72   0       0        MATERIALIZE 2
     79   72      0        CO-ROUTINE 5
     82   79      0        SCAN TABLE pressure
     97   72      0        SCAN SUBQUERY 5
     144  0       0        SCAN SUBQUERY 1 AS s
     159  0       0        SEARCH SUBQUERY 2 AS d USING AUTOMATIC COVERING INDEX
                           (row=?)
     180  0       0        USE TEMP B-TREE FOR ORDER BY

Presumably if you index the date/userid and filter the CTE tables it
gets a little better.

--
Mark Lawrence
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: How to group this?

Simon Slavin-3
On 10 Feb 2020, at 1:25am, [hidden email] wrote:

> Here is one way that appears to generate the correct result.

Another way: create two VIEWs, one for systolic, one for diasystolic.  Index both VIEWs on (id, date), then JOIN ON id AND date.  If you want to, you could use this to make a third VIEW which would have the systolic and diasystolic figures on the same row.

If you are going to do more than just this on that data convert the dates to YYYYMMDD form or to an integer day number to make searching and sorting easier.  If you're not confident with programming you could do this in SQLite using CASE for the month name.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: How to group this?

Keith Medcalf
In reply to this post by Bart Smissaert
  select id,
         entry_date,
         max(case when term == 'Systolic' then reading end) as Systolic,
         max(case when term == 'Diastolic' then reading end) as Diastolic
    from the_table
group by id, entry_date
;

If you want to make sure you have both terms for a given id/entry_date then you need to make it a subquery so you can check for null.

select id,
       entry_date,
       systolic || '/' || diastolic as pressure
  from (
          select id,
                 entry_date,
                 max(case when term == 'Systolic' then reading end) as Systolic,
                 max(case when term == 'Diastolic' then reading end) as Diastolic
            from the_table
        group by id, entry_date
       )
  where systolic is not null
    and diastolic is not null
;

Note that (id, entry_date, term) needs to be unique ... or you will just get the combination of the max(systolic)/max(diastolic) for the results for a given id/entry_date.

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.

>-----Original Message-----
>From: sqlite-users <[hidden email]> On
>Behalf Of Bart Smissaert
>Sent: Sunday, 9 February, 2020 17:25
>To: General Discussion of SQLite Database <sqlite-
>[hidden email]>
>Subject: [sqlite] How to group this?
>
>ID ENTRY_DATE TERM NUMERIC_VALUE ROWID
>----------------------------------------------------------------
>1308 15/Mar/2013 Systolic 127 701559
>1308 15/Mar/2013 Diastolic 81 701568
>1308 27/Jun/2013 Systolic 132 701562
>1308 27/Jun/2013 Systolic 141 701563
>1308 27/Jun/2013 Systolic 143 701564
>1308 27/Jun/2013 Diastolic 82 701571
>1308 27/Jun/2013 Diastolic 85 701572
>1308 27/Jun/2013 Diastolic 94 701573
>278975701 08/Mar/2018 Systolic 136 1583551
>278975701 08/Mar/2018 Diastolic 99 1583591
>278975701 04/Apr/2018 Systolic 119 1583552
>278975701 04/Apr/2018 Systolic 124 1583553
>278975701 04/Apr/2018 Systolic 130 1583554
>278975701 04/Apr/2018 Diastolic 74 1583592
>278975701 04/Apr/2018 Diastolic 75 1583593
>278975701 04/Apr/2018 Diastolic 85 1583594
>
>These are systolic and diastolic blood pressures for 2 people with the
>ID's
>1308 and 278975701, ordered by ID asc, ENTRY_DATE asc, ROWID asc.
>Systolic and diastolic values are a pair and should be grouped in one
>row.
>This is no problem if there is only one pair for one date, but sometimes
>there multiple pairs per date.
>The pairing should be based on the rowed if there are multiple pairs by
>date, so for ID 1308
>I should get:
>
>127/81
>132/82
>141/85
>143/94
>
>What should be the SQL to group like this?
>
>RBS
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: How to group this?

Keith Medcalf
  select id,
         entry_date,
         max(case when term == 'Systolic' then reading end) as Systolic,
         max(case when term == 'Diastolic' then reading end) as Diastolic
    from the_table
group by id, entry_date
;

should be

  select id,
         entry_date,
         max(case when term == 'Systolic' then reading end)
         || '/' ||
         max(case when term == 'Diastolic' then reading end) as Pressure
    from the_table
group by id, entry_date
;

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.

>-----Original Message-----
>From: sqlite-users <[hidden email]> On
>Behalf Of Keith Medcalf
>Sent: Sunday, 9 February, 2020 19:17
>To: SQLite mailing list <[hidden email]>
>Subject: Re: [sqlite] How to group this?
>
>  select id,
>         entry_date,
>         max(case when term == 'Systolic' then reading end) as Systolic,
>         max(case when term == 'Diastolic' then reading end) as Diastolic
>    from the_table
>group by id, entry_date
>;
>
>If you want to make sure you have both terms for a given id/entry_date
>then you need to make it a subquery so you can check for null.
>
>select id,
>       entry_date,
>       systolic || '/' || diastolic as pressure
>  from (
>          select id,
>                 entry_date,
>                 max(case when term == 'Systolic' then reading end) as
>Systolic,
>                 max(case when term == 'Diastolic' then reading end) as
>Diastolic
>            from the_table
>        group by id, entry_date
>       )
>  where systolic is not null
>    and diastolic is not null
>;
>
>Note that (id, entry_date, term) needs to be unique ... or you will just
>get the combination of the max(systolic)/max(diastolic) for the results
>for a given id/entry_date.
>
>--
>The fact that there's a Highway to Hell but only a Stairway to Heaven
>says a lot about anticipated traffic volume.
>
>>-----Original Message-----
>>From: sqlite-users <[hidden email]> On
>>Behalf Of Bart Smissaert
>>Sent: Sunday, 9 February, 2020 17:25
>>To: General Discussion of SQLite Database <sqlite-
>>[hidden email]>
>>Subject: [sqlite] How to group this?
>>
>>ID ENTRY_DATE TERM NUMERIC_VALUE ROWID
>>----------------------------------------------------------------
>>1308 15/Mar/2013 Systolic 127 701559
>>1308 15/Mar/2013 Diastolic 81 701568
>>1308 27/Jun/2013 Systolic 132 701562
>>1308 27/Jun/2013 Systolic 141 701563
>>1308 27/Jun/2013 Systolic 143 701564
>>1308 27/Jun/2013 Diastolic 82 701571
>>1308 27/Jun/2013 Diastolic 85 701572
>>1308 27/Jun/2013 Diastolic 94 701573
>>278975701 08/Mar/2018 Systolic 136 1583551
>>278975701 08/Mar/2018 Diastolic 99 1583591
>>278975701 04/Apr/2018 Systolic 119 1583552
>>278975701 04/Apr/2018 Systolic 124 1583553
>>278975701 04/Apr/2018 Systolic 130 1583554
>>278975701 04/Apr/2018 Diastolic 74 1583592
>>278975701 04/Apr/2018 Diastolic 75 1583593
>>278975701 04/Apr/2018 Diastolic 85 1583594
>>
>>These are systolic and diastolic blood pressures for 2 people with the
>>ID's
>>1308 and 278975701, ordered by ID asc, ENTRY_DATE asc, ROWID asc.
>>Systolic and diastolic values are a pair and should be grouped in one
>>row.
>>This is no problem if there is only one pair for one date, but sometimes
>>there multiple pairs per date.
>>The pairing should be based on the rowed if there are multiple pairs by
>>date, so for ID 1308
>>I should get:
>>
>>127/81
>>132/82
>>141/85
>>143/94
>>
>>What should be the SQL to group like this?
>>
>>RBS
>>_______________________________________________
>>sqlite-users mailing list
>>[hidden email]
>>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
>
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: How to group this?

Keith Medcalf
In reply to this post by nomad

That's good, but this not screw up later userid/date if an entry is AWOL.

    WITH systolic
      AS (
          select userid,
                 date,
                 rank() over (partition by userid, date order by id) as rank,
                 reading
            from pressure
           where term == 'Systolic'
         ),
         diastolic
      AS (
          select userid,
                 date,
                 rank() over (partition by userid, date order by id) as rank,
                 reading
            from pressure
           where term == 'Diastolic'
         )
    SELECT d.userid,
           d.date,
           s.reading || '/' || d.reading AS Reading
      FROM systolic s
      JOIN diastolic d
        ON d.userid == s.userid
       AND d.date == s.date
       AND d.rank == s.rank
    ;



--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.

>-----Original Message-----
>From: sqlite-users <[hidden email]> On
>Behalf Of [hidden email]
>Sent: Sunday, 9 February, 2020 18:26
>To: SQLite mailing list <[hidden email]>
>Subject: Re: [sqlite] How to group this?
>
>On Mon Feb 10, 2020 at 12:24:33AM +0000, Bart Smissaert wrote:
>> I should get:
>>
>> 127/81
>> 132/82
>> 141/85
>> 143/94
>>
>> What should be the SQL to group like this?
>
>Here is one way that appears to generate the correct result.
>
>    CREATE TABLE pressure(
>        id INTEGER PRIMARY KEY,
>        userid INTEGER NOT NULL,
>        date TEXT NOT NULL,
>        term text NOT NULL,
>        reading INTEGER NOT NULL
>    );
>
>    INSERT INTO
>        pressure(userid,date,term,reading,id)
>    VALUES
>        (1308,'15/Mar/2013','Systolic',127,701559),
>        (1308,'15/Mar/2013','Diastolic',81,701568),
>        (1308,'27/Jun/2013','Systolic',132,701562),
>        (1308,'27/Jun/2013','Systolic',141,701563),
>        (1308,'27/Jun/2013','Systolic',143,701564),
>        (1308,'27/Jun/2013','Diastolic',82,701571),
>        (1308,'27/Jun/2013','Diastolic',85,701572),
>        (1308,'27/Jun/2013','Diastolic',94,701573),
>        (278975701,'08/Mar/2018','Systolic',136,1583551),
>        (278975701,'08/Mar/2018','Diastolic',99,1583591),
>        (278975701,'04/Apr/2018','Systolic',119,1583552),
>        (278975701,'04/Apr/2018','Systolic',124,1583553),
>        (278975701,'04/Apr/2018','Systolic',130,1583554),
>        (278975701,'04/Apr/2018','Diastolic',74,1583592),
>        (278975701,'04/Apr/2018','Diastolic',75,1583593),
>        (278975701,'04/Apr/2018','Diastolic',85,1583594);
>
>    WITH
>        systolic AS (
>            SELECT
>                row_number() OVER (order by id) AS row,
>                *
>            FROM
>                pressure
>            WHERE
>                term='Systolic'
>        ),
>        diastolic AS (
>            SELECT
>                row_number() OVER (order by id) AS row,
>                *
>            FROM
>                pressure
>            WHERE
>                term='Diastolic'
>        )
>    SELECT
>        d.userid AS User,
>        d.date AS Date,
>        s.reading || '/' || d.reading AS Reading
>    FROM
>        systolic s
>    LEFT JOIN
>        diastolic d
>    ON
>        d.row=s.row
>    ORDER BY
>        d.userid,
>        d.date,
>        d.id
>    ;
>
>I'm sure others will produce more elegant solutions - I was just using
>your question as an exercise to learn about window functions. The query
>plan of the above looks a little scary though:
>
>     id   parent  notused  detail
>     2    0       0        CO-ROUTINE 1
>     9    2       0        CO-ROUTINE 4
>     12   9       0        SCAN TABLE pressure
>     26   2       0        SCAN SUBQUERY 4
>     72   0       0        MATERIALIZE 2
>     79   72      0        CO-ROUTINE 5
>     82   79      0        SCAN TABLE pressure
>     97   72      0        SCAN SUBQUERY 5
>     144  0       0        SCAN SUBQUERY 1 AS s
>     159  0       0        SEARCH SUBQUERY 2 AS d USING AUTOMATIC
>COVERING INDEX
>                           (row=?)
>     180  0       0        USE TEMP B-TREE FOR ORDER BY
>
>Presumably if you index the date/userid and filter the CTE tables it
>gets a little better.
>
>--
>Mark Lawrence
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: How to group this?

Richard Damon
In reply to this post by Bart Smissaert
On 2/9/20 7:24 PM, Bart Smissaert wrote:

> ID ENTRY_DATE TERM NUMERIC_VALUE ROWID
> ----------------------------------------------------------------
> 1308 15/Mar/2013 Systolic 127 701559
> 1308 15/Mar/2013 Diastolic 81 701568
> 1308 27/Jun/2013 Systolic 132 701562
> 1308 27/Jun/2013 Systolic 141 701563
> 1308 27/Jun/2013 Systolic 143 701564
> 1308 27/Jun/2013 Diastolic 82 701571
> 1308 27/Jun/2013 Diastolic 85 701572
> 1308 27/Jun/2013 Diastolic 94 701573
> 278975701 08/Mar/2018 Systolic 136 1583551
> 278975701 08/Mar/2018 Diastolic 99 1583591
> 278975701 04/Apr/2018 Systolic 119 1583552
> 278975701 04/Apr/2018 Systolic 124 1583553
> 278975701 04/Apr/2018 Systolic 130 1583554
> 278975701 04/Apr/2018 Diastolic 74 1583592
> 278975701 04/Apr/2018 Diastolic 75 1583593
> 278975701 04/Apr/2018 Diastolic 85 1583594
>
> These are systolic and diastolic blood pressures for 2 people with the ID's
> 1308 and 278975701, ordered by ID asc, ENTRY_DATE asc, ROWID asc.
> Systolic and diastolic values are a pair and should be grouped in one row.
> This is no problem if there is only one pair for one date, but sometimes
> there multiple pairs per date.
> The pairing should be based on the rowed if there are multiple pairs by
> date, so for ID 1308
> I should get:
>
> 127/81
> 132/82
> 141/85
> 143/94
>
> What should be the SQL to group like this?
>
> RBS

To be honest, I think the problem is fundamentally badly designed. You
say pair the two readings by ROWID, but they of course don't have the
same ROWID, but you seem to be saying to pair them sorted by ROWID (1st
to 1st, 2nd to 2nd, etc). The fundamental problem is what if there isn't
the same number of each? You may say that you know that there will
always be the same number, but there is no constraint that forces this,
so any general program is going to have to deal with the possibility
(and at least throw out an error when it sees that).


--
Richard Damon

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: How to group this?

Rowan Worth-2
On Mon, 10 Feb 2020 at 11:12, Richard Damon <[hidden email]>
wrote:

> On 2/9/20 7:24 PM, Bart Smissaert wrote:
> > ID ENTRY_DATE TERM NUMERIC_VALUE ROWID
> > ----------------------------------------------------------------
> > 1308 15/Mar/2013 Systolic 127 701559
> > 1308 15/Mar/2013 Diastolic 81 701568
> > 1308 27/Jun/2013 Systolic 132 701562
> > 1308 27/Jun/2013 Systolic 141 701563
> > 1308 27/Jun/2013 Systolic 143 701564
> > 1308 27/Jun/2013 Diastolic 82 701571
> > 1308 27/Jun/2013 Diastolic 85 701572
> > 1308 27/Jun/2013 Diastolic 94 701573
> > 278975701 08/Mar/2018 Systolic 136 1583551
> > 278975701 08/Mar/2018 Diastolic 99 1583591
> > 278975701 04/Apr/2018 Systolic 119 1583552
> > 278975701 04/Apr/2018 Systolic 124 1583553
> > 278975701 04/Apr/2018 Systolic 130 1583554
> > 278975701 04/Apr/2018 Diastolic 74 1583592
> > 278975701 04/Apr/2018 Diastolic 75 1583593
> > 278975701 04/Apr/2018 Diastolic 85 1583594
> >
> > These are systolic and diastolic blood pressures for 2 people with the
> ID's
> > 1308 and 278975701, ordered by ID asc, ENTRY_DATE asc, ROWID asc.
> > Systolic and diastolic values are a pair and should be grouped in one
> row.
> > This is no problem if there is only one pair for one date, but sometimes
> > there multiple pairs per date.
> > The pairing should be based on the rowed if there are multiple pairs by
> > date, so for ID 1308
> > I should get:
> >
> > 127/81
> > 132/82
> > 141/85
> > 143/94
> >
> > What should be the SQL to group like this?
> >
> > RBS
>
> To be honest, I think the problem is fundamentally badly designed. You
> say pair the two readings by ROWID, but they of course don't have the
> same ROWID, but you seem to be saying to pair them sorted by ROWID (1st
> to 1st, 2nd to 2nd, etc). The fundamental problem is what if there isn't
> the same number of each? You may say that you know that there will
> always be the same number, but there is no constraint that forces this,
> so any general program is going to have to deal with the possibility
> (and at least throw out an error when it sees that).
>

Yeah, it would have been easier to group the readings at write time - eg.
via another column storing the time of day or the "nth reading of the day".
You could still add the latter, post-hoc.

Note that ROWID is not persistent -- see quirk #6 here:
https://www.sqlite.org/rowidtable.html

I would expect that VACUUM's renumbering happens to maintain the row order,
but I doubt it's guaranteed. If you do have an INTEGER PRIMARY KEY it's
better to refer to that directly.

-Rowan
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: How to group this?

Bart Smissaert
In reply to this post by Richard Damon
I fully agree with you, but I sofar I have no control over this data, I
have it like I showed.
As far as I can see there always will be a secondary value, but as you say
I can't be sure.
All this has to do with changing our clinical coding system from Read codes
to Snomed.
In the old setup there was the concept of a secondary value (systolic >>
diastolic), but it
seems in this particular case that is missing.
I get the data by running searches (not SQL) on a clinical database and I
have no control
over this database.
I will see if I can get better data with a different search, to do with
blood pressure values.

RBS

On Mon, Feb 10, 2020 at 3:12 AM Richard Damon <[hidden email]>
wrote:

> On 2/9/20 7:24 PM, Bart Smissaert wrote:
> > ID ENTRY_DATE TERM NUMERIC_VALUE ROWID
> > ----------------------------------------------------------------
> > 1308 15/Mar/2013 Systolic 127 701559
> > 1308 15/Mar/2013 Diastolic 81 701568
> > 1308 27/Jun/2013 Systolic 132 701562
> > 1308 27/Jun/2013 Systolic 141 701563
> > 1308 27/Jun/2013 Systolic 143 701564
> > 1308 27/Jun/2013 Diastolic 82 701571
> > 1308 27/Jun/2013 Diastolic 85 701572
> > 1308 27/Jun/2013 Diastolic 94 701573
> > 278975701 08/Mar/2018 Systolic 136 1583551
> > 278975701 08/Mar/2018 Diastolic 99 1583591
> > 278975701 04/Apr/2018 Systolic 119 1583552
> > 278975701 04/Apr/2018 Systolic 124 1583553
> > 278975701 04/Apr/2018 Systolic 130 1583554
> > 278975701 04/Apr/2018 Diastolic 74 1583592
> > 278975701 04/Apr/2018 Diastolic 75 1583593
> > 278975701 04/Apr/2018 Diastolic 85 1583594
> >
> > These are systolic and diastolic blood pressures for 2 people with the
> ID's
> > 1308 and 278975701, ordered by ID asc, ENTRY_DATE asc, ROWID asc.
> > Systolic and diastolic values are a pair and should be grouped in one
> row.
> > This is no problem if there is only one pair for one date, but sometimes
> > there multiple pairs per date.
> > The pairing should be based on the rowed if there are multiple pairs by
> > date, so for ID 1308
> > I should get:
> >
> > 127/81
> > 132/82
> > 141/85
> > 143/94
> >
> > What should be the SQL to group like this?
> >
> > RBS
>
> To be honest, I think the problem is fundamentally badly designed. You
> say pair the two readings by ROWID, but they of course don't have the
> same ROWID, but you seem to be saying to pair them sorted by ROWID (1st
> to 1st, 2nd to 2nd, etc). The fundamental problem is what if there isn't
> the same number of each? You may say that you know that there will
> always be the same number, but there is no constraint that forces this,
> so any general program is going to have to deal with the possibility
> (and at least throw out an error when it sees that).
>
>
> --
> Richard Damon
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: How to group this?

Richard Damon
In reply to this post by Rowan Worth-2
On 2/9/20 11:44 PM, Rowan Worth wrote:

> On Mon, 10 Feb 2020 at 11:12, Richard Damon <[hidden email]>
> wrote:
>
>> On 2/9/20 7:24 PM, Bart Smissaert wrote:
>>> ID ENTRY_DATE TERM NUMERIC_VALUE ROWID
>>> ----------------------------------------------------------------
>>> 1308 15/Mar/2013 Systolic 127 701559
>>> 1308 15/Mar/2013 Diastolic 81 701568
>>> 1308 27/Jun/2013 Systolic 132 701562
>>> 1308 27/Jun/2013 Systolic 141 701563
>>> 1308 27/Jun/2013 Systolic 143 701564
>>> 1308 27/Jun/2013 Diastolic 82 701571
>>> 1308 27/Jun/2013 Diastolic 85 701572
>>> 1308 27/Jun/2013 Diastolic 94 701573
>>> 278975701 08/Mar/2018 Systolic 136 1583551
>>> 278975701 08/Mar/2018 Diastolic 99 1583591
>>> 278975701 04/Apr/2018 Systolic 119 1583552
>>> 278975701 04/Apr/2018 Systolic 124 1583553
>>> 278975701 04/Apr/2018 Systolic 130 1583554
>>> 278975701 04/Apr/2018 Diastolic 74 1583592
>>> 278975701 04/Apr/2018 Diastolic 75 1583593
>>> 278975701 04/Apr/2018 Diastolic 85 1583594
>>>
>>> These are systolic and diastolic blood pressures for 2 people with the
>> ID's
>>> 1308 and 278975701, ordered by ID asc, ENTRY_DATE asc, ROWID asc.
>>> Systolic and diastolic values are a pair and should be grouped in one
>> row.
>>> This is no problem if there is only one pair for one date, but sometimes
>>> there multiple pairs per date.
>>> The pairing should be based on the rowed if there are multiple pairs by
>>> date, so for ID 1308
>>> I should get:
>>>
>>> 127/81
>>> 132/82
>>> 141/85
>>> 143/94
>>>
>>> What should be the SQL to group like this?
>>>
>>> RBS
>> To be honest, I think the problem is fundamentally badly designed. You
>> say pair the two readings by ROWID, but they of course don't have the
>> same ROWID, but you seem to be saying to pair them sorted by ROWID (1st
>> to 1st, 2nd to 2nd, etc). The fundamental problem is what if there isn't
>> the same number of each? You may say that you know that there will
>> always be the same number, but there is no constraint that forces this,
>> so any general program is going to have to deal with the possibility
>> (and at least throw out an error when it sees that).
>>
> Yeah, it would have been easier to group the readings at write time - eg.
> via another column storing the time of day or the "nth reading of the day".
> You could still add the latter, post-hoc.
>
> Note that ROWID is not persistent -- see quirk #6 here:
> https://www.sqlite.org/rowidtable.html
>
> I would expect that VACUUM's renumbering happens to maintain the row order,
> but I doubt it's guaranteed. If you do have an INTEGER PRIMARY KEY it's
> better to refer to that directly.
>
> -Rowan

I think Vacuum maintains Row order but not RowID, unless it has been
aliased to a user defined column.

Yes, one solution would be to add another column that can be used either
alone or paired with other columns to allow a join to find the pair of
readings. So modified, the writing application could be updated to write
the same value into that column for the two readings, and/or a fix-it
app could be run to add it to existing data.

SQL itself is unlikely to have a simple way to do this, as the problem
isn't solvable in general, which is what the SQL solution would be
attempting, unless the SQL solution was basically doing the fixup
operation, finding the duplicate ID/Date rows and adding the pairing value.

--
Richard Damon

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: How to group this?

Jen Pollock
In reply to this post by Bart Smissaert
I think the following works:

 SELECT s.ID 'ID', s.Date Date, Systolic, Diastolic
 FROM
 (SELECT ID, ENTRY_DATE Date, NUMERIC_VALUE Systolic, row_number() OVER id_date r
  FROM pressure
  WHERE TERM = 'Systolic'
  WINDOW id_date AS (PARTITION BY ID, ENTRY_DATE ORDER BY ID)
 ) s
 JOIN
 (SELECT ID, ENTRY_DATE Date, NUMERIC_VALUE Diastolic, row_number() OVER id_date r
  FROM pressure
  WHERE TERM = 'Diastolic'
  WINDOW id_date AS (PARTITION BY ID, ENTRY_DATE ORDER BY ID)
 ) d
 ON s.ID = d.ID AND s.Date = d.Date AND s.r = d.r
 ORDER BY s.ID, s.Date;

You will need a fairly recent version of SQLite, older versions don't
have window functions. And I have assumed that there is a Diastolic
value for every Systolic value and vice versa.

Jen

On Mon, Feb 10, 2020 at 12:24:33AM +0000, Bart Smissaert wrote:

> ID ENTRY_DATE TERM NUMERIC_VALUE ROWID
> ----------------------------------------------------------------
> 1308 15/Mar/2013 Systolic 127 701559
> 1308 15/Mar/2013 Diastolic 81 701568
> 1308 27/Jun/2013 Systolic 132 701562
> 1308 27/Jun/2013 Systolic 141 701563
> 1308 27/Jun/2013 Systolic 143 701564
> 1308 27/Jun/2013 Diastolic 82 701571
> 1308 27/Jun/2013 Diastolic 85 701572
> 1308 27/Jun/2013 Diastolic 94 701573
> 278975701 08/Mar/2018 Systolic 136 1583551
> 278975701 08/Mar/2018 Diastolic 99 1583591
> 278975701 04/Apr/2018 Systolic 119 1583552
> 278975701 04/Apr/2018 Systolic 124 1583553
> 278975701 04/Apr/2018 Systolic 130 1583554
> 278975701 04/Apr/2018 Diastolic 74 1583592
> 278975701 04/Apr/2018 Diastolic 75 1583593
> 278975701 04/Apr/2018 Diastolic 85 1583594
>
> These are systolic and diastolic blood pressures for 2 people with the ID's
> 1308 and 278975701, ordered by ID asc, ENTRY_DATE asc, ROWID asc.
> Systolic and diastolic values are a pair and should be grouped in one row.
> This is no problem if there is only one pair for one date, but sometimes
> there multiple pairs per date.
> The pairing should be based on the rowed if there are multiple pairs by
> date, so for ID 1308
> I should get:
>
> 127/81
> 132/82
> 141/85
> 143/94
>
> What should be the SQL to group like this?
>
> RBS
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: How to group this?

Jen Pollock
Sorry, I made a typo. The windows should be ORDER BY ROWID, not ORDER BY
ID.

Jen

On Mon, Feb 10, 2020 at 09:19:59AM -0700, Jen Pollock wrote:

> I think the following works:
>
>  SELECT s.ID 'ID', s.Date Date, Systolic, Diastolic
>  FROM
>  (SELECT ID, ENTRY_DATE Date, NUMERIC_VALUE Systolic, row_number() OVER id_date r
>   FROM pressure
>   WHERE TERM = 'Systolic'
>   WINDOW id_date AS (PARTITION BY ID, ENTRY_DATE ORDER BY ID)
>  ) s
>  JOIN
>  (SELECT ID, ENTRY_DATE Date, NUMERIC_VALUE Diastolic, row_number() OVER id_date r
>   FROM pressure
>   WHERE TERM = 'Diastolic'
>   WINDOW id_date AS (PARTITION BY ID, ENTRY_DATE ORDER BY ID)
>  ) d
>  ON s.ID = d.ID AND s.Date = d.Date AND s.r = d.r
>  ORDER BY s.ID, s.Date;
>
> You will need a fairly recent version of SQLite, older versions don't
> have window functions. And I have assumed that there is a Diastolic
> value for every Systolic value and vice versa.
>
> Jen
>
> On Mon, Feb 10, 2020 at 12:24:33AM +0000, Bart Smissaert wrote:
> > ID ENTRY_DATE TERM NUMERIC_VALUE ROWID
> > ----------------------------------------------------------------
> > 1308 15/Mar/2013 Systolic 127 701559
> > 1308 15/Mar/2013 Diastolic 81 701568
> > 1308 27/Jun/2013 Systolic 132 701562
> > 1308 27/Jun/2013 Systolic 141 701563
> > 1308 27/Jun/2013 Systolic 143 701564
> > 1308 27/Jun/2013 Diastolic 82 701571
> > 1308 27/Jun/2013 Diastolic 85 701572
> > 1308 27/Jun/2013 Diastolic 94 701573
> > 278975701 08/Mar/2018 Systolic 136 1583551
> > 278975701 08/Mar/2018 Diastolic 99 1583591
> > 278975701 04/Apr/2018 Systolic 119 1583552
> > 278975701 04/Apr/2018 Systolic 124 1583553
> > 278975701 04/Apr/2018 Systolic 130 1583554
> > 278975701 04/Apr/2018 Diastolic 74 1583592
> > 278975701 04/Apr/2018 Diastolic 75 1583593
> > 278975701 04/Apr/2018 Diastolic 85 1583594
> >
> > These are systolic and diastolic blood pressures for 2 people with the ID's
> > 1308 and 278975701, ordered by ID asc, ENTRY_DATE asc, ROWID asc.
> > Systolic and diastolic values are a pair and should be grouped in one row.
> > This is no problem if there is only one pair for one date, but sometimes
> > there multiple pairs per date.
> > The pairing should be based on the rowed if there are multiple pairs by
> > date, so for ID 1308
> > I should get:
> >
> > 127/81
> > 132/82
> > 141/85
> > 143/94
> >
> > What should be the SQL to group like this?
> >
> > RBS
> > _______________________________________________
> > sqlite-users mailing list
> > [hidden email]
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: How to group this?

Wolfgang Enzinger
In reply to this post by Simon Slavin-3
Am Mon, 10 Feb 2020 01:42:14 +0000 schrieb Simon Slavin:

> On 10 Feb 2020, at 1:25am, [hidden email] wrote:
>
> create two VIEWs, [...].  Index both VIEWs on (id, date),

I don't think that creating an index on a view actually works, does it?

Wolfgang

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: How to group this?

Simon Slavin-3
On 10 Feb 2020, at 10:41pm, Wolfgang Enzinger <[hidden email]> wrote:

> Am Mon, 10 Feb 2020 01:42:14 +0000 schrieb Simon Slavin:
>
>> On 10 Feb 2020, at 1:25am, [hidden email] wrote:
>>
>> create two VIEWs, [...].  Index both VIEWs on (id, date),
>
> I don't think that creating an index on a view actually works, does it?

You're right.  What was I thinking ?  Maybe I've used another implementation of SQL that it does work on.  Thanks for picking me up on it.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: How to group this?

Jean-Luc Hainaut
On 11/02/2020 01:35, Simon Slavin wrote:
> I don't think that creating an index on a view actually works, does it?
> You're right.  What was I thinking ?  Maybe I've used another implementation of SQL that it does work on.  Thanks for picking me up on it.

You are right, SQL Server allows you to create an index on a view.

J-L Hainaut

> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: How to group this?

Bart Smissaert
In reply to this post by Bart Smissaert
Solved this now, nil to do with SQL, but just running a different search
(other value code and then you can ask for a secondary value and no need
anymore to find the matching pair).

RBS

On Mon, Feb 10, 2020 at 8:22 AM Bart Smissaert <[hidden email]>
wrote:

> I fully agree with you, but I sofar I have no control over this data, I
> have it like I showed.
> As far as I can see there always will be a secondary value, but as you say
> I can't be sure.
> All this has to do with changing our clinical coding system from Read
> codes to Snomed.
> In the old setup there was the concept of a secondary value (systolic >>
> diastolic), but it
> seems in this particular case that is missing.
> I get the data by running searches (not SQL) on a clinical database and I
> have no control
> over this database.
> I will see if I can get better data with a different search, to do with
> blood pressure values.
>
> RBS
>
> On Mon, Feb 10, 2020 at 3:12 AM Richard Damon <[hidden email]>
> wrote:
>
>> On 2/9/20 7:24 PM, Bart Smissaert wrote:
>> > ID ENTRY_DATE TERM NUMERIC_VALUE ROWID
>> > ----------------------------------------------------------------
>> > 1308 15/Mar/2013 Systolic 127 701559
>> > 1308 15/Mar/2013 Diastolic 81 701568
>> > 1308 27/Jun/2013 Systolic 132 701562
>> > 1308 27/Jun/2013 Systolic 141 701563
>> > 1308 27/Jun/2013 Systolic 143 701564
>> > 1308 27/Jun/2013 Diastolic 82 701571
>> > 1308 27/Jun/2013 Diastolic 85 701572
>> > 1308 27/Jun/2013 Diastolic 94 701573
>> > 278975701 08/Mar/2018 Systolic 136 1583551
>> > 278975701 08/Mar/2018 Diastolic 99 1583591
>> > 278975701 04/Apr/2018 Systolic 119 1583552
>> > 278975701 04/Apr/2018 Systolic 124 1583553
>> > 278975701 04/Apr/2018 Systolic 130 1583554
>> > 278975701 04/Apr/2018 Diastolic 74 1583592
>> > 278975701 04/Apr/2018 Diastolic 75 1583593
>> > 278975701 04/Apr/2018 Diastolic 85 1583594
>> >
>> > These are systolic and diastolic blood pressures for 2 people with the
>> ID's
>> > 1308 and 278975701, ordered by ID asc, ENTRY_DATE asc, ROWID asc.
>> > Systolic and diastolic values are a pair and should be grouped in one
>> row.
>> > This is no problem if there is only one pair for one date, but sometimes
>> > there multiple pairs per date.
>> > The pairing should be based on the rowed if there are multiple pairs by
>> > date, so for ID 1308
>> > I should get:
>> >
>> > 127/81
>> > 132/82
>> > 141/85
>> > 143/94
>> >
>> > What should be the SQL to group like this?
>> >
>> > RBS
>>
>> To be honest, I think the problem is fundamentally badly designed. You
>> say pair the two readings by ROWID, but they of course don't have the
>> same ROWID, but you seem to be saying to pair them sorted by ROWID (1st
>> to 1st, 2nd to 2nd, etc). The fundamental problem is what if there isn't
>> the same number of each? You may say that you know that there will
>> always be the same number, but there is no constraint that forces this,
>> so any general program is going to have to deal with the possibility
>> (and at least throw out an error when it sees that).
>>
>>
>> --
>> Richard Damon
>>
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users