finding the number of records until a value is different

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

finding the number of records until a value is different

nitpilot
Hi all,

I have a DB i.e. like this:

table values
bc temp
35 123
35 124
35 123
20 123
12 123
12 123
16 123
35 123
35 123
35 123
35 123
35 123

The value in temp (or all the other columns) is not of interest.

Now I a looking for the basic concept how to count the number of rows where
bc=bc_of_last_row after the last different bc.

SELECT COUNT(bc) FROM values WHERE bc=35;

gives me the number of all rows with bc=35, so here 8.
But I am looking for 5, the number of rows after the last change of bc.

Regards Matth
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: finding the number of records until a value is different

David Raymond
The question you're gonna get asked of course is "what are you sorting by?"

If it's for example rowid, then this convoluted thing will work, though probably inefficiently.

create table tbl (bc int, temp int);
insert into tbl values (35, 123), (35, 124), (35, 123), (20, 123), (12, 123), (12, 123), (16, 123), (35, 123), (35, 123), (35, 123), (35, 123), (35, 123);

select count(*) from tbl where rowid > (select max(rowid) from tbl where bc != (select bc from tbl where rowid = (select max(rowid) from tbl)));

If you're sorting by multiple columns then it'll get more complex. And I didn't test it with a million record table, so don't know how it'll scale.


-----Original Message-----
From: sqlite-users [mailto:[hidden email]] On Behalf Of [hidden email]
Sent: Thursday, December 07, 2017 1:46 PM
To: [hidden email]
Subject: [sqlite] finding the number of records until a value is different

Hi all,

I have a DB i.e. like this:

table values
bc temp
35 123
35 124
35 123
20 123
12 123
12 123
16 123
35 123
35 123
35 123
35 123
35 123

The value in temp (or all the other columns) is not of interest.

Now I a looking for the basic concept how to count the number of rows where
bc=bc_of_last_row after the last different bc.

SELECT COUNT(bc) FROM values WHERE bc=35;

gives me the number of all rows with bc=35, so here 8.
But I am looking for 5, the number of rows after the last change of bc.

Regards Matth
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: finding the number of records until a value is different

Marc L. Allen
In reply to this post by nitpilot
Ok.... I'm on this list because I love reading all the comments and discussion about sqlite and DBs in general.  I haven't used sqlite in quite awhile, so I don't know how well this will work, but...

Assuming you have a row number as well, such as

CREATE TABLE x
(
        row_number int,
        bc int
)

Then you can use something like:

SELECT a.last_rn - (SELECT MAX(row_number) FROM x WHERE bc != a.last_bc)
FROM (SELECT TOP 1 bc AS last_bc, row_number AS last_rn FROM x ORDER BY row_number DESC) a

I know this works on MS SQL SERVER.  There may be a more efficient way, and you'll need to modify it based on size of table and available indexes.  I think the above should work fairly well if you have indexes on both the row_number and bc.

Marc

-----Original Message-----
From: sqlite-users [mailto:[hidden email]] On Behalf Of [hidden email]
Sent: Thursday, December 7, 2017 1:46 PM
To: [hidden email]
Subject: [sqlite] finding the number of records until a value is different

Hi all,

I have a DB i.e. like this:

table values
bc temp
35 123
35 124
35 123
20 123
12 123
12 123
16 123
35 123
35 123
35 123
35 123
35 123

The value in temp (or all the other columns) is not of interest.

Now I a looking for the basic concept how to count the number of rows where bc=bc_of_last_row after the last different bc.

SELECT COUNT(bc) FROM values WHERE bc=35;

gives me the number of all rows with bc=35, so here 8.
But I am looking for 5, the number of rows after the last change of bc.

Regards Matth
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Confidentiality notice: This e-mail is intended solely for use of the individual or entity to which it is addressed and may contain information that is proprietary, privileged, company confidential and/or exempt from disclosure under applicable law. If the reader is not the intended recipient or agent responsible for delivering the message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify the sender by reply e-mail or collect telephone call and delete or destroy all copies of this e-mail message, any physical copies made of this e-mail message and/or any file attachment(s).
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: finding the number of records until a value is different

Keith Medcalf
In reply to this post by nitpilot

Have your application read the table in reverse order.  Then when the value of BC changes, stop incrementing a counter and close the select.  You can use whatever ordering you like just so long as it is "descending" (that is reading backwards).  ((Code example in Python -- you can use wordier and/or more obfuscated languages if you so choose -- but anyone should be able to understand this code fragment)).


bc = None
count = 0
for row in db.cursor().execute('select bc from db where bc is not null order by rowid desc;'):
 if bc is None:
  bc = row.bc
 if bc != row.bc:
  break
 count += 1


When the loop is done bc will contain the value of bc, and count will contain the count of that value since it last changed (null values completely ignored since you did not specify as part of your problem statement whether or not there are null values, and what to do with them if there are some, so I assumed that you would want to ignore them -- if this is a school assignment then it is faulty and incompletely specified and the correct answer is that no answer is possible).

It will also be fast like diarrhea from goose on an excessively hot and humid summer day in Hoostun.  Unless of course you will have hugely long runs of the same value at the end of the query, in which case (if say there are a million of two duplicates values at the end of the table) it might be faster to count them in SQL.  Of course, YMMV and applying premature optimization to the problem is almost always counter-productive.

This is because it took about 35 nanoseconds to compute the algorithm to solve the problem.  Computing the correct SQL query (and testing it) will take at least a million orders of magnitude longer (meaning it will cost a million times more to implement).  Time is money and all that.

You can probably do it in SQL but it will be far far far far far far far slower and more inefficient.  Just because you have an SQL database does not mean that *everything* must be written as a single SQL query.

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.


>-----Original Message-----
>From: sqlite-users [mailto:sqlite-users-
>[hidden email]] On Behalf Of [hidden email]
>Sent: Thursday, 7 December, 2017 11:46
>To: [hidden email]
>Subject: [sqlite] finding the number of records until a value is
>different
>
>Hi all,
>
>I have a DB i.e. like this:
>
>table values
>bc temp
>35 123
>35 124
>35 123
>20 123
>12 123
>12 123
>16 123
>35 123
>35 123
>35 123
>35 123
>35 123
>
>The value in temp (or all the other columns) is not of interest.
>
>Now I a looking for the basic concept how to count the number of rows
>where
>bc=bc_of_last_row after the last different bc.
>
>SELECT COUNT(bc) FROM values WHERE bc=35;
>
>gives me the number of all rows with bc=35, so here 8.
>But I am looking for 5, the number of rows after the last change of
>bc.
>
>Regards Matth
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: finding the number of records until a value is different

Kees Nuyt
In reply to this post by nitpilot
On Thu, 07 Dec 2017 19:46:21 +0100, [hidden email] wrote:

> Hi all,
>
> I have a DB i.e. like this:
>
> table values
> bc temp
> 35 123
> 35 124
> 35 123
> 20 123
> 12 123
> 12 123
> 16 123
> 35 123
> 35 123
> 35 123
> 35 123
> 35 123
>
> The value in temp (or all the other columns) is not of interest.
>
> Now I a looking for the basic concept how to count the number of rows where
> bc=bc_of_last_row after the last different bc.
>
> SELECT COUNT(bc) FROM values WHERE bc=35;
>
> gives me the number of all rows with bc=35, so here 8.
> But I am looking for 5, the number of rows after the last change of bc.

I recently had a similar but different requirement:
Store a series of CPU temperatures over time, but if the
temperature stays the same, just increment a count.
Again, not the same as your problem, but my solution might
inspire you with yours.

CREATE TABLE cputemp (
        epoch  INTEGER PRIMARY KEY NOT NULL -- "epochfr"
, epochto  INTEGER         DEFAULT NULL
, ctemp    INTEGER             NOT NULL
);

CREATE TRIGGER before_ins_cputemp BEFORE INSERT ON cputemp
FOR EACH ROW
WHEN NEW.ctemp == (SELECT ctemp FROM cputemp WHERE epoch ==
(SELECT max(epoch) FROM cputemp))
BEGIN
        UPDATE cputemp set epochto = NEW.epoch
    WHERE epoch == (SELECT max(epoch) FROM cputemp);
        SELECT RAISE(IGNORE); -- do not insert a new row
END;

-- Optional trigger: set epochto in new rows,
-- could be done in other ways.
CREATE TRIGGER after_ins_cputemp AFTER INSERT ON cputemp
FOR EACH ROW
WHEN NEW.epochto IS NULL
BEGIN
        UPDATE cputemp SET epochto = NEW.epoch
    WHERE epoch == NEW.epoch;
END;

The INSERT looks like (awk code):

        printf "INSERT INTO cputemp (epoch,ctemp) " \
                "VALUES ( 0 + strftime('%%s','now'),%s);\n",cputemp

This can also be done in other ways of course.

Hope this helps.

--
Regards,
Kees Nuyt
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: finding the number of records until a value is different

Max Vlasov
In reply to this post by nitpilot
I think it's possible with CTE.

Recently I wondered whether it would be possible to implement an operation
that might be called "an accumulated group by". It's when you enumerate the
rows and based on the values of the previous row and current row you apply
some new "group" value that can be used in the following "group by" query.
My use case was a "words" table when every word has a length and the task
is to "format" them as the words are formatted when printing a text on the
page or on the screen, so the new "group" here is a printing row that
increased when the current printing line is no longer able to fit the
following word.

I see your task is similar, even a little simpler since we don't accumulate
here, we just compare previous and next values.

CTE worked for me, but the big problem is that there are much redundancy in
the text of the query since contrary to general select queries, it's much
harder to to reusable aliasing in CTE, you will see this in the final query
of this post.

The basic template for any filtering for "accumulated group by".

given the table
  CREATE TABLE [testdata] ([id] integer primary key)

the following query outputs the table as it is but the second select inside
now has a luxury of compare previous and next values (ordered by the
primary key).

with recursive
  filter(curid) as
   (
      select (select min(id) from testdata)
      UNION ALL
      select (select id from testdata where id > curid order by id limit 1)
as nextid from filter where nextid not null
    )
select * from filter

Back to your case

if the table is
  CREATE TABLE [testdata] ([id] integer primary key, [bc] integer, [temp]
integer)

the following gigantic query should output the counts for every consecutive
groups ("grp" here is a temporal column used exclusively for the final
group by I was talking about previously).

with recursive
  filter(curid, bc, temp, grp) as
   (
      select (select min(id) from testdata), (select bc from testdata where
id=(select min(id) from testdata)), (select temp from testdata where
id=(select min(id) from testdata)), 1
      UNION ALL
      select (select id from testdata where id > curid order by id limit 1)
as nextid, (select bc from testdata where id > curid order by id limit 1),
(select temp from testdata where id > curid order by id limit 1),
case when (select bc from testdata where id > curid order by id limit 1) =
bc then grp else grp + 1 end
from filter
  where nextid not null
    )
select bc, count(*) from filter group by grp


I'll be glad to reduce expressions here to some more readable constructs,
but I suspect it's impossible with current SQL syntax.


Max


On Thu, Dec 7, 2017 at 9:46 PM, <[hidden email]> wrote:

> Hi all,
>
> I have a DB i.e. like this:
>
> table values
> bc      temp
> 35      123
> 35      124
> 35      123
> 20      123
> 12      123
> 12      123
> 16      123
> 35      123
> 35      123
> 35      123
> 35      123
> 35      123
>
> The value in temp (or all the other columns) is not of interest.
>
> Now I a looking for the basic concept how to count the number of rows where
> bc=bc_of_last_row after the last different bc.
>
> SELECT COUNT(bc) FROM values WHERE bc=35;
>
> gives me the number of all rows with bc=35, so here 8.
> But I am looking for 5, the number of rows after the last change of bc.
>
> Regards Matth
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users