Kind of pivot table

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|

Kind of pivot table

Cecil Westerhof-5
I thought there was a problem with RANDOM. I used:
    ,        ABS(RANDOM()) / CAST(1.4E18 AS INTEGER) AS Randomiser

And it seemed I got a lot of threes.

To check this I used:
    SELECT Randomiser
    ,      COUNT(*)   AS Count
    FROM (
        SELECT   date
        ,        ABS(RANDOM()) / CAST(1.4E18 AS INTEGER) AS Randomiser
        FROM     CPUUsage
        ORDER BY date
    )
    GROUP BY Randomiser
    ORDER BY Randomiser

And this gave results like:
    "0"    "165491"
    "1"    "166270"
    "2"    "166207"
    "3"    "165727"
    "4"    "165619"
    "5"    "165749"
    "6"    "98042"

So 6 is created less often as 0 - 5, but that is in my use case not a
problem.

This worked for me because I have a big table CPUUsage. But if I would not
have, is there another way to to do this?

--
Cecil Westerhof
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Kind of pivot table

Cecil Westerhof-5
2018-07-08 8:19 GMT+02:00 Cecil Westerhof <[hidden email]>:

> I thought there was a problem with RANDOM. I used:
>     ,        ABS(RANDOM()) / CAST(1.4E18 AS INTEGER) AS Randomiser
>
> And it seemed I got a lot of threes.
>
> To check this I used:
>     SELECT Randomiser
>     ,      COUNT(*)   AS Count
>     FROM (
>         SELECT   date
>         ,        ABS(RANDOM()) / CAST(1.4E18 AS INTEGER) AS Randomiser
>         FROM     CPUUsage
>         ORDER BY date
>     )
>     GROUP BY Randomiser
>     ORDER BY Randomiser
>
> And this gave results like:
>     "0"    "165491"
>     "1"    "166270"
>     "2"    "166207"
>     "3"    "165727"
>     "4"    "165619"
>     "5"    "165749"
>     "6"    "98042"
>
> So 6 is created less often as 0 - 5, but that is in my use case not a
> problem.
>
> This worked for me because I have a big table CPUUsage. But if I would not
> have, is there another way to to do this?
>

By the way better select is:
    SELECT Randomiser
    ,      COUNT(*)   AS Count
    FROM (
        SELECT   date
        ,        CAST(ABS(RANDOM()) / (9.223E18 / 7) AS INTEGER) AS
Randomiser
        FROM     CPUUsage
        ORDER BY date
    )
    GROUP BY Randomiser
    ORDER BY Randomiser

This gives:
    "0"    "156204"
    "1"    "157032"
    "2"    "155636"
    "3"    "156399"
    "4"    "156256"
    "5"    "155480"
    "6"    "156073"
    "7"    "52"

This is much better. Only very rarely you get a seven you do not want.
(Again in my case not really a problem.)

Because in my case I use Randomiser to get a small subset of the records,
this can be solved with:
    SELECT Randomiser
    ,      COUNT(*)   AS Count
    FROM (
        SELECT   date
        ,        CAST(ABS(RANDOM()) / (9.223E18 / 7) AS INTEGER) AS
Randomiser
        FROM     CPUUsage
        ORDER BY date
    )
    WHERE    Randomiser <> 7
    GROUP BY Randomiser
    ORDER BY Randomiser


​Then I get something like:
    "0"    "155806"
    "1"    "156270"
    "2"    "156473"
    "3"    "155748"
    "4"    "155828"
    "5"    "156196"
    "6"    "156733"​

--
Cecil Westerhof
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Kind of pivot table

Keith Medcalf

Why not use MOD (%) as in

ABS(RANDOM() % 6)


---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.


>-----Original Message-----
>From: sqlite-users [mailto:sqlite-users-
>[hidden email]] On Behalf Of Cecil Westerhof
>Sent: Sunday, 8 July, 2018 00:44
>To: SQLite mailing list
>Subject: Re: [sqlite] Kind of pivot table
>
>2018-07-08 8:19 GMT+02:00 Cecil Westerhof <[hidden email]>:
>
>> I thought there was a problem with RANDOM. I used:
>>     ,        ABS(RANDOM()) / CAST(1.4E18 AS INTEGER) AS Randomiser
>>
>> And it seemed I got a lot of threes.
>>
>> To check this I used:
>>     SELECT Randomiser
>>     ,      COUNT(*)   AS Count
>>     FROM (
>>         SELECT   date
>>         ,        ABS(RANDOM()) / CAST(1.4E18 AS INTEGER) AS
>Randomiser
>>         FROM     CPUUsage
>>         ORDER BY date
>>     )
>>     GROUP BY Randomiser
>>     ORDER BY Randomiser
>>
>> And this gave results like:
>>     "0"    "165491"
>>     "1"    "166270"
>>     "2"    "166207"
>>     "3"    "165727"
>>     "4"    "165619"
>>     "5"    "165749"
>>     "6"    "98042"
>>
>> So 6 is created less often as 0 - 5, but that is in my use case not
>a
>> problem.
>>
>> This worked for me because I have a big table CPUUsage. But if I
>would not
>> have, is there another way to to do this?
>>
>
>By the way better select is:
>    SELECT Randomiser
>    ,      COUNT(*)   AS Count
>    FROM (
>        SELECT   date
>        ,        CAST(ABS(RANDOM()) / (9.223E18 / 7) AS INTEGER) AS
>Randomiser
>        FROM     CPUUsage
>        ORDER BY date
>    )
>    GROUP BY Randomiser
>    ORDER BY Randomiser
>
>This gives:
>    "0"    "156204"
>    "1"    "157032"
>    "2"    "155636"
>    "3"    "156399"
>    "4"    "156256"
>    "5"    "155480"
>    "6"    "156073"
>    "7"    "52"
>
>This is much better. Only very rarely you get a seven you do not
>want.
>(Again in my case not really a problem.)
>
>Because in my case I use Randomiser to get a small subset of the
>records,
>this can be solved with:
>    SELECT Randomiser
>    ,      COUNT(*)   AS Count
>    FROM (
>        SELECT   date
>        ,        CAST(ABS(RANDOM()) / (9.223E18 / 7) AS INTEGER) AS
>Randomiser
>        FROM     CPUUsage
>        ORDER BY date
>    )
>    WHERE    Randomiser <> 7
>    GROUP BY Randomiser
>    ORDER BY Randomiser
>​
>
>​Then I get something like:
>    "0"    "155806"
>    "1"    "156270"
>    "2"    "156473"
>    "3"    "155748"
>    "4"    "155828"
>    "5"    "156196"
>    "6"    "156733"​
>
>--
>Cecil Westerhof
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Kind of pivot table

Cecil Westerhof-5
2018-07-08 8:49 GMT+02:00 Keith Medcalf <[hidden email]>:

>
> Why not use MOD (%) as in
>
> ABS(RANDOM() % 6)
>

​You are completely right. How stupid of me. :'-(

It only has to be:
    ABS(RANDOM() % 7)



> >-----Original Message-----
> >From: sqlite-users [mailto:sqlite-users-
> >[hidden email]] On Behalf Of Cecil Westerhof
> >Sent: Sunday, 8 July, 2018 00:44
> >To: SQLite mailing list
> >Subject: Re: [sqlite] Kind of pivot table
> >
> >2018-07-08 8:19 GMT+02:00 Cecil Westerhof <[hidden email]>:
> >
> >> I thought there was a problem with RANDOM. I used:
> >>     ,        ABS(RANDOM()) / CAST(1.4E18 AS INTEGER) AS Randomiser
> >>
> >> And it seemed I got a lot of threes.
> >>
> >> To check this I used:
> >>     SELECT Randomiser
> >>     ,      COUNT(*)   AS Count
> >>     FROM (
> >>         SELECT   date
> >>         ,        ABS(RANDOM()) / CAST(1.4E18 AS INTEGER) AS
> >Randomiser
> >>         FROM     CPUUsage
> >>         ORDER BY date
> >>     )
> >>     GROUP BY Randomiser
> >>     ORDER BY Randomiser
> >>
> >> And this gave results like:
> >>     "0"    "165491"
> >>     "1"    "166270"
> >>     "2"    "166207"
> >>     "3"    "165727"
> >>     "4"    "165619"
> >>     "5"    "165749"
> >>     "6"    "98042"
> >>
> >> So 6 is created less often as 0 - 5, but that is in my use case not
> >a
> >> problem.
> >>
> >> This worked for me because I have a big table CPUUsage. But if I
> >would not
> >> have, is there another way to to do this?
> >>
> >
> >By the way better select is:
> >    SELECT Randomiser
> >    ,      COUNT(*)   AS Count
> >    FROM (
> >        SELECT   date
> >        ,        CAST(ABS(RANDOM()) / (9.223E18 / 7) AS INTEGER) AS
> >Randomiser
> >        FROM     CPUUsage
> >        ORDER BY date
> >    )
> >    GROUP BY Randomiser
> >    ORDER BY Randomiser
> >
> >This gives:
> >    "0"    "156204"
> >    "1"    "157032"
> >    "2"    "155636"
> >    "3"    "156399"
> >    "4"    "156256"
> >    "5"    "155480"
> >    "6"    "156073"
> >    "7"    "52"
> >
> >This is much better. Only very rarely you get a seven you do not
> >want.
> >(Again in my case not really a problem.)
> >
> >Because in my case I use Randomiser to get a small subset of the
> >records,
> >this can be solved with:
> >    SELECT Randomiser
> >    ,      COUNT(*)   AS Count
> >    FROM (
> >        SELECT   date
> >        ,        CAST(ABS(RANDOM()) / (9.223E18 / 7) AS INTEGER) AS
> >Randomiser
> >        FROM     CPUUsage
> >        ORDER BY date
> >    )
> >    WHERE    Randomiser <> 7
> >    GROUP BY Randomiser
> >    ORDER BY Randomiser
> >​
> >
> >​Then I get something like:
> >    "0"    "155806"
> >    "1"    "156270"
> >    "2"    "156473"
> >    "3"    "155748"
> >    "4"    "155828"
> >    "5"    "156196"
> >    "6"    "156733"​
>

--
Cecil Westerhof
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Kind of pivot table

Keith Medcalf

sqlite> select kurt(abs(random() % 7)) from generate_series where start=1 and stop=1e6;
-1.25154453962449

sqlite> select skew(abs(random() % 7)) from generate_series where start=1 and stop=1e6;
0.00104535938599554

The PRNG is pretty random.  

It is slightly concave (that is, anti-normal) (a "flat" distribution would have a kurtosis of -1.2) and the curve is slightly skewed above the average.

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.


>-----Original Message-----
>From: sqlite-users [mailto:sqlite-users-
>[hidden email]] On Behalf Of Cecil Westerhof
>Sent: Sunday, 8 July, 2018 00:59
>To: SQLite mailing list
>Subject: Re: [sqlite] Kind of pivot table
>
>2018-07-08 8:49 GMT+02:00 Keith Medcalf <[hidden email]>:
>
>>
>> Why not use MOD (%) as in
>>
>> ABS(RANDOM() % 6)
>>
>
>​You are completely right. How stupid of me. :'-(
>
>It only has to be:
>    ABS(RANDOM() % 7)
>
>
>
>> >-----Original Message-----
>> >From: sqlite-users [mailto:sqlite-users-
>> >[hidden email]] On Behalf Of Cecil Westerhof
>> >Sent: Sunday, 8 July, 2018 00:44
>> >To: SQLite mailing list
>> >Subject: Re: [sqlite] Kind of pivot table
>> >
>> >2018-07-08 8:19 GMT+02:00 Cecil Westerhof
><[hidden email]>:
>> >
>> >> I thought there was a problem with RANDOM. I used:
>> >>     ,        ABS(RANDOM()) / CAST(1.4E18 AS INTEGER) AS
>Randomiser
>> >>
>> >> And it seemed I got a lot of threes.
>> >>
>> >> To check this I used:
>> >>     SELECT Randomiser
>> >>     ,      COUNT(*)   AS Count
>> >>     FROM (
>> >>         SELECT   date
>> >>         ,        ABS(RANDOM()) / CAST(1.4E18 AS INTEGER) AS
>> >Randomiser
>> >>         FROM     CPUUsage
>> >>         ORDER BY date
>> >>     )
>> >>     GROUP BY Randomiser
>> >>     ORDER BY Randomiser
>> >>
>> >> And this gave results like:
>> >>     "0"    "165491"
>> >>     "1"    "166270"
>> >>     "2"    "166207"
>> >>     "3"    "165727"
>> >>     "4"    "165619"
>> >>     "5"    "165749"
>> >>     "6"    "98042"
>> >>
>> >> So 6 is created less often as 0 - 5, but that is in my use case
>not
>> >a
>> >> problem.
>> >>
>> >> This worked for me because I have a big table CPUUsage. But if I
>> >would not
>> >> have, is there another way to to do this?
>> >>
>> >
>> >By the way better select is:
>> >    SELECT Randomiser
>> >    ,      COUNT(*)   AS Count
>> >    FROM (
>> >        SELECT   date
>> >        ,        CAST(ABS(RANDOM()) / (9.223E18 / 7) AS INTEGER)
>AS
>> >Randomiser
>> >        FROM     CPUUsage
>> >        ORDER BY date
>> >    )
>> >    GROUP BY Randomiser
>> >    ORDER BY Randomiser
>> >
>> >This gives:
>> >    "0"    "156204"
>> >    "1"    "157032"
>> >    "2"    "155636"
>> >    "3"    "156399"
>> >    "4"    "156256"
>> >    "5"    "155480"
>> >    "6"    "156073"
>> >    "7"    "52"
>> >
>> >This is much better. Only very rarely you get a seven you do not
>> >want.
>> >(Again in my case not really a problem.)
>> >
>> >Because in my case I use Randomiser to get a small subset of the
>> >records,
>> >this can be solved with:
>> >    SELECT Randomiser
>> >    ,      COUNT(*)   AS Count
>> >    FROM (
>> >        SELECT   date
>> >        ,        CAST(ABS(RANDOM()) / (9.223E18 / 7) AS INTEGER)
>AS
>> >Randomiser
>> >        FROM     CPUUsage
>> >        ORDER BY date
>> >    )
>> >    WHERE    Randomiser <> 7
>> >    GROUP BY Randomiser
>> >    ORDER BY Randomiser
>> >​
>> >
>> >​Then I get something like:
>> >    "0"    "155806"
>> >    "1"    "156270"
>> >    "2"    "156473"
>> >    "3"    "155748"
>> >    "4"    "155828"
>> >    "5"    "156196"
>> >    "6"    "156733"​
>>
>
>--
>Cecil Westerhof
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Kind of pivot table

Cecil Westerhof-5
2018-07-08 9:10 GMT+02:00 Keith Medcalf <[hidden email]>:

>
> sqlite>
> ​​
> select kurt(abs(random() % 7)) from generate_series where start=1 and
> stop=1e6;
> -1.25154453962449
>
> sqlite> select skew(abs(random() % 7)) from generate_series where start=1
> and stop=1e6;
> 0.00104535938599554
>
> The PRNG is pretty random.
>
> It is slightly concave (that is, anti-normal) (a "flat" distribution would
> have a kurtosis of -1.2) and the curve is slightly skewed above the average.
>

​At the moment it does not say much to me. But when I try the first
statement I get​:
    Error: no such table: generate_series

Am I overlooking something?

--
Cecil Westerhof
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Kind of pivot table

Keith Medcalf

You probably do not have the kurtosis or skew aggregate functions either.  

generate_series is the series.c extension.

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.


>-----Original Message-----
>From: sqlite-users [mailto:sqlite-users-
>[hidden email]] On Behalf Of Cecil Westerhof
>Sent: Sunday, 8 July, 2018 02:16
>To: SQLite mailing list
>Subject: Re: [sqlite] Kind of pivot table
>
>2018-07-08 9:10 GMT+02:00 Keith Medcalf <[hidden email]>:
>
>>
>> sqlite>
>> ​
>> select kurt(abs(random() % 7)) from generate_series where start=1
>and
>> stop=1e6;
>> -1.25154453962449
>>
>> sqlite> select skew(abs(random() % 7)) from generate_series where
>start=1
>> and stop=1e6;
>> 0.00104535938599554
>>
>> The PRNG is pretty random.
>>
>> It is slightly concave (that is, anti-normal) (a "flat"
>distribution would
>> have a kurtosis of -1.2) and the curve is slightly skewed above the
>average.
>>
>
>​At the moment it does not say much to me. But when I try the first
>statement I get​:
>    Error: no such table: generate_series
>
>Am I overlooking something?
>
>--
>Cecil Westerhof
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Kind of pivot table

Cecil Westerhof-5
2018-07-08 10:20 GMT+02:00 Keith Medcalf <[hidden email]>:

>
> You probably do not have the kurtosis or skew aggregate functions either.
>
> generate_series is the series.c extension.
>

​OK, thanks. Something to look into at a later moment.



> >-----Original Message-----
> >From: sqlite-users [mailto:sqlite-users-
> >[hidden email]] On Behalf Of Cecil Westerhof
> >Sent: Sunday, 8 July, 2018 02:16
> >To: SQLite mailing list
> >Subject: Re: [sqlite] Kind of pivot table
> >
> >2018-07-08 9:10 GMT+02:00 Keith Medcalf <[hidden email]>:
> >
> >>
> >> sqlite>
> >> ​
> >> select kurt(abs(random() % 7)) from generate_series where start=1
> >and
> >> stop=1e6;
> >> -1.25154453962449
> >>
> >> sqlite> select skew(abs(random() % 7)) from generate_series where
> >start=1
> >> and stop=1e6;
> >> 0.00104535938599554
> >>
> >> The PRNG is pretty random.
> >>
> >> It is slightly concave (that is, anti-normal) (a "flat"
> >distribution would
> >> have a kurtosis of -1.2) and the curve is slightly skewed above the
> >average.
> >>
> >
> >​At the moment it does not say much to me. But when I try the first
> >statement I get​:
> >    Error: no such table: generate_series
> >
> >Am I overlooking something?
>

--
Cecil Westerhof
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Kind of pivot table

Simon Slavin-3
In reply to this post by Cecil Westerhof-5
On 8 Jul 2018, at 7:19am, Cecil Westerhof <[hidden email]> wrote:

> ABS(RANDOM()) / CAST(1.4E18 AS INTEGER)

Don't divide by a number like 1.4e18 in an 18-digit INTEGER calculation and expect to get sensible answers.

Use the conventional

    CAST(RANDOM() * 7) AS INTEGER

or whatever spread you want.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Kind of pivot table

Simon Slavin-3


On 8 Jul 2018, at 9:40am, Simon Slavin <[hidden email]> wrote:

> Use the conventional
>
>    CAST(RANDOM() * 7) AS INTEGER
>
> or whatever spread you want.

Apologies.  In SQLite this should be

    (RANDOM() % 7)

instead.  The percent sign is a 'modulo' operator.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Kind of pivot table

Cecil Westerhof-5
In reply to this post by Cecil Westerhof-5
2018-07-08 8:19 GMT+02:00 Cecil Westerhof <[hidden email]>:

> I thought there was a problem with RANDOM. I used:
>     ,        ABS(RANDOM()) / CAST(1.4E18 AS INTEGER) AS Randomiser
>
> And it seemed I got a lot of threes.
>
> To check this I used:
>     SELECT Randomiser
>     ,      COUNT(*)   AS Count
>     FROM (
>         SELECT   date
>         ,        ABS(RANDOM()) / CAST(1.4E18 AS INTEGER) AS Randomiser
>         FROM     CPUUsage
>         ORDER BY date
>     )
>     GROUP BY Randomiser
>     ORDER BY Randomiser
>
> And this gave results like:
>     "0"    "165491"
>     "1"    "166270"
>     "2"    "166207"
>     "3"    "165727"
>     "4"    "165619"
>     "5"    "165749"
>     "6"    "98042"
>
> So 6 is created less often as 0 - 5, but that is in my use case not a
> problem.
>
> This worked for me because I have a big table CPUUsage. But if I would not
> have, is there another way to to do this?
>

​Solved it. Do not need a table any-more and generate Randomiser in the
correct way:

​    SELECT Randomiser
    ,      COUNT(*)   AS Count
    FROM (
        WITH RECURSIVE
        cnt(x) AS (
            SELECT 1
            UNION  ALL
            SELECT x + 1 FROM cnt
            LIMIT  1.1E6
        )
        SELECT   x
        ,        ABS(RANDOM()) % 7 AS Randomiser
        FROM     cnt
        ORDER BY x
    )
    GROUP BY Randomiser
    ORDER BY Randomiser

And this gives:
    "0"    "157139"
    "1"    "157865"
    "2"    "156849"
    "3"    "157226"
    "4"    "156916"
    "5"    "157230"
    "6"    "156775"

By the way: it is only slightly faster as the version where I used the
table.​


--
Cecil Westerhof
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Kind of pivot table

Cecil Westerhof-5
2018-07-08 11:00 GMT+02:00 Cecil Westerhof <[hidden email]>:

> 2018-07-08 8:19 GMT+02:00 Cecil Westerhof <[hidden email]>:
>
>> I thought there was a problem with RANDOM. I used:
>>     ,        ABS(RANDOM()) / CAST(1.4E18 AS INTEGER) AS Randomiser
>>
>> And it seemed I got a lot of threes.
>>
>> To check this I used:
>>     SELECT Randomiser
>>     ,      COUNT(*)   AS Count
>>     FROM (
>>         SELECT   date
>>         ,        ABS(RANDOM()) / CAST(1.4E18 AS INTEGER) AS Randomiser
>>         FROM     CPUUsage
>>         ORDER BY date
>>     )
>>     GROUP BY Randomiser
>>     ORDER BY Randomiser
>>
>> And this gave results like:
>>     "0"    "165491"
>>     "1"    "166270"
>>     "2"    "166207"
>>     "3"    "165727"
>>     "4"    "165619"
>>     "5"    "165749"
>>     "6"    "98042"
>>
>> So 6 is created less often as 0 - 5, but that is in my use case not a
>> problem.
>>
>> This worked for me because I have a big table CPUUsage. But if I would
>> not have, is there another way to to do this?
>>
>
> ​Solved it. Do not need a table any-more and generate Randomiser in the
> correct way:
> ​
> ​    SELECT Randomiser
>     ,      COUNT(*)   AS Count
>     FROM (
>         WITH RECURSIVE
>         cnt(x) AS (
>             SELECT 1
>             UNION  ALL
>             SELECT x + 1 FROM cnt
>             LIMIT  1.1E6
>         )
>         SELECT   x
>         ,        ABS(RANDOM()) % 7 AS Randomiser
>         FROM     cnt
>         ORDER BY x
>     )
>     GROUP BY Randomiser
>     ORDER BY Randomiser
>
> And this gives:
>     "0"    "157139"
>     "1"    "157865"
>     "2"    "156849"
>     "3"    "157226"
>     "4"    "156916"
>     "5"    "157230"
>     "6"    "156775"
>
> By the way: it is only slightly faster as the version where I used the
> table.​
>

​I can shave off about 30% by rewriting it to:
    WITH RECURSIVE
    cnt(x) AS (
        SELECT 1
        UNION  ALL
        SELECT x + 1 FROM cnt
        LIMIT  1.1E6
    )
    SELECT   ABS(RANDOM()) % 7 AS Randomiser
    ,        COUNT(*)
    FROM     cnt
    GROUP BY Randomiser
    ORDER BY Randomiser



--
Cecil Westerhof
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Kind of pivot table

Jens Alfke-2
In reply to this post by Keith Medcalf

> On Jul 7, 2018, at 11:49 PM, Keith Medcalf <[hidden email]> wrote:
>
> Why not use MOD (%) as in
>
> ABS(RANDOM() % 6)

Because modulo bias. If the RHS doesn't evenly divide the size of the range of RANDOM(), some results will be slightly more likely than others.

https://zuttobenkyou.wordpress.com/2012/10/18/generating-random-numbers-without-modulo-bias/ <https://zuttobenkyou.wordpress.com/2012/10/18/generating-random-numbers-without-modulo-bias/>

Of course in this case, where the range of RANDOM() is probably 2^32, the bias will be very small. But in some use cases the bias can be magnified by subsequent operations and can still skew the eventual result. A Google search turns up several hits that promise fairer algorithms.

https://www.google.com/search?client=safari&rls=en&q=random+modulo+bias&ie=UTF-8&oe=UTF-8 <https://www.google.com/search?client=safari&rls=en&q=random+modulo+bias&ie=UTF-8&oe=UTF-8>

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Kind of pivot table

Richard Damon
On 7/9/18 12:48 PM, Jens Alfke wrote:

>> On Jul 7, 2018, at 11:49 PM, Keith Medcalf <[hidden email]> wrote:
>>
>> Why not use MOD (%) as in
>>
>> ABS(RANDOM() % 6)
> Because modulo bias. If the RHS doesn't evenly divide the size of the range of RANDOM(), some results will be slightly more likely than others.
>
> https://zuttobenkyou.wordpress.com/2012/10/18/generating-random-numbers-without-modulo-bias/ <https://zuttobenkyou.wordpress.com/2012/10/18/generating-random-numbers-without-modulo-bias/>
>
> Of course in this case, where the range of RANDOM() is probably 2^32, the bias will be very small. But in some use cases the bias can be magnified by subsequent operations and can still skew the eventual result. A Google search turns up several hits that promise fairer algorithms.
>
> https://www.google.com/search?client=safari&rls=en&q=random+modulo+bias&ie=UTF-8&oe=UTF-8 <https://www.google.com/search?client=safari&rls=en&q=random+modulo+bias&ie=UTF-8&oe=UTF-8>
>
> —Jens
But ANY random number generator that generates a finite number of
results is almost certainly going to have a similar bias, as you divide
up those numbers into bins. To get rid of the bias you generally need to
throw out some values so you have an equal number in each bin (unless
you luck out and start having an even number in each bin). Your second
solution did this. If your first solution was good enough, so should
have MOD.

--
Richard Damon

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Kind of pivot table

Keith Medcalf
In reply to this post by Jens Alfke-2

cast(abs(random()) / 9223372036854775807.0 * 7 as integer)

Will fix that.  You throw away one bit of randomness (by the abs()), convert to a floating point value in the range of 0 but less than 1 (throwing away a few more bits in the conversion (since the precision of a double mantissa is only 53 bits, not 63)), divide that into 7 buckets, and take the floor of that as an integer result.  This will have a uniformity equal to the underlying PRNG.

The bias inherent in a direct mod operation is insignificant when you are using 64 bits of randomness each time ... the non-uniformity of the PRNG outweighs the bias of the modulus calculation at least for small values of the modulus.

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.


>-----Original Message-----
>From: sqlite-users [mailto:sqlite-users-
>[hidden email]] On Behalf Of Jens Alfke
>Sent: Monday, 9 July, 2018 10:49
>To: SQLite mailing list
>Subject: Re: [sqlite] Kind of pivot table
>
>
>> On Jul 7, 2018, at 11:49 PM, Keith Medcalf <[hidden email]>
>wrote:
>>
>> Why not use MOD (%) as in
>>
>> ABS(RANDOM() % 6)
>
>Because modulo bias. If the RHS doesn't evenly divide the size of the
>range of RANDOM(), some results will be slightly more likely than
>others.
>
>https://zuttobenkyou.wordpress.com/2012/10/18/generating-random-
>numbers-without-modulo-bias/
><https://zuttobenkyou.wordpress.com/2012/10/18/generating-random-
>numbers-without-modulo-bias/>
>
>Of course in this case, where the range of RANDOM() is probably 2^32,
>the bias will be very small. But in some use cases the bias can be
>magnified by subsequent operations and can still skew the eventual
>result. A Google search turns up several hits that promise fairer
>algorithms.
>
>https://www.google.com/search?client=safari&rls=en&q=random+modulo+bi
>as&ie=UTF-8&oe=UTF-8
><https://www.google.com/search?client=safari&rls=en&q=random+modulo+b
>ias&ie=UTF-8&oe=UTF-8>
>
>—Jens
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users