Atoi64 bug(s)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Atoi64 bug(s)

Cezary H. Noweta
Hello,

About year age I reported some strange behavior:

1.

https://www.sqlite.org/lang_expr.html#castexpr:

INTEGER: ``When casting a TEXT value to INTEGER, the longest possible
prefix of the value that can be interpreted as an integer number is
extracted from the TEXT value and the remainder ignored. [...] If there
is no prefix that can be interpreted as an integer number, the result of
the conversion is 0.''

sqlite> SELECT CAST('12345678901234567890123' AS INTEGER);
4807115922877859019

What prefix of ``12345678901234567890123'' can be interpreted as
``4807115922877859019''?

2.

Why trailing spaces are allowed in FLOATs and not in INTEGERs? It would
not be a problem, if INTs were 32bit. Now, a value is lost:

sqlite> CREATE TABLE a (a NUMERIC);
sqlite> INSERT INTO a VALUES
('9000000000000000001'),('9000000000000000001 ');
sqlite> SELECT * FROM a;
9000000000000000001
9000000000000000000

2a.

https://www.sqlite.org/lang_expr.html#castexpr:

NUMERIC: ``Casting a TEXT or BLOB value into NUMERIC first does a forced
conversion into REAL but then further converts the result into INTEGER
if and only if the conversion from REAL to INTEGER is lossless and
reversible.''

Why a cast to universal AFFINITY (NUMERIC -- designed to represent both
INTEGERs as well as FLOATs) could be skewed into FLOAT direction loosing
a INTEGER's LSBits? INT is not 32bit wide. Hopefully above is not true:

sqlite> SELECT CAST('9000000000000000001' AS NUMERIC);
9000000000000000001

However due to pt 2.:

sqlite> SELECT CAST('9000000000000000001 ' AS NUMERIC);
9000000000000000000

The most concise patch (without, for example ``SELECT
CAST('9000000000000000001X' AS NUMERIC);'') contains only two lines:
======
--- sqlite-src-3220000/src/util.c 2018-01-23 01:57:26.000000000 +0100
+++ sqlite-src-3220000/src/util.c 2018-01-25 14:22:18.428460300 +0100
@@ -625,6 +625,7 @@
      zNum += (enc&1);
    }
    while( zNum<zEnd && sqlite3Isspace(*zNum) ) zNum+=incr;
+  while( zNum<zEnd && sqlite3Isspace(zEnd[-incr]) ) zEnd-=incr;
    if( zNum<zEnd ){
      if( *zNum=='-' ){
        neg = 1;
@@ -638,7 +639,7 @@
    for(i=0; &zNum[i]<zEnd && (c=zNum[i])>='0' && c<='9'; i+=incr){
      u = u*10 + c - '0';
    }
-  if( u>LARGEST_INT64 ){
+  if( 19 < i || u>LARGEST_INT64 ){
      *pNum = neg ? SMALLEST_INT64 : LARGEST_INT64;
    }else if( neg ){
      *pNum = -(i64)u;
======

-- best regards

Cezary H. Noweta
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Atoi64 bug(s)

Stephen Chrzanowski
RE the first question, the largest positive 64-bit number is
18446744073709551615, or, 18,446,744,073,709,551,615.  Your number is
overloading that value.
12,345,678,901,234,567,890,123
vs
18,446,744,073,709,551,615

AFAIK, SQLite uses max of 64-bit integer math, not 128.


On Thu, Jan 25, 2018 at 10:04 AM, Cezary H. Noweta <[hidden email]>
wrote:

> Hello,
>
> About year age I reported some strange behavior:
>
> 1.
>
> https://www.sqlite.org/lang_expr.html#castexpr:
>
> INTEGER: ``When casting a TEXT value to INTEGER, the longest possible
> prefix of the value that can be interpreted as an integer number is
> extracted from the TEXT value and the remainder ignored. [...] If there is
> no prefix that can be interpreted as an integer number, the result of the
> conversion is 0.''
>
> sqlite> SELECT CAST('12345678901234567890123' AS INTEGER);
> 4807115922877859019
>
> What prefix of ``12345678901234567890123'' can be interpreted as
> ``4807115922877859019''?
>
> 2.
>
> Why trailing spaces are allowed in FLOATs and not in INTEGERs? It would
> not be a problem, if INTs were 32bit. Now, a value is lost:
>
> sqlite> CREATE TABLE a (a NUMERIC);
> sqlite> INSERT INTO a VALUES ('9000000000000000001'),('9000000000000000001
> ');
> sqlite> SELECT * FROM a;
> 9000000000000000001
> 9000000000000000000
>
> 2a.
>
> https://www.sqlite.org/lang_expr.html#castexpr:
>
> NUMERIC: ``Casting a TEXT or BLOB value into NUMERIC first does a forced
> conversion into REAL but then further converts the result into INTEGER if
> and only if the conversion from REAL to INTEGER is lossless and
> reversible.''
>
> Why a cast to universal AFFINITY (NUMERIC -- designed to represent both
> INTEGERs as well as FLOATs) could be skewed into FLOAT direction loosing a
> INTEGER's LSBits? INT is not 32bit wide. Hopefully above is not true:
>
> sqlite> SELECT CAST('9000000000000000001' AS NUMERIC);
> 9000000000000000001
>
> However due to pt 2.:
>
> sqlite> SELECT CAST('9000000000000000001 ' AS NUMERIC);
> 9000000000000000000
>
> The most concise patch (without, for example ``SELECT
> CAST('9000000000000000001X' AS NUMERIC);'') contains only two lines:
> ======
> --- sqlite-src-3220000/src/util.c       2018-01-23 01:57:26.000000000 +0100
> +++ sqlite-src-3220000/src/util.c       2018-01-25 14:22:18.428460300 +0100
> @@ -625,6 +625,7 @@
>      zNum += (enc&1);
>    }
>    while( zNum<zEnd && sqlite3Isspace(*zNum) ) zNum+=incr;
> +  while( zNum<zEnd && sqlite3Isspace(zEnd[-incr]) ) zEnd-=incr;
>    if( zNum<zEnd ){
>      if( *zNum=='-' ){
>        neg = 1;
> @@ -638,7 +639,7 @@
>    for(i=0; &zNum[i]<zEnd && (c=zNum[i])>='0' && c<='9'; i+=incr){
>      u = u*10 + c - '0';
>    }
> -  if( u>LARGEST_INT64 ){
> +  if( 19 < i || u>LARGEST_INT64 ){
>      *pNum = neg ? SMALLEST_INT64 : LARGEST_INT64;
>    }else if( neg ){
>      *pNum = -(i64)u;
> ======
>
> -- best regards
>
> Cezary H. Noweta
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Atoi64 bug(s)

petern
Confirmed.
sqlite> .version
SQLite 3.22.0 2018-01-22 18:45:57
0c55d179733b46d8d0ba4d88e01a25e10677046ee3da1d5b1581e86726f2alt1
zlib version 1.2.8
gcc-4.8.4

--FYI some background:
------------------------------

--min 64b signed int:
SELECT CAST(0x8000000000000000 AS INTEGER);
-- "CAST(0x8000000000000000 AS INTEGER)"
-- -9223372036854775808

--max 64b signed int:
SELECT CAST(0x7FFFFFFFFFFFFFFF AS INTEGER);
-- "CAST(0x7FFFFFFFFFFFFFFF AS INTEGER)"
-- 9223372036854775807

-- Note how place counts align with numeric example:

-- 9223372036854775807
-- 9000000000000000001

-- I reproduce problem with spurious CAST due to trailing zero as follows:
-------------------------------------------------------------------------------------------

CREATE TABLE IF NOT EXISTS a (a INTEGER);
INSERT INTO a VALUES ('9000000000000000001'),('9000000000000000001 ');
SELECT * FROM a;
-- a
-- 9000000000000000001
-- 9000000000000000000

-- Note however, the constant table expression works fine...

SELECT CAST(column1 AS INTEGER) FROM (VALUES
('9000000000000000001'),('9000000000000000001 '));
-- "CAST(column1 AS INTEGER)"
-- 9000000000000000001
-- 9000000000000000001

-------------------------------------------------------------------------------------------
Peter














On Thu, Jan 25, 2018 at 9:14 AM, Stephen Chrzanowski <[hidden email]>
wrote:

> RE the first question, the largest positive 64-bit number is
> 18446744073709551615, or, 18,446,744,073,709,551,615.  Your number is
> overloading that value.
> 12,345,678,901,234,567,890,123
> vs
> 18,446,744,073,709,551,615
>
> AFAIK, SQLite uses max of 64-bit integer math, not 128.
>
>
> On Thu, Jan 25, 2018 at 10:04 AM, Cezary H. Noweta <[hidden email]>
> wrote:
>
> > Hello,
> >
> > About year age I reported some strange behavior:
> >
> > 1.
> >
> > https://www.sqlite.org/lang_expr.html#castexpr:
> >
> > INTEGER: ``When casting a TEXT value to INTEGER, the longest possible
> > prefix of the value that can be interpreted as an integer number is
> > extracted from the TEXT value and the remainder ignored. [...] If there
> is
> > no prefix that can be interpreted as an integer number, the result of the
> > conversion is 0.''
> >
> > sqlite> SELECT CAST('12345678901234567890123' AS INTEGER);
> > 4807115922877859019
> >
> > What prefix of ``12345678901234567890123'' can be interpreted as
> > ``4807115922877859019''?
> >
> > 2.
> >
> > Why trailing spaces are allowed in FLOATs and not in INTEGERs? It would
> > not be a problem, if INTs were 32bit. Now, a value is lost:
> >
> > sqlite> CREATE TABLE a (a NUMERIC);
> > sqlite> INSERT INTO a VALUES ('9000000000000000001'),('
> 9000000000000000001
> > ');
> > sqlite> SELECT * FROM a;
> > 9000000000000000001
> > 9000000000000000000
> >
> > 2a.
> >
> > https://www.sqlite.org/lang_expr.html#castexpr:
> >
> > NUMERIC: ``Casting a TEXT or BLOB value into NUMERIC first does a forced
> > conversion into REAL but then further converts the result into INTEGER if
> > and only if the conversion from REAL to INTEGER is lossless and
> > reversible.''
> >
> > Why a cast to universal AFFINITY (NUMERIC -- designed to represent both
> > INTEGERs as well as FLOATs) could be skewed into FLOAT direction loosing
> a
> > INTEGER's LSBits? INT is not 32bit wide. Hopefully above is not true:
> >
> > sqlite> SELECT CAST('9000000000000000001' AS NUMERIC);
> > 9000000000000000001
> >
> > However due to pt 2.:
> >
> > sqlite> SELECT CAST('9000000000000000001 ' AS NUMERIC);
> > 9000000000000000000
> >
> > The most concise patch (without, for example ``SELECT
> > CAST('9000000000000000001X' AS NUMERIC);'') contains only two lines:
> > ======
> > --- sqlite-src-3220000/src/util.c       2018-01-23 01:57:26.000000000
> +0100
> > +++ sqlite-src-3220000/src/util.c       2018-01-25 14:22:18.428460300
> +0100
> > @@ -625,6 +625,7 @@
> >      zNum += (enc&1);
> >    }
> >    while( zNum<zEnd && sqlite3Isspace(*zNum) ) zNum+=incr;
> > +  while( zNum<zEnd && sqlite3Isspace(zEnd[-incr]) ) zEnd-=incr;
> >    if( zNum<zEnd ){
> >      if( *zNum=='-' ){
> >        neg = 1;
> > @@ -638,7 +639,7 @@
> >    for(i=0; &zNum[i]<zEnd && (c=zNum[i])>='0' && c<='9'; i+=incr){
> >      u = u*10 + c - '0';
> >    }
> > -  if( u>LARGEST_INT64 ){
> > +  if( 19 < i || u>LARGEST_INT64 ){
> >      *pNum = neg ? SMALLEST_INT64 : LARGEST_INT64;
> >    }else if( neg ){
> >      *pNum = -(i64)u;
> > ======
> >
> > -- best regards
> >
> > Cezary H. Noweta
> > _______________________________________________
> > sqlite-users mailing list
> > [hidden email]
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Atoi64 bug(s)

Cezary H. Noweta
Hello,

On 2018-01-25 19:54, petern wrote:
> CREATE TABLE IF NOT EXISTS a (a INTEGER);

INTEGER == NUMERIC in case of column declarations.

> -- Note however, the constant table expression works fine...
>
> SELECT CAST(column1 AS INTEGER) FROM (VALUES
> ('9000000000000000001'),('9000000000000000001 '));
> -- "CAST(column1 AS INTEGER)"
> -- 9000000000000000001
> -- 9000000000000000001

This is due a fact that ``CAST AS INTEGER'' blindly calls ``Atoi64()''
and returns its result.

INTEGER/FLOAT handling/recognition is a bit more complicated in other
places, what causes that '9000000000000000001 ' will become 9e18.

For the same reason ``CAST ('9000000000000000001X' AS INTEGER)'' gives
INT 9000000000000000001, while ``SELECT CAST ('9000000000000000001X' AS
NUMERIC);'' gives FLOAT 9e18.

Due to a bit disordered treatment of values, my own patch involves many
changes. The ``concise'' patch, which I proposed in my original post,
eliminates: (1) (mod 2^64) bug, and (2) an erratic treatment of INTs and
FLOATs in some (not all) places. It changes only one line and adds one,
and does not change affinity/type system at all. (As opposed to my
``long'' version patch).

1. IMHO (mod 2^64) bug is serious. Speculative example: my app has a
dangerous number ``1234'' and is checking input text against it; then
``295147905179352827090'' -- OK, go on -- says my app; then suddenly
``CAST('295147905179352827090' AS INTEGER)'' ==> BOOM: 1234.

2. STRING to be INT must be reacting to ``attention'' command
immediately (even small, chaste space at the end will bother); to be
FLOAT, it can carousel from dusk till next dusk all the time.

There was no noticeable side effects (besides a performance) in old
32bit INT days, however now, 10 bits of each INT can be going to a vacuum.

-- best regards

Cezary H. Noweta
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Atoi64 bug(s)

Stephen Chrzanowski
According to https://www.sqlite.org/datatype3.html










*2. Storage Classes and DatatypesEach value stored in an SQLite database
(or manipulated by the database engine) has one of the following storage
classes:    NULL. The value is a NULL value.    INTEGER. The value is a
signed integer, stored in 1, 2, 3, 4, 6, or 8 bytes depending on the
magnitude of the value.    REAL. The value is a floating point value,
stored as an 8-byte IEEE floating point number.    TEXT. The value is a
text string, stored using the database encoding (UTF-8, UTF-16BE or
UTF-16LE).    BLOB. The value is a blob of data, stored exactly as it was
input.*

*Each column in an SQLite 3 database is assigned one of the following type
affinities:*

   - *TEXT*
   - *NUMERIC*
   - *INTEGER*
   - *REAL*
   - *BLOB*

*...*

*A column with NUMERIC affinity may contain values using all five storage
classes. When text data is inserted into a NUMERIC column, the storage
class of the text is converted to INTEGER or REAL (in order of preference)
if such conversion is lossless and reversible.*

*...*

My understanding of this is that since you're using strings as integer
literals, its going to use integer calculations, and not go back to using
"REAL" calculations as you'll be losing information.  According to
http://en.cppreference.com/w/cpp/language/types if the engine was able to
predetermine that your string (Which looks like an integer, not a REAL)
it'd have more than a just over (1.7 * 10^308) but you're going to lose a
lot of accuracy as the IEEE specs state that there's 15 digit accuracy, and
you're asking for 23.  You can't just stuff that many numbers in a 64-bit
number and maintain accuracy, plus, you're looking at data loss since
you're going from a 23 digit number to a maximum of 15.  So according to
documentation, this is working as intended.  IMO, the query you gave should
FAULT as its an overrun and what you're asking it to do is impossible at a
64-bit integer level.

To test, I ran this:

C:\Users\schrzanowski.YKF>sqlite3
SQLite version 3.20.0 2017-08-01 13:24:15
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> SELECT CAST('12345678901234567890123' AS INTEGER);
4807115922877859019
sqlite> SELECT CAST('12345678901234567890123.0' AS INTEGER);
4807115922877859019
sqlite> SELECT CAST('12345678901234567890123.0' AS REAL);
1.23456789012346e+22
sqlite> SELECT CAST('12345678901234567890123' AS REAL);
1.23456789012346e+22


I'm not a developer of SQLite3 by any stretch of the imagination, so I
can't say for certain whether your patch is going to be accepted or not.
In my view, I've never had to deal with numbers that large, and I suspect
this is going to end up being looked at as an edge case.

On Thu, Jan 25, 2018 at 3:36 PM, Cezary H. Noweta <[hidden email]>
wrote:

> Hello,
>
> On 2018-01-25 19:54, petern wrote:
>
>> CREATE TABLE IF NOT EXISTS a (a INTEGER);
>>
>
> INTEGER == NUMERIC in case of column declarations.
>
> -- Note however, the constant table expression works fine...
>>
>> SELECT CAST(column1 AS INTEGER) FROM (VALUES
>> ('9000000000000000001'),('9000000000000000001 '));
>> -- "CAST(column1 AS INTEGER)"
>> -- 9000000000000000001
>> -- 9000000000000000001
>>
>
> This is due a fact that ``CAST AS INTEGER'' blindly calls ``Atoi64()'' and
> returns its result.
>
> INTEGER/FLOAT handling/recognition is a bit more complicated in other
> places, what causes that '9000000000000000001 ' will become 9e18.
>
> For the same reason ``CAST ('9000000000000000001X' AS INTEGER)'' gives INT
> 9000000000000000001, while ``SELECT CAST ('9000000000000000001X' AS
> NUMERIC);'' gives FLOAT 9e18.
>
> Due to a bit disordered treatment of values, my own patch involves many
> changes. The ``concise'' patch, which I proposed in my original post,
> eliminates: (1) (mod 2^64) bug, and (2) an erratic treatment of INTs and
> FLOATs in some (not all) places. It changes only one line and adds one, and
> does not change affinity/type system at all. (As opposed to my ``long''
> version patch).
>
> 1. IMHO (mod 2^64) bug is serious. Speculative example: my app has a
> dangerous number ``1234'' and is checking input text against it; then
> ``295147905179352827090'' -- OK, go on -- says my app; then suddenly
> ``CAST('295147905179352827090' AS INTEGER)'' ==> BOOM: 1234.
>
> 2. STRING to be INT must be reacting to ``attention'' command immediately
> (even small, chaste space at the end will bother); to be FLOAT, it can
> carousel from dusk till next dusk all the time.
>
> There was no noticeable side effects (besides a performance) in old 32bit
> INT days, however now, 10 bits of each INT can be going to a vacuum.
>
>
> -- best regards
>
> Cezary H. Noweta
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Atoi64 bug(s)

petern
In reply to this post by Cezary H. Noweta
Thank you for expanding on your detailed observations.
If you can, please post the long patch at your customary patch site
http://sqlite.chncc.eu/
Also, in the link text, please make note of the exact SQLite version the
patch is for.

The lesson I draw is that unconditionally correct queries must check the
input type for all id range and offset computations.
So unfortunately, if the query depends on id input, it must be wrapped by
inefficient CASE typeof() protection statements.
On the bright side, INT overflow is at least detectable:

.mode line
WITH id AS (SELECT (0x7FFFFFFFFFFFFFFF)id)  SELECT
id,typeof(id),id+1,typeof(id+1) FROM id;
-- id = 9223372036854775807
-- typeof(id) = integer
-- id+1 = 9.22337203685478e+18
-- typeof(id+1) = real

Peter

On Thu, Jan 25, 2018 at 12:36 PM, Cezary H. Noweta <[hidden email]>
wrote:

> Hello,
>
> On 2018-01-25 19:54, petern wrote:
>
>> CREATE TABLE IF NOT EXISTS a (a INTEGER);
>>
>
> INTEGER == NUMERIC in case of column declarations.
>
> -- Note however, the constant table expression works fine...
>>
>> SELECT CAST(column1 AS INTEGER) FROM (VALUES
>> ('9000000000000000001'),('9000000000000000001 '));
>> -- "CAST(column1 AS INTEGER)"
>> -- 9000000000000000001
>> -- 9000000000000000001
>>
>
> This is due a fact that ``CAST AS INTEGER'' blindly calls ``Atoi64()'' and
> returns its result.
>
> INTEGER/FLOAT handling/recognition is a bit more complicated in other
> places, what causes that '9000000000000000001 ' will become 9e18.
>
> For the same reason ``CAST ('9000000000000000001X' AS INTEGER)'' gives INT
> 9000000000000000001, while ``SELECT CAST ('9000000000000000001X' AS
> NUMERIC);'' gives FLOAT 9e18.
>
> Due to a bit disordered treatment of values, my own patch involves many
> changes. The ``concise'' patch, which I proposed in my original post,
> eliminates: (1) (mod 2^64) bug, and (2) an erratic treatment of INTs and
> FLOATs in some (not all) places. It changes only one line and adds one, and
> does not change affinity/type system at all. (As opposed to my ``long''
> version patch).
>
> 1. IMHO (mod 2^64) bug is serious. Speculative example: my app has a
> dangerous number ``1234'' and is checking input text against it; then
> ``295147905179352827090'' -- OK, go on -- says my app; then suddenly
> ``CAST('295147905179352827090' AS INTEGER)'' ==> BOOM: 1234.
>
> 2. STRING to be INT must be reacting to ``attention'' command immediately
> (even small, chaste space at the end will bother); to be FLOAT, it can
> carousel from dusk till next dusk all the time.
>
> There was no noticeable side effects (besides a performance) in old 32bit
> INT days, however now, 10 bits of each INT can be going to a vacuum.
>
> -- best regards
>
> Cezary H. Noweta
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Atoi64 bug(s)

Cezary H. Noweta
In reply to this post by Stephen Chrzanowski
Hello,

On 2018-01-25 22:08, Stephen Chrzanowski wrote:

> According to https://www.sqlite.org/datatype3.html
>
> *2. Storage Classes and DatatypesEach value stored in an SQLite database
> (or manipulated by the database engine) has one of the following storage
> classes:    NULL. The value is a NULL value.    INTEGER. The value is a
> signed integer, stored in 1, 2, 3, 4, 6, or 8 bytes depending on the
> magnitude of the value.    REAL. The value is a floating point value,
> stored as an 8-byte IEEE floating point number.    TEXT. The value is a
> text string, stored using the database encoding (UTF-8, UTF-16BE or
> UTF-16LE).    BLOB. The value is a blob of data, stored exactly as it was
> input.*
>
> *Each column in an SQLite 3 database is assigned one of the following type
> affinities:*
>
>     - *TEXT*
>     - *NUMERIC*
>     - *INTEGER*
>     - *REAL*
>     - *BLOB*
>
> *...*
>
> *A column with NUMERIC affinity may contain values using all five storage
> classes. When text data is inserted into a NUMERIC column, the storage
> class of the text is converted to INTEGER or REAL (in order of preference)
> if such conversion is lossless and reversible.*
>
> *...*

3 paragraphs.

> My understanding of this is that since you're using strings as integer
> literals, its going to use integer calculations, and not go back to using
> "REAL" calculations as you'll be losing information.  According to
> http://en.cppreference.com/w/cpp/language/types if the engine was able to
> predetermine that your string (Which looks like an integer, not a REAL)
> it'd have more than a just over (1.7 * 10^308) but you're going to lose a
> lot of accuracy as the IEEE specs state that there's 15 digit accuracy, and
> you're asking for 23.  You can't just stuff that many numbers in a 64-bit
> number and maintain accuracy, plus, you're looking at data loss since
> you're going from a 23 digit number to a maximum of 15.  So according to
> documentation, this is working as intended.  IMO, the query you gave should
> FAULT as its an overrun and what you're asking it to do is impossible at a
> 64-bit integer level.

For me, it looks like an attorney's speech in a court. From 3 paragraphs
going more or less around a problem you draw conclusions which are
contradictory to the explicit documentation's statement
(https://www.sqlite.org/lang_expr.html#castexpr):

INTEGER: ``When casting a TEXT value to INTEGER, the longest possible
prefix of the value that can be interpreted as an integer number is
extracted from the TEXT value and the remainder ignored. [...] If there
is no prefix that can be interpreted as an integer number, the result of
the conversion is 0.''

> To test, I ran this:
>
> C:\Users\schrzanowski.YKF>sqlite3
> SQLite version 3.20.0 2017-08-01 13:24:15
> Enter ".help" for usage hints.
> Connected to a transient in-memory database.
> Use ".open FILENAME" to reopen on a persistent database.
> sqlite> SELECT CAST('12345678901234567890123' AS INTEGER);
> 4807115922877859019

If this is good, then what prefix of ``12345678901234567890123'' could
be interpreted as ``4807115922877859019''?

> sqlite> SELECT CAST('12345678901234567890123.0' AS INTEGER);
> 4807115922877859019

If this is good, then what prefix of ``12345678901234567890123'' could
be interpreted as ``4807115922877859019''?

> I'm not a developer of SQLite3 by any stretch of the imagination, so I
> can't say for certain whether your patch is going to be accepted or not.
> In my view, I've never had to deal with numbers that large, and I suspect
> this is going to end up being looked at as an edge case.
Ok, to summarize, in your opinion, (a) ``CAST('...' AS INTEGER) => (not
so) quite random number [NSQRN follows]'' and (b) losing of an info
(900...001 => 900...000, etc.) is good (or, at least better then (a) ...
=> {SMALLEST,LARGEST}_INT64; (b) not losing of an info), because:

(a) It is impossible to express some number '112676...878676878' by a
64bit variable, then

(a.1) CAST('toobignum' AS INTEGER) => [NSQRN]) is ok and conforming to
the doc (``working as intended''). CAST('toobignumstring' AS INTEGER) =>
LARGEST_INT64) is not ok and not conforming to the doc (``not working as
intended''), if so, then report that CAST of string numbers
<'9223372036854775809';'18446744073709551615'> AS INTEGER ``is not
working as intended'', or

(a.2) both ways are ``working as intended'', if so, then I proposed a
patch which requires 5 characters and is making a SQLite behave in a
more (IMHO) reasonable way;

(b) I'm sorry - I haven't got your point here. Appending a trailing
space would lose 10 bits because 'toobignumstring' lose accuracy when
converting to FLOAT?? There is other thing then (a). I did not postulate
to convert '4444444444444444444444444444444444444444444444444444' into
exact INT64 value. I agree with you that it is impossible. Simply, I do
not agree that appending a space to a number which _can_ be fully
represented causes that the number needlessly loses an accuracy, and
have proposed the (second part of the) patch which requires copying of
one line of code with less then 20 chars of change, that's all. (Though
it does not solves the problem in full.)

And at the end: an introduction of such small changes is not worth of
price, because you are not using such big numbers and are classifying
the problem as an edge case? Really, is not it? I do not understand why
are you bringing that argument up. Some kind of argument by an authority?

As you said, probably the problem is an edge one. Probably most people
contending with the problem have found more or less trivial solution. I
have just shared my observations of a bit strange behavior and proposed
a solution. What are you trying to say? That [NSQRN] as a result of
conversion of a 'bignumstring' is better that saturated value MIN/MAX.
That needless loss of info without any benefits is good?

-- best regards

Cezary H. Noweta
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Atoi64 bug(s)

Cezary H. Noweta
In reply to this post by petern
Hello,

On 2018-01-25 22:58, petern wrote:
> Thank you for expanding on your detailed observations.
> If you can, please post the long patch at your customary patch site
> http://sqlite.chncc.eu/
I was convinced that I had publicized my patch already. For the people
who are interested in the patch, please give me a few hours to cut my
new not-so-completely implemented functionalities from my draft version.

-- best regards

Cezary H. Noweta
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Atoi64 bug(s)

petern
Cezary.  Your short form fix for the spurious NUMERIC CAST due to trailing
space was definitely received in your original posting.

I am interested to see your solution where NUMERIC CAST has sensible
interpretation between MAX_INT and MAX_REAL.
IMO, simple INT saturation is not convenient for easy overflow detection in
SQL.   So there is work to be done where the upcasted number is large but
not quite large enough for REAL saturation. Nearby upcasted INTs must sort
sensibly.

Also, what happens to overflowing hex constants and from BLOB casts?

It is important to curate such patches in case the priority for execution
speed/size cannot be reconciled with accuracy and generality.  If your
improvements make v3.23 slower or larger than v3.22, they may be rejected.
Nevertheless, I think users who prioritize dependability, accuracy, and
generality over slightly degraded executable speed/size will be very
interested to have your long form improvements.

Best regards.
Peter

On Thu, Jan 25, 2018 at 3:15 PM, Cezary H. Noweta <[hidden email]>
wrote:

> Hello,
>
> On 2018-01-25 22:58, petern wrote:
>
>> Thank you for expanding on your detailed observations.
>> If you can, please post the long patch at your customary patch site
>> http://sqlite.chncc.eu/
>>
> I was convinced that I had publicized my patch already. For the people who
> are interested in the patch, please give me a few hours to cut my new
> not-so-completely implemented functionalities from my draft version.
>
> -- best regards
>
> Cezary H. Noweta
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Atoi64 bug(s)

Cezary H. Noweta
Hello,

On 2018-01-26 00:54, petern wrote:
> I am interested to see your solution where NUMERIC CAST has sensible
> interpretation between MAX_INT and MAX_REAL.

The patch focuses on problems with TEXT=>NUM conversions. The sole
exception is INT=>FLOAT, when a value could lose an information.

> IMO, simple INT saturation is not convenient for easy overflow detection in
> SQL.   So there is work to be done where the upcasted number is large but
> not quite large enough for REAL saturation. Nearby upcasted INTs must sort
> sensibly.

Indeed -- INT saturation is for more predictability of INTEGER CAST's
results, when converted from TEXTs.

> Also, what happens to overflowing hex constants and from BLOB casts?

HEX is not touched by CASTs and AFFINITIes (from TEXT).

> It is important to curate such patches in case the priority for execution
> speed/size cannot be reconciled with accuracy and generality.  If your
> improvements make v3.23 slower or larger than v3.22, they may be rejected.

My patch is not to be released as a replacement of SQLite. Speed tests
showed that it is faster (even 10-30%) then SQLite, but only due to a
fact that SQLite until 3.21 did not use an exponentiation by squares --
this is improved in SQLite 3.22. (I have not made the patch for 3.22
yet.) The patch is to illustrate the problem and as a reference.

> Nevertheless, I think users who prioritize dependability, accuracy, and
> generality over slightly degraded executable speed/size will be very
> interested to have your long form improvements.

I'm using few compilers and platforms. There are many surprising
constructs which can be 30% faster on old VS2005 (the last which
produced EXE for Win16 -- aka 98/ME) then on VS2015. Fountain of bugs is
a main problem in Watcom. MSVC has a buggy preprocessor (I have not
tried VS2017 yet) which is grinding all macro from the beginning to the
end, every round until there is nothing to expand. GCC had a minor
problem with ASAN (mentioned on this list), and with a comparison of
doubles<=>int64s (mentioned/fixed recently by DRH), and so on, and so
on. If the patch is to illustrate the problem (if any), I do not want to
obfuscate it and I'm trying to use most universal constructs.

Besides that IMHO the problem (with trailing spaces and loss of
precision) is too complicated to be effectively (and universally)
resolved by someone other (me for sure) then the team.

On the other side, the problem with saturation is completely resolvable
by adding 5 characters to ``Atoi64()''.

-- best regards

Cezary H. Noweta
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users