RE: Problem with floating point fields, and a feature request

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

RE: Problem with floating point fields, and a feature request

Cariotoglou Mike
as you may remember, some time ago I raised an issue with floating point
accuracy, and how this may affect sqlite.

I now have a concrete example, which actually happened in an
installation, and helps to demonstrate the severity of the issue:

try this code:

create table test(f double);
insert into test values(13.04);
update test set f=f+0.02;
select * from test where f=13.06; -- returns no data

can you imagine how many bugs waiting to happen are out there, because
of code like this ?

I know that there are a number of solutions to this problem, all
involving changing the sql involved.
however, these are not applicable to people using SQLITE via wrappers
that generate their own UUPDATE code.

what I have been trying to say is that, there is also a neat solution to
the problem, and one that can be implemented easily :

pragma floating_accuracy=0.0000001

setting the threshold for float comparisons to some predictable value.

Am I the only one that sees the problem? if not, please speak up, and
maybe we can get a neat solution!



Reply | Threaded
Open this post in threaded view
|

Re: Problem with floating point fields, and a feature request

Arjen Markus
Cariotoglou Mike wrote:

>
> as you may remember, some time ago I raised an issue with floating point
> accuracy, and how this may affect sqlite.
>
> I now have a concrete example, which actually happened in an
> installation, and helps to demonstrate the severity of the issue:
>
> try this code:
>
> create table test(f double);
> insert into test values(13.04);
> update test set f=f+0.02;
> select * from test where f=13.06; -- returns no data
>
> can you imagine how many bugs waiting to happen are out there, because
> of code like this ?
>
> I know that there are a number of solutions to this problem, all
> involving changing the sql involved.
> however, these are not applicable to people using SQLITE via wrappers
> that generate their own UUPDATE code.
>
> what I have been trying to say is that, there is also a neat solution to
> the problem, and one that can be implemented easily :
>
> pragma floating_accuracy=0.0000001
>
> setting the threshold for float comparisons to some predictable value.
>
> Am I the only one that sees the problem? if not, please speak up, and
> maybe we can get a neat solution!

Well, I have come across it in many other contexts and there is no
general solution, unfortunately - well, using decimal floating-point
arithmetic might in some cases solve it (at least give more predictable
results, though not always).

What about the scale? I assume that the above accuracy refers to
the range [value-0.0000001,value+0.0000001]. That would fail with
numbers
smaller than 1/1000000 - not that uncommon.

On the other hand a relative accuracy of say 1.0e-4% would fail if you
have numbers around zero: -0.0000001 could well be approximately 0
or 0.0000001.

I think you will need (at least?) two measures of tolerance:
an absolute value and a relative one ...

Regards,

Arjen

Reply | Threaded
Open this post in threaded view
|

Re: Problem with floating point fields, and a feature request

Brad-19
In reply to this post by Cariotoglou Mike
> create table test(f double);
> insert into test values(13.04);
> update test set f=f+0.02;
> select * from test where f=13.06; -- returns no data

Using MS SQL Server 2000, substituting the "float" type (with a range
of -1.79E+308 through 1.79E+308) for your double, I get the same
results, i.e. no data returned, because the total ends up being
13.059999999999999 instead of 13.06.

> can you imagine how many bugs waiting to happen are out there, because
> of code like this ?

Quite a few, for programmers who don't understand the nature of
computers and floating point numbers.

> however, these are not applicable to people using SQLITE via wrappers
> that generate their own UUPDATE code.

> pragma floating_accuracy=0.0000001

What about people using MS SQL Server via a wrapper?  Should SQL Server,
and every other database that exhibts this problem, implement your fix?

> Am I the only one that sees the problem? if not, please speak up, and
> maybe we can get a neat solution!

I'm not convinced that a wrapper should be hiding this type of thing
from the user.  It's a part of the underlying DB engine, and the user
needs to be aware of it.  YMMV.

Reply | Threaded
Open this post in threaded view
|

Re: Problem with floating point fields, and a feature request

Jay Sprenkle
In reply to this post by Cariotoglou Mike
On 12/14/05, Cariotoglou Mike <[hidden email]> wrote:

> I now have a concrete example, which actually happened in an
> installation, and helps to demonstrate the severity of the issue:
>
> try this code:
>
> create table test(f double);
> insert into test values(13.04);
> update test set f=f+0.02;
> select * from test where f=13.06; -- returns no data
>
> can you imagine how many bugs waiting to happen are out there, because
> of code like this ?

I found that same bug when comparing dates that microsoft was storing
as floating point
numbers. It's a problem with the programmer not understanding how
floating point works,
it's not a problem with any particular language.
Reply | Threaded
Open this post in threaded view
|

RE: Problem with floating point fields, and a feature request

Dave Dyer
In reply to this post by Cariotoglou Mike

>>
>>select * from test where f=13.06; -- returns no data

Pardon me for throwing a bomb, but no good programmer
would ever use = to compare floating point numbers.

Choose a more appropriate representation for your data.

Reply | Threaded
Open this post in threaded view
|

Re: Problem with floating point fields, and a feature request

John Stanton-3
Dave Dyer wrote:
>>>select * from test where f=13.06; -- returns no data
>
>
> Pardon me for throwing a bomb, but no good programmer
> would ever use = to compare floating point numbers.
>
> Choose a more appropriate representation for your data.
>
It is not a bomb, just something novice programmers have to learn.
There is no = in floating point, it is only possible with integers.
Reply | Threaded
Open this post in threaded view
|

Re[2]: Problem with floating point fields, and a feature request

Teg-3
In reply to this post by Dave Dyer
Hello Dave,

Perhaps I'd have said it's "poor practice" to compare floating point
numbers that way. I think you're right but, the wording changes make an
attack into valid criticism.

C

Wednesday, December 14, 2005, 11:58:11 AM, you wrote:


>>>
>>>select * from test where f=13.06; -- returns no data

DD> Pardon me for throwing a bomb, but no good programmer
DD> would ever use = to compare floating point numbers.

DD> Choose a more appropriate representation for your data.




--
Best regards,
 Teg                            mailto:[hidden email]

Reply | Threaded
Open this post in threaded view
|

RE: Problem with floating point fields, and a feature request

Cariotoglou Mike
In reply to this post by John Stanton-3
I see again that you all miss the point. I DO know how to handle floating point. My point is :
 
a. a lot of people will make the error indicated. I am sure that they are poor programmers. I am also sure (judging from some of the questions posted in this list), that there is a lot of them...
 
b. some databases (ORACLE is the only one that comes to mind) support this properly, by allowing for fixed point types.
so, a type declared as NUMERIC(10,3), which is ANSI-92, will be handled properly in comparisons. Most other engines will use floating point only, and will fail.So, to the question : " should all databases implement your fix" the answer is YES if they need it.
 
c. the REAL danger with wrappers is this:
 
you are thinking of wrappers that hide the SQLITE api. Ok, I agree that those do not need to handle the problem, as the programmer can do it themselves. but consider:
 
a lot of people out there, I suspect, use SQLITE through a higher-level API. take a look at the ODBC, OLEDB and .NET providers. They allow point-and-click programmers (yes, mum, they do exist, they are probably idiots, but they outnumber us "real" programmers by a factor of 10:1 at least) to use niceties as data binding, data-aware grids and so on. Now all these data providers do not just hide the API, they also add functionality. One very important one, is , to generate SQL to update the database when a data-aware control changes. The generated SQL is never under the control of the programmer, so he cannot do anything about floating point comparisons or anything else. In particular, generated UPDATE statements (and DELETE, for that matter), will add a WHERE clause, where ALL fields will be compared to their "original" value. why is this ? In order to achieve optimistic row locking, in other words to detect changes by other users in a multi-user environment. now THIS is the situation I am talking about, and there is simply no fix for this problem, unless the db engine itself can handle it.
 
Of course, one can avoid the problem by not using floats, which also implies not using DATES since a lot of programming environments and DB apis use floats for dates (OLEDB does, Delphi does, Visual Basic does etc etc). But this will seriously reduce the usability of SQLITE with RAD environments, which try to hide the DB access from the programmer.
 
Again, I am not advocating that point-and-click programming is good, or that RAD environments are good, or that data-aware controls are a cool idea. I am aware of the issues with all these, having used them for decades. What I am saying is that there a lot of poor sods out there that dont know any better, so they DO use these things, and they are in for a big surprise...
 
since a solution to this issue is fairly simple, and the applicable audience is large, why not provide one?
the fact that MSSQL will not be able to do the same is not an argument that has stopped drh before, has it.
 
 
 
 
 
 
 
 

________________________________

From: John Stanton [mailto:[hidden email]]
Sent: Wed 14-Dec-05 7:35 PM
To: [hidden email]
Subject: Re: [sqlite] Problem with floating point fields, and a feature request



Dave Dyer wrote:
>>>select * from test where f=13.06; -- returns no data
>
>
> Pardon me for throwing a bomb, but no good programmer
> would ever use = to compare floating point numbers.
>
> Choose a more appropriate representation for your data.
>
It is not a bomb, just something novice programmers have to learn.
There is no = in floating point, it is only possible with integers.



Reply | Threaded
Open this post in threaded view
|

Re: Problem with floating point fields, and a feature request

Arjen Markus
Cariotoglou Mike wrote:
>
> I see again that you all miss the point. I DO know how to handle floating point. My point is :
>
...
>
> since a solution to this issue is fairly simple, and the applicable audience is large, why not provide one?
> the fact that MSSQL will not be able to do the same is not an argument that has stopped drh before, has it.
>

Now you are missing the point :). The solution is not all that simple.
Perhaps the "best" approach is
to allow for "fuzzy" comparisons. I refer you to
http://ftp.cac.psu.edu/pub/ger/fortran/hdk/eps.f90 (or .for)
for more information and an implementation of an algorithm that has
proven its usefulness.
(Don't take it too lightly: according to the author/transcriber it is
the result of long discussions and
disputes in at least one language community).

Your proposal could be used to set the tolerance for the fuzzy
comparison. Scaling issues are then
taken care of by that algorithm (I should have thought of it before....)

Regards,

Arjen

Reply | Threaded
Open this post in threaded view
|

Re: Problem with floating point fields, and a feature request

Dave Dyer

 A simple "fuzzy" tolerance will not solve the problem, only mask
it for the most common cases.  A single floating point addition
of two numbers can result in ZERO bits of precision.

Reply | Threaded
Open this post in threaded view
|

Re: Problem with floating point fields, and a feature request

John Stanton-3
In reply to this post by Cariotoglou Mike
We have previously discussed the issue of fixed point.  It is my
contention that Sqlite would be much enhanced by having a fixed point
numeric type using ASCII decimal digits in display format.  It would be
very much in keeping with the general philosophy of avoiding strict
typing and would prevent users from trying to use floating point
incorrectly.

We have to use TEXT type to store money amounts and perform arithmetic
in the application in the absence of an Sqlite fixed point type to avoid
the amateurish errors introduced by trying to use FP.

The lack of a fixed point type in Sqlite is the issue and expecting
floating point to somehow not be floating point because of the lack of
fixed point is specious.

I do agree with your concern about wrappers.  If they are conceived in
such a way as to blind the user to what is really happening then they
are a menace.  It is an old adage that every problem in CS can be solved
by yet another level of indirection, but that does not mean that
additional levels of indirection will always solve the problem - they
can so easily be the problem.

Cariotoglou Mike wrote:

> I see again that you all miss the point. I DO know how to handle floating point. My point is :
>  
> a. a lot of people will make the error indicated. I am sure that they are poor programmers. I am also sure (judging from some of the questions posted in this list), that there is a lot of them...
>  
> b. some databases (ORACLE is the only one that comes to mind) support this properly, by allowing for fixed point types.
> so, a type declared as NUMERIC(10,3), which is ANSI-92, will be handled properly in comparisons. Most other engines will use floating point only, and will fail.So, to the question : " should all databases implement your fix" the answer is YES if they need it.
>  
> c. the REAL danger with wrappers is this:
>  
> you are thinking of wrappers that hide the SQLITE api. Ok, I agree that those do not need to handle the problem, as the programmer can do it themselves. but consider:
>  
> a lot of people out there, I suspect, use SQLITE through a higher-level API. take a look at the ODBC, OLEDB and .NET providers. They allow point-and-click programmers (yes, mum, they do exist, they are probably idiots, but they outnumber us "real" programmers by a factor of 10:1 at least) to use niceties as data binding, data-aware grids and so on. Now all these data providers do not just hide the API, they also add functionality. One very important one, is , to generate SQL to update the database when a data-aware control changes. The generated SQL is never under the control of the programmer, so he cannot do anything about floating point comparisons or anything else. In particular, generated UPDATE statements (and DELETE, for that matter), will add a WHERE clause, where ALL fields will be compared to their "original" value. why is this ? In order to achieve optimistic row locking, in other words to detect changes by other users in a multi-user environment. now THIS is th
e situation I am talking about, and there is simply no fix for this problem, unless the db engine itself can handle it.

>  
> Of course, one can avoid the problem by not using floats, which also implies not using DATES since a lot of programming environments and DB apis use floats for dates (OLEDB does, Delphi does, Visual Basic does etc etc). But this will seriously reduce the usability of SQLITE with RAD environments, which try to hide the DB access from the programmer.
>  
> Again, I am not advocating that point-and-click programming is good, or that RAD environments are good, or that data-aware controls are a cool idea. I am aware of the issues with all these, having used them for decades. What I am saying is that there a lot of poor sods out there that dont know any better, so they DO use these things, and they are in for a big surprise...
>  
> since a solution to this issue is fairly simple, and the applicable audience is large, why not provide one?
> the fact that MSSQL will not be able to do the same is not an argument that has stopped drh before, has it.
> ________________________________
>
> From: John Stanton [mailto:[hidden email]]
> Sent: Wed 14-Dec-05 7:35 PM
> To: [hidden email]
> Subject: Re: [sqlite] Problem with floating point fields, and a feature request
>
>
>
> Dave Dyer wrote:
>
>>>>select * from test where f=13.06; -- returns no data
>>
>>
>>Pardon me for throwing a bomb, but no good programmer
>>would ever use = to compare floating point numbers.
>>
>>Choose a more appropriate representation for your data.
>>
>
> It is not a bomb, just something novice programmers have to learn.
> There is no = in floating point, it is only possible with integers.
>
>
>