Problem with floating point fields, and a feature request

classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

Problem with floating point fields, and a feature request

Cariotoglou Mike
There is an issue with floating point fields, which exists in every
database I have dealt with, and of course exists in sqlite as well, all
versions.
 Essentially, the issue is this:

"When are two floating point values considered equal ?"

Why is this an issue ?
Floating point values are some times used as part of keys. Even worse,
sql resolvers (engines that construct sql statements based on old/new
values of client data) frequently construst statements like this one:

update table
 set a=<some float>
where b=<some float>

such a statement will result from a user selecting <a>, editing, and
posting an update. Sqlite needs to find the relevant row, (either with
an index or not), which at some point will involve comparing
a float value stored in the database with a user-supplied float value.

Now, depending on how floats are stored, and how they are inputted, this
may or may not be an issue. in sqlite 2, where all is stored as text,
one could choose to store floats with a known precision,
and work around the inherently inaccurate comparison of floats. in
sqlite 3, we have a FLOAT storage class , which means that we now store
floats in binary form (which is good), and also means that
equality is based on comparing floats in a binary way ( all 8 bytes
being equal, in other words).

This can/will be disastrous in many cases, one such was raised some time
ago by another user (the '9.95' issue). The above update can fail, for
no reason apparent to the user.

It also happens with people working in Delphi, and using native date
formats, which are actually floats. Dates are very often part of keys,
sorting, grouping etc, so all kinds of rounding errors can and will
manifest.

Of course, one can work around this problem, by NOT storing floats in
binary format, but in a well-defined text format. however, this would be
a pity, as a lot of time would be taken by sqlite and user code in
comverting back and forth, plus, in some cases, sqlite will try to
promote values stored as text to binary, re-introducing the problem. not
to mention that, floats stored as text will collate incorrectly, unless
they are right-justified and padded with '0', which increases the
storage requirements.

This is not new, however, in sqlite we *do* have access to the people
that write the code, so a fix *can* be found, if there is a will and
understanding of the problem :) :)

All possible solutions to this issue have to do with the way that float
equality is established. It boils down to this:

two floats A and B should be compared with this algorithm :

diff=A-B
if (diff>tolerance) then A>B
else if (diff<-tolerance) then A<B
else A = B

in other words, a tolerance value should be used to compare them. What
happens with "standard" float arithmetic is that this tolerance is 0,
which leads to the wrong decision, when small rounding errors creep into
the numbers. The question then is, how does one define the tolerance.

The ideal way would be to define it at the column level, as a column
attribute. however, this would require extensive changes in the sqlite
code.

so, we could define it at the global level (not ideal, but it would do),
by :

1. adding a PRAGMA FLOATING_TOLERANCE=<something>

or

adding a user-defined callback (as you did with the REGEXP operator),
such as

sqlite_compareFloat(a,b), which defaults to whatever code is now used to
compare two floats, but can be overriden by the user.

the <something> should be in terms of SIGNIFICANT digits, not number of
decimals, due to the way floating point works.

Having a capability like this would be of immence value to the stability
of database applications, and , I suspect, would not require a major
re-work.

what do you think, DRH, is this doable ?



Reply | Threaded
Open this post in threaded view
|

Re: Problem with floating point fields, and a feature request

Jay Sprenkle
On 9/20/05, Cariotoglou Mike <[hidden email]> wrote:
>
> There is an issue with floating point fields, which exists in every
> database I have dealt with, and of course exists in sqlite as well, all
> versions.
> Essentially, the issue is this:
>
> "When are two floating point values considered equal ?"


what's wrong with using round(n,2)?


---
The Castles of Dereth Calendar: a tour of the art and architecture of
Asheron's Call
http://www.lulu.com/content/77264
Reply | Threaded
Open this post in threaded view
|

Re: Problem with floating point fields, and a feature request

Dan Kennedy
In reply to this post by Cariotoglou Mike

> two floats A and B should be compared with this algorithm :
>
> diff=A-B
> if (diff>tolerance) then A>B
> else if (diff<-tolerance) then A<B
> else A = B

You could define a new collation sequence to do all that. However, it's
difficult to say what will happen when you have three numbers A, B and
C such that A==B and B==C but A!=C. This will cause the occasional oddity.

But if you know for some reason this will never happen, everything will
work fine. "Know"....





               
__________________________________
Yahoo! Mail - PC Magazine Editors' Choice 2005
http://mail.yahoo.com
Reply | Threaded
Open this post in threaded view
|

RE: Problem with floating point fields, and a feature request

Cariotoglou Mike
In reply to this post by Cariotoglou Mike
collating sequences do not apply to floating point comparisons, do they
?

> -----Original Message-----
> From: Dan Kennedy [mailto:[hidden email]]
> Sent: Tuesday, September 20, 2005 5:12 PM
> To: [hidden email]
> Subject: Re: [sqlite] Problem with floating point fields, and
> a feature request
>
>
> > two floats A and B should be compared with this algorithm :
> >
> > diff=A-B
> > if (diff>tolerance) then A>B
> > else if (diff<-tolerance) then A<B
> > else A = B
>
> You could define a new collation sequence to do all that.
> However, it's difficult to say what will happen when you have
> three numbers A, B and C such that A==B and B==C but A!=C.
> This will cause the occasional oddity.
>
> But if you know for some reason this will never happen,
> everything will work fine. "Know"....
>
>
>
>
>
>
> __________________________________
> Yahoo! Mail - PC Magazine Editors' Choice 2005 http://mail.yahoo.com
>
>
>

Reply | Threaded
Open this post in threaded view
|

RE: Problem with floating point fields, and a feature request

Cariotoglou Mike
In reply to this post by Cariotoglou Mike
nothing, when you hand-code. everything, when the code is
auto-generated, which very frequently it is

> -----Original Message-----
> From: Jay Sprenkle [mailto:[hidden email]]
> Sent: Tuesday, September 20, 2005 5:04 PM
> To: [hidden email]
> Subject: Re: [sqlite] Problem with floating point fields, and
> a feature request
>
> On 9/20/05, Cariotoglou Mike <[hidden email]> wrote:
> >
> > There is an issue with floating point fields, which exists in every
> > database I have dealt with, and of course exists in sqlite as well,
> > all versions.
> > Essentially, the issue is this:
> >
> > "When are two floating point values considered equal ?"
>
>
> what's wrong with using round(n,2)?
>
>
> ---
> The Castles of Dereth Calendar: a tour of the art and
> architecture of Asheron's Call
> http://www.lulu.com/content/77264
>
>

Reply | Threaded
Open this post in threaded view
|

RE: Problem with floating point fields, and a feature request

Cariotoglou Mike
In reply to this post by Cariotoglou Mike
also, it kills the usage of indexes, whereas what I propose would not.

> -----Original Message-----
> From: Jay Sprenkle [mailto:[hidden email]]
> Sent: Tuesday, September 20, 2005 5:04 PM
> To: [hidden email]
> Subject: Re: [sqlite] Problem with floating point fields, and
> a feature request
>
> On 9/20/05, Cariotoglou Mike <[hidden email]> wrote:
> >
> > There is an issue with floating point fields, which exists in every
> > database I have dealt with, and of course exists in sqlite as well,
> > all versions.
> > Essentially, the issue is this:
> >
> > "When are two floating point values considered equal ?"
>
>
> what's wrong with using round(n,2)?
>
>
> ---
> The Castles of Dereth Calendar: a tour of the art and
> architecture of Asheron's Call
> http://www.lulu.com/content/77264
>
>

Reply | Threaded
Open this post in threaded view
|

Re: Problem with floating point fields, and a feature request

Jay Sprenkle
In reply to this post by Cariotoglou Mike
On 9/20/05, Cariotoglou Mike <[hidden email]> wrote:
>
> nothing, when you hand-code. everything, when the code is
> auto-generated, which very frequently it is



Automatically generated bad code is still bad code.
If it's generating floating point comparisons now it's bad code.
It only works most of the time.
Reply | Threaded
Open this post in threaded view
|

Share an sqlite3 struct between multiple threads

Marco Bambini
I know based on http://www.sqlite.org/faq.html#q8 that it is not safe  
to share the same sqlite3 structure between multiple threads.
But what if I protect its access with a mutex?
It is safe?

---
Marco Bambini

Reply | Threaded
Open this post in threaded view
|

RE: Problem with floating point fields, and a feature request

Dan Kennedy
In reply to this post by Cariotoglou Mike

--- Cariotoglou Mike <[hidden email]> wrote:

> collating sequences do not apply to floating point comparisons, do they?

Hmmm, excellent point. Guess I didn't think too hard about that
one.

But the other point is the real show-stopper, how should the software deal
with the circumstances where A==B, B==C and A!=C?
 

> > > two floats A and B should be compared with this algorithm :
> > >
> > > diff=A-B
> > > if (diff>tolerance) then A>B
> > > else if (diff<-tolerance) then A<B
> > > else A = B
> >
> > You could define a new collation sequence to do all that.
> > However, it's difficult to say what will happen when you have
> > three numbers A, B and C such that A==B and B==C but A!=C.
> > This will cause the occasional oddity.
> >
> > But if you know for some reason this will never happen,
> > everything will work fine. "Know"....



               
__________________________________
Yahoo! Mail - PC Magazine Editors' Choice 2005
http://mail.yahoo.com
Reply | Threaded
Open this post in threaded view
|

Re: Share an sqlite3 struct between multiple threads

Christian Smith
In reply to this post by Marco Bambini
On Tue, 20 Sep 2005, Marco Bambini wrote:

>I know based on http://www.sqlite.org/faq.html#q8 that it is not safe
>to share the same sqlite3 structure between multiple threads.
>But what if I protect its access with a mutex?
>It is safe?


Until recently, by chance, yes, on some platforms. Recent versions of
SQLite explicitly disallow it and return an error if tried.

Christian

--
    /"\
    \ /    ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
     X                           - AGAINST MS ATTACHMENTS
    / \
Reply | Threaded
Open this post in threaded view
|

Re: Share an sqlite3 struct between multiple threads

Guillaume Fougnies
So SQLite is not "portable" on some Unix (more precisely on
recent linux NPTL where file locks seems broken).

i submitted a patch this month:
http://www.sqlite.org/cvstrac/tktview?tn=1417

Best regards,

Wed, Sep 21, 2005 at 02:56:48PM +0100: Christian Smith wrote:

> On Tue, 20 Sep 2005, Marco Bambini wrote:
>
> >I know based on http://www.sqlite.org/faq.html#q8 that it is not safe
> >to share the same sqlite3 structure between multiple threads.
> >But what if I protect its access with a mutex?
> >It is safe?
>
>
> Until recently, by chance, yes, on some platforms. Recent versions of
> SQLite explicitly disallow it and return an error if tried.
>
> Christian

--
Guillaume FOUGNIES
Eulerian Technologies
Reply | Threaded
Open this post in threaded view
|

Re: Share an sqlite3 struct between multiple threads

Christian Smith
On Wed, 21 Sep 2005, Guillaume Fougnies wrote:

>So SQLite is not "portable" on some Unix (more precisely on
>recent linux NPTL where file locks seems broken).


SQLite is indeed portable. It works round lock problems as best it can
with the current model. It is the POSIX locking implementations that are
not portable.

IMHO, SQLite should, however, only open a single file per database (based
on inode) which should allow threads to override each others locks as the
locks will be on a single file. Hey, if I have the time, I might have a
crack at it. But I've said things like this before, so don't hold your
breath.


>
>i submitted a patch this month:
>http://www.sqlite.org/cvstrac/tktview?tn=1417
>
>Best regards,
>
>Wed, Sep 21, 2005 at 02:56:48PM +0100: Christian Smith wrote:
>> On Tue, 20 Sep 2005, Marco Bambini wrote:
>>
>> >I know based on http://www.sqlite.org/faq.html#q8 that it is not safe
>> >to share the same sqlite3 structure between multiple threads.
>> >But what if I protect its access with a mutex?
>> >It is safe?
>>
>>
>> Until recently, by chance, yes, on some platforms. Recent versions of
>> SQLite explicitly disallow it and return an error if tried.
>>
>> Christian
>
>--
>Guillaume FOUGNIES
>Eulerian Technologies
>

--
    /"\
    \ /    ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
     X                           - AGAINST MS ATTACHMENTS
    / \
Reply | Threaded
Open this post in threaded view
|

Re: Share an sqlite3 struct between multiple threads

Florian Weimer
* Christian Smith:

> IMHO, SQLite should, however, only open a single file per database (based
> on inode) which should allow threads to override each others locks as the
> locks will be on a single file.

I think you need multiple file descriptors, otherwise you'd have to
use pread for accessing pages (or wrap each lseek/read call in a
mutex).

Apart from that, some kind of per-process table which stores the
device/inode pair and the lock status information should do the trick.
The sqlite3 struct would contain an index into that table.  Each time
a file lock has to be changed, the thread acquires a global mutex, and
updates the information in the global lock table.  If necessary, POSIX
file locks are acquired or released.

Of course, you can no longer access the same SQLite3 database using
two sqlite3 objects in the same thread without risking a deadlock, but
I don't think this is a major problem.
Reply | Threaded
Open this post in threaded view
|

Re: Share an sqlite3 struct between multiple threads

Christian Smith
On Thu, 13 Oct 2005, Florian Weimer wrote:

>* Christian Smith:
>
>> IMHO, SQLite should, however, only open a single file per database (based
>> on inode) which should allow threads to override each others locks as the
>> locks will be on a single file.
>
>I think you need multiple file descriptors, otherwise you'd have to
>use pread for accessing pages (or wrap each lseek/read call in a
>mutex).


It's locks on multiple file descriptors to the same file that is causing
the problem. Using pread or locked lseek/read should be satisfactory. So
long as we retain the semantics of the current OsFile* methods, we should
be dandy.


>
>Apart from that, some kind of per-process table which stores the
>device/inode pair and the lock status information should do the trick.
>The sqlite3 struct would contain an index into that table.  Each time
>a file lock has to be changed, the thread acquires a global mutex, and
>updates the information in the global lock table.  If necessary, POSIX
>file locks are acquired or released.


We already have an abstract pointer, the OsFile*. The sqlite3 structure
should require no changes. OsFile can contain all the required references
to actual file descriptors and locking.


>
>Of course, you can no longer access the same SQLite3 database using
>two sqlite3 objects in the same thread without risking a deadlock, but
>I don't think this is a major problem.
>


There should be no problem assuming the OsFile semantics don't appear to
change.

Christian



--
    /"\
    \ /    ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
     X                           - AGAINST MS ATTACHMENTS
    / \