Quantcast

Why is this so much more efficient?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Why is this so much more efficient?

Cecil Westerhof-5
I have a table vmstat that I use to store vmstat info. ;-)
At the moment it has more as 661 thousand records.

In principle the values of usertime, systemtime, idletime, waittime and
stolentime should add up to 100. I just wanted to check it. Of-course there
could be a rounding error, so I wrote the following query:
SELECT date
,      time
,      usertime
,      systemtime
,      idletime
,      waittime
,      stolentime
,      (usertime + systemtime + idletime + waittime + stolentime) AS
totaltime
FROM   vmstat
WHERE  totaltime  < 99 OR totaltime > 101

I did not like that, so I rewrote the WHERE to:
WHERE  ABS(100 - totaltime) > 1

The funny thing the second WHERE is more efficient as the first, where I
would have expected it to be the other way around.
The first takes around 1.050 milliseconds.
The second takes around  950 milliseconds.
So the second is around 10% more efficient. Why is this?

​In case it is important: I did this in sqlitebrowser 3.7.0​, which uses
SQLite 3.8.10.2.

--
Cecil Westerhof
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why is this so much more efficient?

R Smith

On 2017/02/15 12:33 PM, Cecil Westerhof wrote:

> I have a table vmstat that I use to store vmstat info. ;-)
> At the moment it has more as 661 thousand records.
>
> In principle the values of usertime, systemtime, idletime, waittime and
> stolentime should add up to 100. I just wanted to check it. Of-course there
> could be a rounding error, so I wrote the following query:
> SELECT date
> ,      time
> ,      usertime
> ,      systemtime
> ,      idletime
> ,      waittime
> ,      stolentime
> ,      (usertime + systemtime + idletime + waittime + stolentime) AS
> totaltime
> FROM   vmstat
> WHERE  totaltime  < 99 OR totaltime > 101
>
> I did not like that, so I rewrote the WHERE to:
> WHERE  ABS(100 - totaltime) > 1
>
> The funny thing the second WHERE is more efficient as the first, where I
> would have expected it to be the other way around.
> The first takes around 1.050 milliseconds.
> The second takes around  950 milliseconds.
> So the second is around 10% more efficient. Why is this?

That's because the first one executes 2 checks mostly - it first checks
to see if totaltime < 99, if so then it returns true, if not, then a
second comparison has to be done... so 2 comparison functions on many items.

The second check involves a single calculation and comparison - so what
you have deduced is that the "minus" function is slightly more efficient
than occasional extra comparison function.

Note however that this may not be entirely true. The Query might read
data from the disk cache (or several memory caches may be in play)
during the second run. Run each statement many times, and compare
average return times.

Also try this:
... WHERE totaltime NOT BETWEEN 99 AND 101;
or
... WHERE NOT (totaltime BETWEEN 99 AND 101);

Cheers,
Ryan

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why is this so much more efficient?

Cecil Westerhof-5
2017-02-15 12:02 GMT+01:00 R Smith <[hidden email]>:

>
> On 2017/02/15 12:33 PM, Cecil Westerhof wrote:
>
>> I have a table vmstat that I use to store vmstat info. ;-)
>> At the moment it has more as 661 thousand records.
>>
>> In principle the values of usertime, systemtime, idletime, waittime and
>> stolentime should add up to 100. I just wanted to check it. Of-course
>> there
>> could be a rounding error, so I wrote the following query:
>> SELECT date
>> ,      time
>> ,      usertime
>> ,      systemtime
>> ,      idletime
>> ,      waittime
>> ,      stolentime
>> ,      (usertime + systemtime + idletime + waittime + stolentime) AS
>> totaltime
>> FROM   vmstat
>> WHERE  totaltime  < 99 OR totaltime > 101
>>
>> I did not like that, so I rewrote the WHERE to:
>> WHERE  ABS(100 - totaltime) > 1
>>
>> The funny thing the second WHERE is more efficient as the first, where I
>> would have expected it to be the other way around.
>> The first takes around 1.050 milliseconds.
>> The second takes around  950 milliseconds.
>> So the second is around 10% more efficient. Why is this?
>>
>
> That's because the first one executes 2 checks mostly - it first checks to
> see if totaltime < 99, if so then it returns true, if not, then a second
> comparison has to be done... so 2 comparison functions on many items.
>
> The second check involves a single calculation and comparison - so what
> you have deduced is that the "minus" function is slightly more efficient
> than occasional extra comparison function.
>

​Minus function and abs function.​




> Note however that this may not be entirely true. The Query might read data
> from the disk cache (or several memory caches may be in play) during the
> second run. Run each statement many times, and compare average return times.
>

​I did. Not very much. But maybe I should try it a bit more often.​



Also try this:
> ... WHERE totaltime NOT BETWEEN 99 AND 101;
> or
> ... WHERE NOT (totaltime BETWEEN 99 AND 101);
>

​I will try those also.

--
Cecil Westerhof
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why is this so much more efficient?

Cecil Westerhof-5
In reply to this post by R Smith
2017-02-15 12:02 GMT+01:00 R Smith <[hidden email]>:

> Note however that this may not be entirely true. The Query might read data
> from the disk cache (or several memory caches may be in play) during the
> second run. Run each statement many times, and compare average return times.
>
> Also try this:
> ... WHERE totaltime NOT BETWEEN 99 AND 101;
> or
> ... WHERE NOT (totaltime BETWEEN 99 AND 101);
>

​I wrote the following Bash script:
#!/usr/bin/env bash

set -o errexit
set -o nounset


declare -r COUNT=10
declare -r SELECT='
SELECT date
,      time
,      usertime
,      systemtime
,      idletime
,      waittime
,      stolentime
,      (usertime + systemtime + idletime + waittime + stolentime) AS
totaltime
FROM   vmstat
'

declare -r ABS_VERSION="
${SELECT}
WHERE  ABS(100 - totaltime) > 1
;"
declare -r OR_VERSION="
${SELECT}
WHERE  totaltime  < 99 OR totaltime > 101
;"
declare -r NOT_BETWEEN_VERSION="
${SELECT}
WHERE totaltime NOT BETWEEN 99 AND 101
;"
declare -r NOT_BETWEEN_VERSION2="
${SELECT}
WHERE NOT (totaltime BETWEEN 99 AND 101)
;"


function timeQuery {
    for i in $(seq ${COUNT}) ; do
        sqlite3 ~/Databases/general.sqlite <<EOT
.timer on
.output /dev/null
${1}
EOT
    done
}


printf "Timing OR version\n"
timeQuery "${OR_VERSION}"
printf "\n\n\nTiming ABS version\n"
timeQuery "${ABS_VERSION}"
printf "\n\n\nTiming NOT BETWEEN version\n"
timeQuery "${NOT_BETWEEN_VERSION}"
printf "\n\n\nTiming NOT BETWEEN version 2\n"
timeQuery "${NOT_BETWEEN_VERSION2}"


This gives:
Timing OR version
Run Time: real 0.301 user 0.260000 sys 0.040000
Run Time: real 0.301 user 0.260000 sys 0.040000
Run Time: real 0.295 user 0.268000 sys 0.028000
Run Time: real 0.282 user 0.252000 sys 0.032000
Run Time: real 0.287 user 0.276000 sys 0.012000
Run Time: real 0.278 user 0.240000 sys 0.040000
Run Time: real 0.288 user 0.276000 sys 0.012000
Run Time: real 0.290 user 0.260000 sys 0.028000
Run Time: real 0.285 user 0.244000 sys 0.040000
Run Time: real 0.303 user 0.292000 sys 0.012000



Timing ABS version
Run Time: real 0.256 user 0.224000 sys 0.032000
Run Time: real 0.260 user 0.236000 sys 0.024000
Run Time: real 0.256 user 0.228000 sys 0.028000
Run Time: real 0.259 user 0.240000 sys 0.020000
Run Time: real 0.262 user 0.236000 sys 0.024000
Run Time: real 0.262 user 0.236000 sys 0.024000
Run Time: real 0.260 user 0.220000 sys 0.040000
Run Time: real 0.251 user 0.224000 sys 0.028000
Run Time: real 0.252 user 0.224000 sys 0.028000
Run Time: real 0.257 user 0.224000 sys 0.032000



Timing NOT BETWEEN version
Run Time: real 0.252 user 0.236000 sys 0.016000
Run Time: real 0.242 user 0.216000 sys 0.024000
Run Time: real 0.250 user 0.228000 sys 0.020000
Run Time: real 0.245 user 0.216000 sys 0.028000
Run Time: real 0.240 user 0.196000 sys 0.044000
Run Time: real 0.258 user 0.232000 sys 0.024000
Run Time: real 0.284 user 0.248000 sys 0.036000
Run Time: real 0.253 user 0.228000 sys 0.024000
Run Time: real 0.249 user 0.216000 sys 0.032000
Run Time: real 0.246 user 0.208000 sys 0.036000



Timing NOT BETWEEN version 2
Run Time: real 0.257 user 0.228000 sys 0.028000
Run Time: real 0.264 user 0.240000 sys 0.024000
Run Time: real 0.254 user 0.232000 sys 0.020000
Run Time: real 0.257 user 0.204000 sys 0.052000
Run Time: real 0.248 user 0.212000 sys 0.036000
Run Time: real 0.257 user 0.228000 sys 0.028000
Run Time: real 0.246 user 0.232000 sys 0.012000
Run Time: real 0.245 user 0.228000 sys 0.016000
Run Time: real 0.278 user 0.260000 sys 0.016000
Run Time: real 0.275 user 0.252000 sys 0.024000


So I would say that the OR version is less efficient as the other three.
And that there is not a big difference between those.

--
Cecil Westerhof
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why is this so much more efficient?

Cecil Westerhof-5
2017-02-15 13:40 GMT+01:00 Cecil Westerhof <[hidden email]>:

> I wrote the following Bash script:
>

​I wrote a better one. See end of post.

Inprinciple you are only interested in the totals. I also changed the runs
from 10 to 25.

The OR version is the least efficient and it look likes the BETWEEN version
2 is the most efficient. It looks like it uses less user and more sys.

The results of two runs:
Timing OR version
real: 7.345
user: 6.688
sys:  0.648


Timing ABS version
real: 6.403
user: 5.56
sys:  0.84


Timing NOT BETWEEN version
real: 6.382
user: 5.62
sys:  0.752


Timing NOT BETWEEN version 2
real: 6.291
user: 5.488
sys:  0.788

and:
Timing OR version
real: 7.228
user: 6.496
sys:  0.728


Timing ABS version
real: 6.647
user: 5.904
sys:  0.74


Timing NOT BETWEEN version
real: 6.431
user: 5.688
sys:  0.736


Timing NOT BETWEEN version 2
real: 6.263
user: 5.492
sys:  0.756


The script:
#!/usr/bin/env bash

set -o errexit
set -o nounset


declare -r COUNT=25
declare -r DISPLAY_ALL=F
declare -r SELECT='
SELECT date
,      time
,      usertime
,      systemtime
,      idletime
,      waittime
,      stolentime
,      (usertime + systemtime + idletime + waittime + stolentime) AS
totaltime
FROM   vmstat
'

declare -r ABS_VERSION="
${SELECT}
WHERE  ABS(100 - totaltime) > 1
;"
declare -r OR_VERSION="
${SELECT}
WHERE  totaltime  < 99 OR totaltime > 101
;"
declare -r NOT_BETWEEN_VERSION="
${SELECT}
WHERE totaltime NOT BETWEEN 99 AND 101
;"
declare -r NOT_BETWEEN_VERSION2="
${SELECT}
WHERE NOT (totaltime BETWEEN 99 AND 101)
;"


function timeQuery {
    for i in $(seq ${COUNT}) ; do
        sqlite3 ~/Databases/general.sqlite <<EOT
.timer on
.output /dev/null
${1}
EOT
    done
}

function totalTimes {
    declare output=$(timeQuery "${1}")

    if [[ ${DISPLAY_ALL} == 'T' ]] ; then
        printf "${output}\n"
    fi
    printf "${output}" | awk '
        END {
            print "real: " real
            print "user: " user
            print "sys:  " sys
        }

        {
            real += $4
            user += $6
            sys  += $8
        }
    '
}


printf "Timing OR version\n"
totalTimes "${OR_VERSION}"
printf "\n\nTiming ABS version\n"
totalTimes "${ABS_VERSION}"
printf "\n\nTiming NOT BETWEEN version\n"
totalTimes "${NOT_BETWEEN_VERSION}"
printf "\n\nTiming NOT BETWEEN version 2\n"
totalTimes "${NOT_BETWEEN_VERSION2}"

--
Cecil Westerhof
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why is this so much more efficient?

Cecil Westerhof-5
2017-02-15 14:18 GMT+01:00 Cecil Westerhof <[hidden email]>:

> The OR version is the least efficient and it look likes the BETWEEN
> version 2 is the most efficient. It looks like it uses less user and more
> sys.
>

​Which is the most efficient is also dependent on the state of the database
itself. I compacted the database and now sometimes NOT BETWEEN is more
efficient and sometimes NOT BETWEEN version 2 is more efficient.
The OR version is always the least efficient.

--
Cecil Westerhof
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why is this so much more efficient?

R Smith
Basically, the DB size and compactness state influences the speed at
which values are read. (Every update implies a read).

Add to that the fact that these functions verge on the bottom edge of
time consumers... It's like testing diffusion speed of a perfume in a
hurricane.

Glad you found the close competition between most-optimal solutions.


On 2017/02/15 8:43 PM, Cecil Westerhof wrote:

> 2017-02-15 14:18 GMT+01:00 Cecil Westerhof <[hidden email]>:
>
>> The OR version is the least efficient and it look likes the BETWEEN
>> version 2 is the most efficient. It looks like it uses less user and more
>> sys.
>>
> ​Which is the most efficient is also dependent on the state of the database
> itself. I compacted the database and now sometimes NOT BETWEEN is more
> efficient and sometimes NOT BETWEEN version 2 is more efficient.
> The OR version is always the least efficient.
>

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why is this so much more efficient?

Dominique Pellé
In reply to this post by Cecil Westerhof-5
Cecil Westerhof <[hidden email]> wrote:

> I have a table vmstat that I use to store vmstat info. ;-)
> At the moment it has more as 661 thousand records.
>
> In principle the values of usertime, systemtime, idletime, waittime and
> stolentime should add up to 100. I just wanted to check it. Of-course there
> could be a rounding error, so I wrote the following query:
> SELECT date
> ,      time
> ,      usertime
> ,      systemtime
> ,      idletime
> ,      waittime
> ,      stolentime
> ,      (usertime + systemtime + idletime + waittime + stolentime) AS
> totaltime
> FROM   vmstat
> WHERE  totaltime  < 99 OR totaltime > 101
>
> I did not like that, so I rewrote the WHERE to:
> WHERE  ABS(100 - totaltime) > 1
>
> The funny thing the second WHERE is more efficient as the first, where I
> would have expected it to be the other way around.
> The first takes around 1.050 milliseconds.
> The second takes around  950 milliseconds.
> So the second is around 10% more efficient. Why is this?
>
> In case it is important: I did this in sqlitebrowser 3.7.0, which uses
> SQLite 3.8.10.2.


I suspect  that when you use "WHERE ABS(100 - totaltime) > 1"
SQLite cannot use an index since it's an expression, so it does a
full table scan, whereas the other solution which does
"WHERE  totaltime  < 99 OR totaltime > 101"
may use an index on totaltime (assuming that there is an index).

In general using an index is good.  But if most of the records
satisfy the condition "ABS(100 - totaltime) > 1" then an index
can be more harmful than useful.   And that could explain
why using "WHERE ABS(100 - totaltime) > 1" is a bit faster.
You could try "EXPLAIN QUERY PLAN" on your queries to
see if they use an index or if they do a full table scan.

Dominique
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why is this so much more efficient?

Cecil Westerhof-5
2017-02-16 21:10 GMT+01:00 Dominique Pellé <[hidden email]>:

> Cecil Westerhof <[hidden email]> wrote:
>
> > I have a table vmstat that I use to store vmstat info. ;-)
> > At the moment it has more as 661 thousand records.
> >
> > In principle the values of usertime, systemtime, idletime, waittime and
> > stolentime should add up to 100. I just wanted to check it. Of-course
> there
> > could be a rounding error, so I wrote the following query:
> > SELECT date
> > ,      time
> > ,      usertime
> > ,      systemtime
> > ,      idletime
> > ,      waittime
> > ,      stolentime
> > ,      (usertime + systemtime + idletime + waittime + stolentime) AS
> > totaltime
> > FROM   vmstat
> > WHERE  totaltime  < 99 OR totaltime > 101
> >
> > I did not like that, so I rewrote the WHERE to:
> > WHERE  ABS(100 - totaltime) > 1
> >
> > The funny thing the second WHERE is more efficient as the first, where I
> > would have expected it to be the other way around.
> > The first takes around 1.050 milliseconds.
> > The second takes around  950 milliseconds.
> > So the second is around 10% more efficient. Why is this?
> >
> > In case it is important: I did this in sqlitebrowser 3.7.0, which uses
> > SQLite 3.8.10.2.
>
>
> I suspect  that when you use "WHERE ABS(100 - totaltime) > 1"
> SQLite cannot use an index since it's an expression, so it does a
> full table scan, whereas the other solution which does
> "WHERE  totaltime  < 99 OR totaltime > 101"
> may use an index on totaltime (assuming that there is an index).
>
> In general using an index is good.  But if most of the records
> satisfy the condition "ABS(100 - totaltime) > 1" then an index
> can be more harmful than useful.   And that could explain
> why using "WHERE ABS(100 - totaltime) > 1" is a bit faster.
> You could try "EXPLAIN QUERY PLAN" on your queries to
> see if they use an index or if they do a full table scan.
>

​Totaltime is calculated, so it cannot have an index. ;-)
Besides from the almost 700.000 records only two satisfy the condition.

I should look into EXPLAIN QUERY PLAN.

--
Cecil Westerhof
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why is this so much more efficient?

Darko Volaric
You can actually index functions or expression:
https://www.sqlite.org/expridx.html

On Thu, Feb 16, 2017 at 9:32 PM, Cecil Westerhof <[hidden email]>
wrote:

> 2017-02-16 21:10 GMT+01:00 Dominique Pellé <[hidden email]>:
>
> > Cecil Westerhof <[hidden email]> wrote:
> >
> > > I have a table vmstat that I use to store vmstat info. ;-)
> > > At the moment it has more as 661 thousand records.
> > >
> > > In principle the values of usertime, systemtime, idletime, waittime and
> > > stolentime should add up to 100. I just wanted to check it. Of-course
> > there
> > > could be a rounding error, so I wrote the following query:
> > > SELECT date
> > > ,      time
> > > ,      usertime
> > > ,      systemtime
> > > ,      idletime
> > > ,      waittime
> > > ,      stolentime
> > > ,      (usertime + systemtime + idletime + waittime + stolentime) AS
> > > totaltime
> > > FROM   vmstat
> > > WHERE  totaltime  < 99 OR totaltime > 101
> > >
> > > I did not like that, so I rewrote the WHERE to:
> > > WHERE  ABS(100 - totaltime) > 1
> > >
> > > The funny thing the second WHERE is more efficient as the first, where
> I
> > > would have expected it to be the other way around.
> > > The first takes around 1.050 milliseconds.
> > > The second takes around  950 milliseconds.
> > > So the second is around 10% more efficient. Why is this?
> > >
> > > In case it is important: I did this in sqlitebrowser 3.7.0, which uses
> > > SQLite 3.8.10.2.
> >
> >
> > I suspect  that when you use "WHERE ABS(100 - totaltime) > 1"
> > SQLite cannot use an index since it's an expression, so it does a
> > full table scan, whereas the other solution which does
> > "WHERE  totaltime  < 99 OR totaltime > 101"
> > may use an index on totaltime (assuming that there is an index).
> >
> > In general using an index is good.  But if most of the records
> > satisfy the condition "ABS(100 - totaltime) > 1" then an index
> > can be more harmful than useful.   And that could explain
> > why using "WHERE ABS(100 - totaltime) > 1" is a bit faster.
> > You could try "EXPLAIN QUERY PLAN" on your queries to
> > see if they use an index or if they do a full table scan.
> >
>
> ​Totaltime is calculated, so it cannot have an index. ;-)
> Besides from the almost 700.000 records only two satisfy the condition.
>
> I should look into EXPLAIN QUERY PLAN.
>
> --
> Cecil Westerhof
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Why is this so much more efficient?

Barry Smith
Aliases in SQL are not the same as variables in most procedural languages. So every time you mention 'totaltime' SQLite is probably recalculating that value by adding all the columns together.  See the various discussions regarding no deterministic (random) functions last year. Less references to that alias => less calculations. Reference it twice and expect it to be slower.

> On 17 Feb 2017, at 12:27 AM, Darko Volaric <[hidden email]> wrote:
>
> You can actually index functions or expression:
> https://www.sqlite.org/expridx.html
>
> On Thu, Feb 16, 2017 at 9:32 PM, Cecil Westerhof <[hidden email]>
> wrote:
>
>> 2017-02-16 21:10 GMT+01:00 Dominique Pellé <[hidden email]>:
>>
>>> Cecil Westerhof <[hidden email]> wrote:
>>>
>>>> I have a table vmstat that I use to store vmstat info. ;-)
>>>> At the moment it has more as 661 thousand records.
>>>>
>>>> In principle the values of usertime, systemtime, idletime, waittime and
>>>> stolentime should add up to 100. I just wanted to check it. Of-course
>>> there
>>>> could be a rounding error, so I wrote the following query:
>>>> SELECT date
>>>> ,      time
>>>> ,      usertime
>>>> ,      systemtime
>>>> ,      idletime
>>>> ,      waittime
>>>> ,      stolentime
>>>> ,      (usertime + systemtime + idletime + waittime + stolentime) AS
>>>> totaltime
>>>> FROM   vmstat
>>>> WHERE  totaltime  < 99 OR totaltime > 101
>>>>
>>>> I did not like that, so I rewrote the WHERE to:
>>>> WHERE  ABS(100 - totaltime) > 1
>>>>
>>>> The funny thing the second WHERE is more efficient as the first, where
>> I
>>>> would have expected it to be the other way around.
>>>> The first takes around 1.050 milliseconds.
>>>> The second takes around  950 milliseconds.
>>>> So the second is around 10% more efficient. Why is this?
>>>>
>>>> In case it is important: I did this in sqlitebrowser 3.7.0, which uses
>>>> SQLite 3.8.10.2.
>>>
>>>
>>> I suspect  that when you use "WHERE ABS(100 - totaltime) > 1"
>>> SQLite cannot use an index since it's an expression, so it does a
>>> full table scan, whereas the other solution which does
>>> "WHERE  totaltime  < 99 OR totaltime > 101"
>>> may use an index on totaltime (assuming that there is an index).
>>>
>>> In general using an index is good.  But if most of the records
>>> satisfy the condition "ABS(100 - totaltime) > 1" then an index
>>> can be more harmful than useful.   And that could explain
>>> why using "WHERE ABS(100 - totaltime) > 1" is a bit faster.
>>> You could try "EXPLAIN QUERY PLAN" on your queries to
>>> see if they use an index or if they do a full table scan.
>>
>> ​Totaltime is calculated, so it cannot have an index. ;-)
>> Besides from the almost 700.000 records only two satisfy the condition.
>>
>> I should look into EXPLAIN QUERY PLAN.
>>
>> --
>> Cecil Westerhof
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Loading...