Yet Another Why Doesn't Sqlite Use My Index question ...

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
24 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: Yet Another Why Doesn't Sqlite Use My Index question ...

Paul Sanderson
Oops - somehow misread the last message - 54 seconds down from 32 minutes -
that's a result!

Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence

On 19 March 2017 at 12:07, Paul Sanderson <[hidden email]>
wrote:

> What is the average size of the text in the direction field? and what page
> size have you set for the database? If the size of a record is such that
> only a small handful fit into a page, or worse each record overflows (and
> your select includes the direction field) then this could impact
> performance.
>
> Paul
> www.sandersonforensics.com
> skype: r3scue193
> twitter: @sandersonforens
> Tel +44 (0)1326 572786 <+44%201326%20572786>
> http://sandersonforensics.com/forum/content.php?195-SQLite-
> Forensic-Toolkit -Forensic Toolkit for SQLite
> email from a work address for a fully functional demo licence
>
> On 18 March 2017 at 05:48, Rob Willett <[hidden email]>
> wrote:
>
>> We've just implemented a covering index for the last step (is it really?)
>> in our quest to get the query execution time down.
>>
>> To summarise we have gone from 32 mins to 16 mins by updating an index so
>> it doesn't use collate, we took another six minutes off by removing extra
>> fields in the select we didn't need.
>>
>> We have just created a new index which 'covers' all the fields we use in
>> the select, this means (and I paraphrase) that we use the index to get all
>> the data and there is no need to read from the database.
>>
>> Well that was a bit of a surprise, the index creation took 45 mins, we
>> ran the program again and thought, rats, we've cocked it up, it only took
>> 54 secs, we got something wrong. So we checked it and checked again and we
>> hasn't got anything wrong. Our query has moved from 32 mins to 54 secs.
>>
>> We're quite happy with that performance increase. In fact we're
>> delighted, so thanks for all the help in getting us to this stage.
>>
>> We have kept copies of the query planner bytecode output if anybody is
>> interested. Gunter has had copies, but if anybody else would like them,
>> please ask.
>>
>> Many thanks again for all the help,
>>
>> Rob
>>
>>
>> On 17 Mar 2017, at 22:12, Rob Willett wrote:
>>
>> Dear all,
>>>
>>> We progress steadily forward.
>>>
>>> 1. We immediately halved our execution time by moving to an updated
>>> index that doesn't use COLLATE. Thats 32mins to 16mins.
>>>
>>> 2. We then shaved a further six minutes off the execution time by
>>> removing extraneous fields in the select statement, so instead of "select *
>>> ...", we identified which fields we used and directly selected those. So we
>>> are now down to 10 mins or 1/3 of when we started for, to be honest,
>>> virtually no extra work, merely being smarter, or rather you being smarter.
>>>
>>> 3. We have looked through all our indexes and can see that every index
>>> has a COLLATE against it, even if the column is an integer. We have raised
>>> a support call with Navicat.
>>>
>>> 4. The next step is to create a "covering index" to try and get the
>>> whole of the query into the index. However its 22:11 in London and I need
>>> to get home.
>>>
>>> Thanks very much for the help so far. Tomorrow is more tricky but I'll
>>> read up on covering indexes to see how to use them,.
>>>
>>> Rob
>>>
>>> On 17 Mar 2017, at 18:39, Simon Slavin wrote:
>>>
>>> On 17 Mar 2017, at 6:22pm, Rob Willett <[hidden email]>
>>>> wrote:
>>>>
>>>> 4. Work through returning just the columns we actually need from our
>>>>> queries. We have a recollection that if we can build an index with all the
>>>>> information necessary in it, we can do all the work in joins rather than
>>>>> paging out to disk. Is this what you are referring to?
>>>>>
>>>>
>>>> It works only where all the columns you need to read are in the same
>>>> table.  The ideal form of a covering index is to have the columns listed in
>>>> this order:
>>>>
>>>> 1) columns needed for the WHERE clause
>>>> 2) columns needed for the ORDER BY clause which aren’t in (1)
>>>> 3) columns needed to be read which aren’t in (2) or (1)
>>>>
>>>> SQLite detects that all the information it needs for the SELECT is
>>>> available from the index, so it doesn’t bother to read the table at all.
>>>> This can lead to something like a doubling of speed.  Of course, you
>>>> sacrifice filespace, and making changes to the table takes a little longer.
>>>>
>>>> 5. Sleep (not exactly sure when) and watch three international rugby
>>>>> games tomorrow.
>>>>>
>>>>
>>>> Sleep while waiting for indexes to be created and ANALYZE to work.  May
>>>> you see skilled players, creative moves and dramatic play.
>>>>
>>>> Simon.
>>>> _______________________________________________
>>>> sqlite-users mailing list
>>>> [hidden email]
>>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>>
>>> _______________________________________________
>>> sqlite-users mailing list
>>> [hidden email]
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>>
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Yet Another Why Doesn't Sqlite Use My Index question ...

David Raymond
In reply to this post by Rob Willett
"... Our query has moved from 32 mins to 54 secs.

We're quite happy with that performance increase."

I have to admit that the hearty chuckle which that statement produced from me hurt my sore throat. Totally worth it though.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Yet Another Why Doesn't Sqlite Use My Index question ...

Rob Willett
David,

We're English and masters of the understatement ;)

Rob

On 20 Mar 2017, at 15:04, David Raymond wrote:

> "... Our query has moved from 32 mins to 54 secs.
>
> We're quite happy with that performance increase."
>
> I have to admit that the hearty chuckle which that statement produced
> from me hurt my sore throat. Totally worth it though.
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Yet Another Why Doesn't Sqlite Use My Index question ...

Rob Willett
In reply to this post by Rob Willett
Just to close this thread down completely.

We've implemented the changes on our production server and the actual
performance increase is even better than we predicted.

Our production server is a containerised Ubuntu Server. Its hardware
configuration is significantly different to our Macs on which we
develop. Our Macs have more RAM and SSD disks. The production server has
other advantages though :)

The query took 90 mins on our production server, this is mainly due to
slower disks than we have locally, so this was not unexpected.

After doing the performance changes (removing unneeded fields and adding
in a covering index), we went down to 38 secs.

This has meant we are no longer looking at a second database just for
analytics, we just needed to learn to use the database we already had :)

Rob

On 18 Mar 2017, at 5:48, Rob Willett wrote:

> We've just implemented a covering index for the last step (is it
> really?) in our quest to get the query execution time down.
>
> To summarise we have gone from 32 mins to 16 mins by updating an index
> so it doesn't use collate, we took another six minutes off by removing
> extra fields in the select we didn't need.
>
> We have just created a new index which 'covers' all the fields we use
> in the select, this means (and I paraphrase) that we use the index to
> get all the data and there is no need to read from the database.
>
> Well that was a bit of a surprise, the index creation took 45 mins, we
> ran the program again and thought, rats, we've cocked it up, it only
> took 54 secs, we got something wrong. So we checked it and checked
> again and we hasn't got anything wrong. Our query has moved from 32
> mins to 54 secs.
>
> We're quite happy with that performance increase. In fact we're
> delighted, so thanks for all the help in getting us to this stage.
>
> We have kept copies of the query planner bytecode output if anybody is
> interested. Gunter has had copies, but if anybody else would like
> them, please ask.
>
> Many thanks again for all the help,
>
> Rob
>
> On 17 Mar 2017, at 22:12, Rob Willett wrote:
>
>> Dear all,
>>
>> We progress steadily forward.
>>
>> 1. We immediately halved our execution time by moving to an updated
>> index that doesn't use COLLATE. Thats 32mins to 16mins.
>>
>> 2. We then shaved a further six minutes off the execution time by
>> removing extraneous fields in the select statement, so instead of
>> "select * ...", we identified which fields we used and directly
>> selected those. So we are now down to 10 mins or 1/3 of when we
>> started for, to be honest, virtually no extra work, merely being
>> smarter, or rather you being smarter.
>>
>> 3. We have looked through all our indexes and can see that every
>> index has a COLLATE against it, even if the column is an integer. We
>> have raised a support call with Navicat.
>>
>> 4. The next step is to create a "covering index" to try and get the
>> whole of the query into the index. However its 22:11 in London and I
>> need to get home.
>>
>> Thanks very much for the help so far. Tomorrow is more tricky but
>> I'll read up on covering indexes to see how to use them,.
>>
>> Rob
>>
>> On 17 Mar 2017, at 18:39, Simon Slavin wrote:
>>
>>> On 17 Mar 2017, at 6:22pm, Rob Willett
>>> <[hidden email]> wrote:
>>>
>>>> 4. Work through returning just the columns we actually need from
>>>> our queries. We have a recollection that if we can build an index
>>>> with all the information necessary in it, we can do all the work in
>>>> joins rather than paging out to disk. Is this what you are
>>>> referring to?
>>>
>>> It works only where all the columns you need to read are in the same
>>> table.  The ideal form of a covering index is to have the columns
>>> listed in this order:
>>>
>>> 1) columns needed for the WHERE clause
>>> 2) columns needed for the ORDER BY clause which aren’t in (1)
>>> 3) columns needed to be read which aren’t in (2) or (1)
>>>
>>> SQLite detects that all the information it needs for the SELECT is
>>> available from the index, so it doesn’t bother to read the table
>>> at all.  This can lead to something like a doubling of speed.  Of
>>> course, you sacrifice filespace, and making changes to the table
>>> takes a little longer.
>>>
>>>> 5. Sleep (not exactly sure when) and watch three international
>>>> rugby games tomorrow.
>>>
>>> Sleep while waiting for indexes to be created and ANALYZE to work.  
>>> May you see skilled players, creative moves and dramatic play.
>>>
>>> Simon.
>>> _______________________________________________
>>> sqlite-users mailing list
>>> [hidden email]
>>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
12