Index and General Optimization Question

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Index and General Optimization Question

John Sample
Hello all.
I'm a new Sqlite user porting a large db for an open source project
from SQL Server.

I have a question regarding whether an index makes sense or will be
used for a certain scenario. Since the table in question is so large
it takes up to 8 hours for me to create new indexes, so I'd like to
see if I can get a best guess before trying.

A little background info:
This is for a GIS system. The Sqlite database is 2+ gigs at the moment.
The table in question has 17,000,000 rows and is performing well so far.
It has a multicolumn index on 4 columns which are always part of the queries.

NAME TYPE DIRP DIRS

There are 2 more columns that make up the rest of the select clause,
but it is an OR statement. A pseudo select clause would look like
this:

NAME='char' and TYPE='char' DIRP='char' DIRS='char' and (ZIPL =int OR ZIPR=int)

Now to the question:
Can/will a multicolumn index (ZIPL,ZIPR) be used for an OR clause?
If so, what if it was part of the other multicolumn index?

Thanks for any guidance!
Reply | Threaded
Open this post in threaded view
|

RE: Index and General Optimization Question

Steve O'Hara

Why don't you create and a very small version of the database, create the
indices and use the EXPLAIN command to see what it does for your queries.

Steve

-----Original Message-----
From: sqlite-users-return-7035-sohara=[hidden email]
[mailto:sqlite-users-return-7035-sohara=[hidden email]
rg]On Behalf Of John Sample
Sent: 12 August 2005 16:18
To: [hidden email]
Subject: [sqlite] Index and General Optimization Question


Hello all.
I'm a new Sqlite user porting a large db for an open source project
from SQL Server.

I have a question regarding whether an index makes sense or will be
used for a certain scenario. Since the table in question is so large
it takes up to 8 hours for me to create new indexes, so I'd like to
see if I can get a best guess before trying.

A little background info:
This is for a GIS system. The Sqlite database is 2+ gigs at the moment.
The table in question has 17,000,000 rows and is performing well so far.
It has a multicolumn index on 4 columns which are always part of the
queries.

NAME TYPE DIRP DIRS

There are 2 more columns that make up the rest of the select clause,
but it is an OR statement. A pseudo select clause would look like
this:

NAME='char' and TYPE='char' DIRP='char' DIRS='char' and (ZIPL =int OR
ZIPR=int)

Now to the question:
Can/will a multicolumn index (ZIPL,ZIPR) be used for an OR clause?
If so, what if it was part of the other multicolumn index?

Thanks for any guidance!




Reply | Threaded
Open this post in threaded view
|

Re: Index and General Optimization Question

D. Richard Hipp
In reply to this post by John Sample
On Fri, 2005-08-12 at 11:18 -0400, John Sample wrote:
> NAME='char' and TYPE='char' DIRP='char' DIRS='char' and (ZIPL =int OR ZIPR=int)
>
> Now to the question:
> Can/will a multicolumn index (ZIPL,ZIPR) be used for an OR clause?

Version 3.2.2 will not optimize the above.  But the latest code
in CVS will.  3.2.3 will probably be released relatively soon.

--
D. Richard Hipp <[hidden email]>

Reply | Threaded
Open this post in threaded view
|

Re: Index and General Optimization Question

Kurt Welgehausen
In reply to this post by John Sample
Currently, indices are not used to look up either term
in an OR expression.

See <http://www.sqlite.org/php2004/page-052.html>.

You could build 2 separate 5-column indices and use a
union.

Regards
Reply | Threaded
Open this post in threaded view
|

Re: Index and General Optimization Question

Khamis Abuelkomboz
Kurt Welgehausen wrote:

>Currently, indices are not used to look up either term
>in an OR expression.
>
>See <http://www.sqlite.org/php2004/page-052.html>.
>
>You could build 2 separate 5-column indices and use a
>union.
>
>Regards
>
his is not true. databases split usually queries into several sub
queres, if you use an OR clause in the middle of indexed fields. I don't
know, if sqlite is doing somelike things.

As example, imagine you have two fields in an index (field1, field2)

select field1, field2 where field1 = 'value' and (field2 = 'value1' or
field2 = 'value2')

the database will start two subqueris and make a unique union of both
results:

select field1, field2 where field1 = 'value' and field2 = 'value1'
select field1, field2 where field1 = 'value' and field2 = 'value2'


--
Try Code-Navigator on http://www.codenav.com
a source code navigating, analysis and developing tool.
It supports following languages:
    * C/C++
    * Java
    * .NET (including CSharp, VB.Net and other .NET components)
    * Classic Visual Basic
    * PHP, HTML, XML, ASP, CSS
    * Tcl/Tk,
    * Perl
    * Python
    * SQL,
    * m4 Preprocessor
    * Cobol

Reply | Threaded
Open this post in threaded view
|

Re: Index and General Optimization Question

Khamis Abuelkomboz
In reply to this post by John Sample
John Sample wrote:

>Hello all.
>I'm a new Sqlite user porting a large db for an open source project
>from SQL Server.
>
>I have a question regarding whether an index makes sense or will be
>used for a certain scenario. Since the table in question is so large
>it takes up to 8 hours for me to create new indexes, so I'd like to
>see if I can get a best guess before trying.
>
>A little background info:
>This is for a GIS system. The Sqlite database is 2+ gigs at the moment.
>The table in question has 17,000,000 rows and is performing well so far.
>It has a multicolumn index on 4 columns which are always part of the queries.
>
>NAME TYPE DIRP DIRS
>
>There are 2 more columns that make up the rest of the select clause,
>but it is an OR statement. A pseudo select clause would look like
>this:
>
>NAME='char' and TYPE='char' DIRP='char' DIRS='char' and (ZIPL =int OR ZIPR=int)
>
>Now to the question:
>Can/will a multicolumn index (ZIPL,ZIPR) be used for an OR clause?
>If so, what if it was part of the other multicolumn index?
>
>Thanks for any guidance!
>
>  
>
A new index only on the fields ZIPL or ZIPR would propably make your
queries even slower, because they are not part of the rest of fields.
Your query could be faster, if there are two many SIMILAR values for the
tuppel (NAME,TYPE,DIRP,DIRS), in this case I would add the following two
indices:

index1: NAME,TYPE,DIRP,DIRS,ZIPL
index2: NAME,TYPE,DIRP,DIRS,ZIPR

khamis

--
Try Code-Navigator on http://www.codenav.com
a source code navigating, analysis and developing tool.
It supports following languages:
    * C/C++
    * Java
    * .NET (including CSharp, VB.Net and other .NET components)
    * Classic Visual Basic
    * PHP, HTML, XML, ASP, CSS
    * Tcl/Tk,
    * Perl
    * Python
    * SQL,
    * m4 Preprocessor
    * Cobol