About Index using

classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

About Index using

Bo Lin
 Hi ,

Here is a sql string ,like : select * from test where (a=0 or a=1) and b=1 ;
and  column a range from 1-10000, and  column b range from 0-1.  and DB has about 300,000 record with colum a and b configured randomly .

Two index is create on "test" table . First is on "column b" and the second is on "a,b"

but when I try to use "explain" , I found   the first index is used. but obviousely if sqlite can use the second index , the performance can be improved a lot .

how can I use the second index, can sqlite can support "select" to specify certain index ?

               
---------------------------------
 雅虎免费G邮箱-中国第一绝无垃圾邮件骚扰超大邮箱
 雅虎助手¨D搜索、杀毒、防骚扰  
Reply | Threaded
Open this post in threaded view
|

Re: About Index using

Igor Tandetnik
Bo Lin wrote:

> Here is a sql string ,like : select * from test where (a=0 or a=1)
> and b=1 ;
> and  column a range from 1-10000, and  column b range from 0-1.  and
> DB has about 300,000 record with colum a and b configured randomly .
>
> Two index is create on "test" table . First is on "column b" and the
> second is on "a,b"
>
> how can I use the second index, can sqlite can support "select" to
> specify certain index ?

I suspect "or" confuses the optimizer. SQLite's query planner is not as
sofisticated as that of some DBMS's (that are usually much larger and
infinitely more expensive).

I haven't tried it, but I strongly suspect if you restate your query as
shown below, it would use the index you want:

select * from test where a=0 and b=1
union all
select * from test where a=1 and b=1

Another possibility along the same lines is

select * from
(select 0 a union all select 1 a) ids join test
    on (test.a = ids.a and test.b = 1)


Also, running ANALYZE statement may help. It gathers statistics about
the value distribution in your indexes, which the query planner
presumably considers. I haven't personally seen it help much, but
perhaps in your situation with very different distributions it would.

Igor Tandetnik

Reply | Threaded
Open this post in threaded view
|

Re: About Index using

Nathan Kurz
In reply to this post by Bo Lin
On Mon, Dec 12, 2005 at 09:48:21AM +0800, Bo Lin wrote:

> Here is a sql string ,like : select * from test where (a=0 or a=1)
> and b=1 ; and column a range from 1-10000, and column b range from
> 0-1.  and DB has about 300,000 record with colum a and b configured
> randomly .
>
> Two index is create on "test" table . First is on "column b" and the
> second is on "a,b"
>
>  but when I try to use "explain" , I found the first index is
> used. but obviousely if sqlite can use the second index , the
> performance can be improved a lot .
>
>  how can I use the second index, can sqlite can support "select" to
> specify certain index ?

I'm sorry I can't just offer you the correct answer, but have you
looked at <http://sqlite.org/optoverview.html>?  It gives some good
hints on how to the choice of indices is made.

Good luck,

--nate
Reply | Threaded
Open this post in threaded view
|

回复: Re: [sqlite] About Index using

Bo Lin
thanks a lot for your help !

Nathan Kurz <[hidden email]> 写道: On Mon, Dec 12, 2005 at 09:48:21AM +0800, Bo Lin wrote:

> Here is a sql string ,like : select * from test where (a=0 or a=1)
> and b=1 ; and column a range from 1-10000, and column b range from
> 0-1.  and DB has about 300,000 record with colum a and b configured
> randomly .
>
> Two index is create on "test" table . First is on "column b" and the
> second is on "a,b"
>
>  but when I try to use "explain" , I found the first index is
> used. but obviousely if sqlite can use the second index , the
> performance can be improved a lot .
>
>  how can I use the second index, can sqlite can support "select" to
> specify certain index ?

I'm sorry I can't just offer you the correct answer, but have you
looked at ?  It gives some good
hints on how to the choice of indices is made.

Good luck,

--nate



               
---------------------------------
 雅虎免费G邮箱-中国第一绝无垃圾邮件骚扰超大邮箱
 雅虎助手¨D搜索、杀毒、防骚扰  
Reply | Threaded
Open this post in threaded view
|

Re: About Index using

D. Richard Hipp
In reply to this post by Bo Lin
Bo Lin <[hidden email]> wrote:

> Hi ,
>
> Here is a sql string ,like : select * from test where (a=0 or a=1) and b=1 ;
> and  column a range from 1-10000, and  column b range from 0-1.  and DB has about 300,000 record with colum a and b configured randomly .
>
> Two index is create on "test" table . First is on "column b" and the second is on "a,b"
>
> but when I try to use "explain" , I found   the first index is used. but obviousely if sqlite can use the second index , the performance can be improved a lot .
>
> how can I use the second index, can sqlite can support "select" to specify certain index ?
>

SQLite will only use one index at a time.  So create your index
like this:

   CREATE INDEX idx ON test(b,a);

It *should* also work to create the index this way:

   CREATE INDEX idx ON test(a,b);

But I just tried this and there appears to be a bug in the optimizer
that is preventing it from working properly.  I'll look into it.
--
D. Richard Hipp <[hidden email]>

Reply | Threaded
Open this post in threaded view
|

回复: [sqlite] Re: About Index using

Bo Lin
In reply to this post by Igor Tandetnik
Hi Igor

I just found a strange case , can you give me some explaination ?

I have a Table with about 800,000 record
DB Version : 3. 2. 7

The Sql looks like
    select x, y, sum(z)/1000 as bw from aa where        
        a=1 and b=1 and
       c =1 and d= 6 group by x, y having count(*) > 1 order by         bw desc  
       limit 10

Column "d" is all set to 6
1/3 of total  record is (a=1,b=1,c=1)
1/3 of total  record is (a=2,b=2,c=2)
1/3 of total  record is (a=3,b=3,c=3)

I just test the above Sql under different index (each time, I just create one index and drop another)
1)  create index 1 on  aa (d)
2)  create index 2 on  aa (a,b,c,d)

Expect :
    I think Using second index should be much faster than using first index.
Actually
    That two index almost give the same performance. both of them will take about 15 seconds

Why?  thanks in advice  :)


Igor Tandetnik <[hidden email]> 写道: Bo Lin wrote:

> Here is a sql string ,like : select * from test where (a=0 or a=1)
> and b=1 ;
> and  column a range from 1-10000, and  column b range from 0-1.  and
> DB has about 300,000 record with colum a and b configured randomly .
>
> Two index is create on "test" table . First is on "column b" and the
> second is on "a,b"
>
> how can I use the second index, can sqlite can support "select" to
> specify certain index ?

I suspect "or" confuses the optimizer. SQLite's query planner is not as
sofisticated as that of some DBMS's (that are usually much larger and
infinitely more expensive).

I haven't tried it, but I strongly suspect if you restate your query as
shown below, it would use the index you want:

select * from test where a=0 and b=1
union all
select * from test where a=1 and b=1

Another possibility along the same lines is

select * from
(select 0 a union all select 1 a) ids join test
    on (test.a = ids.a and test.b = 1)


Also, running ANALYZE statement may help. It gathers statistics about
the value distribution in your indexes, which the query planner
presumably considers. I haven't personally seen it help much, but
perhaps in your situation with very different distributions it would.

Igor Tandetnik




__________________________________________________
赶快注册雅虎超大容量免费邮箱?
http://cn.mail.yahoo.com
Reply | Threaded
Open this post in threaded view
|

回复: Re: [sqlite] About Index using

Bo Lin
In reply to this post by Nathan Kurz
Hi Nathan
 
 I just found a strange case , can you give me some explaination ?
 
 I have a Table with about 800,000 record
 DB Version : 3. 2. 7
 
 The Sql looks like
     select x, y, sum(z)/1000 as bw from aa where        
         a=1 and b=1 and
        c =1 and d= 6 group by x, y having count(*) > 1 order by         bw desc  
        limit 10
 
 Column "d" is all set to 6
 1/3 of total  record is (a=1,b=1,c=1)
 1/3 of total  record is (a=2,b=2,c=2)
 1/3 of total  record is (a=3,b=3,c=3)
 
 I just test the above Sql under different index (each time, I just create one index and drop another)
 1)  create index 1 on  aa (d)
 2)  create index 2 on  aa (a,b,c,d)
 
 Expect :
     I think Using second index should be much faster than using first index.
 Actually
     That two index almost give the same performance. both of them will take about 15 seconds
 
 Why?  thanks in advice  :)

Nathan Kurz <[hidden email]> 写道: On Mon, Dec 12, 2005 at 09:48:21AM +0800, Bo Lin wrote:

> Here is a sql string ,like : select * from test where (a=0 or a=1)
> and b=1 ; and column a range from 1-10000, and column b range from
> 0-1.  and DB has about 300,000 record with colum a and b configured
> randomly .
>
> Two index is create on "test" table . First is on "column b" and the
> second is on "a,b"
>
>  but when I try to use "explain" , I found the first index is
> used. but obviousely if sqlite can use the second index , the
> performance can be improved a lot .
>
>  how can I use the second index, can sqlite can support "select" to
> specify certain index ?

I'm sorry I can't just offer you the correct answer, but have you
looked at ?  It gives some good
hints on how to the choice of indices is made.

Good luck,

--nate



__________________________________________________
赶快注册雅虎超大容量免费邮箱?
http://cn.mail.yahoo.com
Reply | Threaded
Open this post in threaded view
|

回复: Re: [sqlite] About Index using

Bo Lin
In reply to this post by D. Richard Hipp
Hi Drh
 
 I just found a strange case , can you give me some explaination ?
 
 I have a Table with about 800,000 record
 DB Version : 3. 2. 7
 
 The Sql looks like
     select x, y, sum(z)/1000 as bw from aa where        
         a=1 and b=1 and
        c =1 and d= 6 group by x, y having count(*) > 1 order by         bw desc  
        limit 10
 
 Column "d" is all set to 6
 1/3 of total  record is (a=1,b=1,c=1)
 1/3 of total  record is (a=2,b=2,c=2)
 1/3 of total  record is (a=3,b=3,c=3)
 
 I just test the above Sql under different index (each time, I just create one index and drop another)
 1)  create index 1 on  aa (d)
 2)  create index 2 on  aa (a,b,c,d)
 
 Expect :
     I think Using second index should be much faster than using first index.
 Actually
     That two index almost give the same performance. both of them will take about 15 seconds
 
 Why?  thanks in advice  :)



[hidden email] 写道: Bo Lin  wrote:

> Hi ,
>
> Here is a sql string ,like : select * from test where (a=0 or a=1) and b=1 ;
> and  column a range from 1-10000, and  column b range from 0-1.  and DB has about 300,000 record with colum a and b configured randomly .
>
> Two index is create on "test" table . First is on "column b" and the second is on "a,b"
>
> but when I try to use "explain" , I found   the first index is used. but obviousely if sqlite can use the second index , the performance can be improved a lot .
>
> how can I use the second index, can sqlite can support "select" to specify certain index ?
>

SQLite will only use one index at a time.  So create your index
like this:

   CREATE INDEX idx ON test(b,a);

It *should* also work to create the index this way:

   CREATE INDEX idx ON test(a,b);

But I just tried this and there appears to be a bug in the optimizer
that is preventing it from working properly.  I'll look into it.
--
D. Richard Hipp




__________________________________________________
赶快注册雅虎超大容量免费邮箱?
http://cn.mail.yahoo.com
Reply | Threaded
Open this post in threaded view
|

Strange case in 3.2.7 , why Index create on multiple column isn't faster than index created on Single column

Bo Lin
In reply to this post by D. Richard Hipp

I have a Table with about 800,000 record
DB Version : 3. 2. 7

The Sql looks like
    select x, y, sum(z)/1000 as bw from aa where        
        a=1 and b=1 and
       c =1 and d= 6 group by x, y having count(*) > 1 order by         bw desc
       limit 10

Column "d" is all set to 6
1/3 of total  record is (a=1,b=1,c=1)
1/3 of total  record is (a=2,b=2,c=2)
1/3 of total  record is (a=3,b=3,c=3)

I just test the above Sql under different index (each time, I just create one index and drop another)
1)  create index 1 on  aa (d)
2)  create index 2 on  aa (a,b,c,d)

Expect :
    I think Using second index should be much faster than using first index.
Actually
    That two index almost give the same performance. both of them will take about 15 seconds

Why?  thanks in advice  :)
__________________________________________________
赶快注册雅虎超大容量免费邮箱?
http://cn.mail.yahoo.com
Reply | Threaded
Open this post in threaded view
|

Re: 回复: [sqlite] Re: About Index using

Igor Tandetnik
In reply to this post by Bo Lin
Bo Lin <blin_beijing-/[hidden email]> wrote:

> I just found a strange case , can you give me some explaination ?
>
> The Sql looks like
>    select x, y, sum(z)/1000 as bw from aa where
>        a=1 and b=1 and
>       c =1 and d= 6 group by x, y having count(*) > 1 order by      
>       bw desc limit 10
>
> Column "d" is all set to 6
> 1/3 of total  record is (a=1,b=1,c=1)
> 1/3 of total  record is (a=2,b=2,c=2)
> 1/3 of total  record is (a=3,b=3,c=3)

Where do x and y come from, what are their values?

Igor Tandetnik
Reply | Threaded
Open this post in threaded view
|

回复: [sqlite] Re: 回复: [sqlite] Re: About Index using

Bo Lin
x,y are both column of Table "aa", they may have random value .

And you think that will affect the performance.?

What I think is Sqlite will first use "where cluase" to get limited value ,then use "group by" in the same .

so , the perfomance depend "where" clause , correct?



Igor Tandetnik <[hidden email]> 写道: Bo Lin  wrote:

> I just found a strange case , can you give me some explaination ?
>
> The Sql looks like
>    select x, y, sum(z)/1000 as bw from aa where
>        a=1 and b=1 and
>       c =1 and d= 6 group by x, y having count(*) > 1 order by      
>       bw desc limit 10
>
> Column "d" is all set to 6
> 1/3 of total  record is (a=1,b=1,c=1)
> 1/3 of total  record is (a=2,b=2,c=2)
> 1/3 of total  record is (a=3,b=3,c=3)

Where do x and y come from, what are their values?

Igor Tandetnik



__________________________________________________
赶快注册雅虎超大容量免费邮箱?
http://cn.mail.yahoo.com
Reply | Threaded
Open this post in threaded view
|

Re: 回复: [sqlite] Re: 回复: [sqlite] Re: About Index using

Igor Tandetnik
Bo Lin wrote:
> x,y are both column of Table "aa", they may have random value .
>
> And you think that will affect the performance.?
>
> What I think is Sqlite will first use "where cluase" to get limited
> value ,then use "group by" in the same .

Maybe, maybe not. Use EXPLAIN to find out.

Igor Tandetnik