Using multi-value indexes for subset queries

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Using multi-value indexes for subset queries

wmertens
Hi all,

in experimenting with indexes I found that if you create an index on (a, b)
and then SELECT * FROM data WHERE a = 1 AND B = 2, it will use the index,
great.

However, if you write SELECT * FROM data WHERE a = 1, it won't use the
index. If you write SELECT * FROM data WHERE a = 1 AND B !=
some_impossible_value, it does use the index.

Is it not efficient to use an index, even if one of its values is not
constrained?

Case in point, I have an app that allows searching across 4 of parameters,
and I am wondering if I could just use a single 4-valued index, searching
for != impossible_val for unconstrained parameters. Surely that would be a
better use of resources than indexes on all combinations?

And if that is indeed the case, would it not be good that sqlite
automatically checks multi-value indexes where it can't find a specific
index?

Wondering Wout.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Using multi-value indexes for subset queries

Igor Tandetnik-2
On 8/9/2017 10:50 AM, Wout Mertens wrote:
> in experimenting with indexes I found that if you create an index on (a, b)
> and then SELECT * FROM data WHERE a = 1 AND B = 2, it will use the index,
> great.
>
> However, if you write SELECT * FROM data WHERE a = 1, it won't use the
> index.

Are you sure? How do you determine that? My experience is different.

> Case in point, I have an app that allows searching across 4 of parameters,
> and I am wondering if I could just use a single 4-valued index

If you have an index on (a, b, c, d), it'll help for conditions like "a=1" or "a=1 and b=2" or "a=1 and b=2 and c=3" - conditions that use a prefix on an index. But it won't help with "b=2" or "c=3" or "b=2 and c=3".

Think of it this way. Imagine you have a phone book, with entries sorted by last name and then first name. In this book, it's easy to find people named "Smith, John", or all people with last name of Smith - but it won't at all help to find all people with first name of John.
--
Igor Tandetnik

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Using multi-value indexes for subset queries

Hick Gunter
In reply to this post by wmertens
Your experiments are not reproducible unless you provide at least an indication of the schema.

Most probably, something you have not yet considered/revealed makes using the index to look up a irrelevant. Maybe a is declared as "integer primary key", making it an alias of the rowid.

-----Urspr√ľngliche Nachricht-----
Von: sqlite-users [mailto:[hidden email]] Im Auftrag von Wout Mertens
Gesendet: Mittwoch, 09. August 2017 16:51
An: SQLite mailing list <[hidden email]>
Betreff: [sqlite] Using multi-value indexes for subset queries

Hi all,

in experimenting with indexes I found that if you create an index on (a, b) and then SELECT * FROM data WHERE a = 1 AND B = 2, it will use the index, great.

However, if you write SELECT * FROM data WHERE a = 1, it won't use the index. If you write SELECT * FROM data WHERE a = 1 AND B != some_impossible_value, it does use the index.

Is it not efficient to use an index, even if one of its values is not constrained?

Case in point, I have an app that allows searching across 4 of parameters, and I am wondering if I could just use a single 4-valued index, searching for != impossible_val for unconstrained parameters. Surely that would be a better use of resources than indexes on all combinations?

And if that is indeed the case, would it not be good that sqlite automatically checks multi-value indexes where it can't find a specific index?

Wondering Wout.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: [hidden email]

This communication (including any attachments) is intended for the use of the intended recipient(s) only and may contain information that is confidential, privileged or legally protected. Any unauthorized use or dissemination of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the sender by return e-mail message and delete all copies of the original communication. Thank you for your cooperation.


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Using multi-value indexes for subset queries

David Raymond
In reply to this post by wmertens
There's a guideline for what sort of things SQLite will look for in an index and a query. There are plenty of more advanced ways to make use of indexes I'm sure, but currently those would fall into "future optimization opportunities."
http://www.sqlite.org/optoverview.html

In your case the index on a, b is indeed a candidate for use in the query, but apparently the planner doesn't think it will be more beneficial than the full scan. This is where ANALYZE can help by letting the planner have more info about the indexes.

You can always use "INDEXED BY" to force it to use a specific index, and if you get an error along the lines of "Error: no query solution" then you'll know it definitely can't be used, either because of a typo while making it or some other reason.


-----Original Message-----
From: sqlite-users [mailto:[hidden email]] On Behalf Of Wout Mertens
Sent: Wednesday, August 09, 2017 10:51 AM
To: SQLite mailing list
Subject: [sqlite] Using multi-value indexes for subset queries

Hi all,

in experimenting with indexes I found that if you create an index on (a, b)
and then SELECT * FROM data WHERE a = 1 AND B = 2, it will use the index,
great.

However, if you write SELECT * FROM data WHERE a = 1, it won't use the
index. If you write SELECT * FROM data WHERE a = 1 AND B !=
some_impossible_value, it does use the index.

Is it not efficient to use an index, even if one of its values is not
constrained?

Case in point, I have an app that allows searching across 4 of parameters,
and I am wondering if I could just use a single 4-valued index, searching
for != impossible_val for unconstrained parameters. Surely that would be a
better use of resources than indexes on all combinations?

And if that is indeed the case, would it not be good that sqlite
automatically checks multi-value indexes where it can't find a specific
index?

Wondering Wout.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Using multi-value indexes for subset queries

wmertens
Indeed, in trying to reproduce on a simple table it does use the index,
even with json_extract values. I must be doing something wrong in my app,
thanks.



On Wed, Aug 9, 2017 at 5:07 PM David Raymond <[hidden email]>
wrote:

> There's a guideline for what sort of things SQLite will look for in an
> index and a query. There are plenty of more advanced ways to make use of
> indexes I'm sure, but currently those would fall into "future optimization
> opportunities."
> http://www.sqlite.org/optoverview.html
>
> In your case the index on a, b is indeed a candidate for use in the query,
> but apparently the planner doesn't think it will be more beneficial than
> the full scan. This is where ANALYZE can help by letting the planner have
> more info about the indexes.
>
> You can always use "INDEXED BY" to force it to use a specific index, and
> if you get an error along the lines of "Error: no query solution" then
> you'll know it definitely can't be used, either because of a typo while
> making it or some other reason.
>
>
> -----Original Message-----
> From: sqlite-users [mailto:[hidden email]]
> On Behalf Of Wout Mertens
> Sent: Wednesday, August 09, 2017 10:51 AM
> To: SQLite mailing list
> Subject: [sqlite] Using multi-value indexes for subset queries
>
> Hi all,
>
> in experimenting with indexes I found that if you create an index on (a, b)
> and then SELECT * FROM data WHERE a = 1 AND B = 2, it will use the index,
> great.
>
> However, if you write SELECT * FROM data WHERE a = 1, it won't use the
> index. If you write SELECT * FROM data WHERE a = 1 AND B !=
> some_impossible_value, it does use the index.
>
> Is it not efficient to use an index, even if one of its values is not
> constrained?
>
> Case in point, I have an app that allows searching across 4 of parameters,
> and I am wondering if I could just use a single 4-valued index, searching
> for != impossible_val for unconstrained parameters. Surely that would be a
> better use of resources than indexes on all combinations?
>
> And if that is indeed the case, would it not be good that sqlite
> automatically checks multi-value indexes where it can't find a specific
> index?
>
> Wondering Wout.
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Using multi-value indexes for subset queries

wmertens
Sqlite is just really smart :) Doing a `count(*)` on my table with one
constraint of a two-valued index does a table scan and completes in 9ms,
and when I force use of the index, it's 100ms. I'll stop trying to
second-guess the query optimizer now ;)

On Wed, Aug 9, 2017 at 5:11 PM Wout Mertens <[hidden email]> wrote:

> Indeed, in trying to reproduce on a simple table it does use the index,
> even with json_extract values. I must be doing something wrong in my app,
> thanks.
>
>
>
> On Wed, Aug 9, 2017 at 5:07 PM David Raymond <[hidden email]>
> wrote:
>
>> There's a guideline for what sort of things SQLite will look for in an
>> index and a query. There are plenty of more advanced ways to make use of
>> indexes I'm sure, but currently those would fall into "future optimization
>> opportunities."
>> http://www.sqlite.org/optoverview.html
>>
>> In your case the index on a, b is indeed a candidate for use in the
>> query, but apparently the planner doesn't think it will be more beneficial
>> than the full scan. This is where ANALYZE can help by letting the planner
>> have more info about the indexes.
>>
>> You can always use "INDEXED BY" to force it to use a specific index, and
>> if you get an error along the lines of "Error: no query solution" then
>> you'll know it definitely can't be used, either because of a typo while
>> making it or some other reason.
>>
>>
>> -----Original Message-----
>> From: sqlite-users [mailto:[hidden email]]
>> On Behalf Of Wout Mertens
>> Sent: Wednesday, August 09, 2017 10:51 AM
>> To: SQLite mailing list
>> Subject: [sqlite] Using multi-value indexes for subset queries
>>
>> Hi all,
>>
>> in experimenting with indexes I found that if you create an index on (a,
>> b)
>> and then SELECT * FROM data WHERE a = 1 AND B = 2, it will use the index,
>> great.
>>
>> However, if you write SELECT * FROM data WHERE a = 1, it won't use the
>> index. If you write SELECT * FROM data WHERE a = 1 AND B !=
>> some_impossible_value, it does use the index.
>>
>> Is it not efficient to use an index, even if one of its values is not
>> constrained?
>>
>> Case in point, I have an app that allows searching across 4 of parameters,
>> and I am wondering if I could just use a single 4-valued index, searching
>> for != impossible_val for unconstrained parameters. Surely that would be a
>> better use of resources than indexes on all combinations?
>>
>> And if that is indeed the case, would it not be good that sqlite
>> automatically checks multi-value indexes where it can't find a specific
>> index?
>>
>> Wondering Wout.
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users