Is it possible to have query planner use FTS Index for ORDER BY

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Is it possible to have query planner use FTS Index for ORDER BY

Dave Waters
I have a large FTS4 table (around 200 million rows and growing).  A simple
query (SELECT * FROM main WHERE main MATCH 'data') returns in less than a
second.  However, if an ORDER BY is added (SELECT * FROM main WHERE main
MATCH 'data' ORDER BY udate) it will never return (after 20 mins, I
canceled the query).

EXPLAIN QUERY PLAN on the ORDER BY statement shows 'USE TEMP B-TREE FOR
ORDER BY'.  Shouldn't it attempt to use the available FTS Index for ORDER
BY?  If not, Is there another method (or work around) to get the data back
in order?

Thank you.

Dave
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Is it possible to have query planner use FTS Index for ORDER BY

Dan Kennedy-4
On 08/20/2015 12:38 PM, David Waters wrote:
> I have a large FTS4 table (around 200 million rows and growing).  A simple
> query (SELECT * FROM main WHERE main MATCH 'data') returns in less than a
> second.  However, if an ORDER BY is added (SELECT * FROM main WHERE main
> MATCH 'data' ORDER BY udate) it will never return (after 20 mins, I
> canceled the query).
>
> EXPLAIN QUERY PLAN on the ORDER BY statement shows 'USE TEMP B-TREE FOR
> ORDER BY'.  Shouldn't it attempt to use the available FTS Index for ORDER
> BY?

I don't see how it could. The FTS index is not a list of rows sorted by
udate.

> If not, Is there another method (or work around) to get the data back
> in order?

You could use an external content FTS index. Then put a regular index on
the udate column of your external content table and query it directly
for non-fulltext queries.

   https://www.sqlite.org/fts3.html#section_6_2_2

Dan.

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Is it possible to have query planner use FTS Index for ORDER BY

Dave Waters
Sorry.  I should have specified that that 'udate' is one of the indexed
columns of the FTS4 table main.

The goal is to do a FTS query and be able to ensure the results are ordered
the same each time (via the ORDER BY).  It seemed at first to me that the
FTS index contains what is needed for the ORDER BY, but that is likely
incorrect.

Thanks.

DW

On Thursday, August 20, 2015, Dan Kennedy <[hidden email]> wrote:

> On 08/20/2015 12:38 PM, David Waters wrote:
>
>> I have a large FTS4 table (around 200 million rows and growing).  A simple
>> query (SELECT * FROM main WHERE main MATCH 'data') returns in less than a
>> second.  However, if an ORDER BY is added (SELECT * FROM main WHERE main
>> MATCH 'data' ORDER BY udate) it will never return (after 20 mins, I
>> canceled the query).
>>
>> EXPLAIN QUERY PLAN on the ORDER BY statement shows 'USE TEMP B-TREE FOR
>> ORDER BY'.  Shouldn't it attempt to use the available FTS Index for ORDER
>> BY?
>>
>
> I don't see how it could. The FTS index is not a list of rows sorted by
> udate.
>
> If not, Is there another method (or work around) to get the data back
>> in order?
>>
>
> You could use an external content FTS index. Then put a regular index on
> the udate column of your external content table and query it directly for
> non-fulltext queries.
>
>   https://www.sqlite.org/fts3.html#section_6_2_2
>
> Dan.
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


--
Dave
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Is it possible to have query planner use FTS Index for ORDER BY

Vladimir Vissoultchev
Try to rewrite the query like this

SELECT * FROM main
WHERE rowid IN (SELECT rowid FROM main WHERE main MATCH 'data')
ORDER BY udate

cheers,
</wqw>

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of David
Waters
Sent: Thursday, August 20, 2015 4:47 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Is it possible to have query planner use FTS Index for
ORDER BY

Sorry.  I should have specified that that 'udate' is one of the indexed
columns of the FTS4 table main.

The goal is to do a FTS query and be able to ensure the results are ordered
the same each time (via the ORDER BY).  It seemed at first to me that the
FTS index contains what is needed for the ORDER BY, but that is likely
incorrect.

Thanks.

DW

On Thursday, August 20, 2015, Dan Kennedy <[hidden email]> wrote:

> On 08/20/2015 12:38 PM, David Waters wrote:
>
>> I have a large FTS4 table (around 200 million rows and growing).  A
>> simple query (SELECT * FROM main WHERE main MATCH 'data') returns in
>> less than a second.  However, if an ORDER BY is added (SELECT * FROM
>> main WHERE main MATCH 'data' ORDER BY udate) it will never return
>> (after 20 mins, I canceled the query).
>>
>> EXPLAIN QUERY PLAN on the ORDER BY statement shows 'USE TEMP B-TREE
>> FOR ORDER BY'.  Shouldn't it attempt to use the available FTS Index
>> for ORDER BY?
>>
>
> I don't see how it could. The FTS index is not a list of rows sorted
> by udate.
>
> If not, Is there another method (or work around) to get the data back
>> in order?
>>
>
> You could use an external content FTS index. Then put a regular index
> on the udate column of your external content table and query it
> directly for non-fulltext queries.
>
>   https://www.sqlite.org/fts3.html#section_6_2_2
>
> Dan.
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


--
Dave
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Is it possible to have query planner use FTS Index for ORDER BY

Dave Waters
Thanks Dan and Vladimir.  A combination of the two approaches is required.
I'll detail it here for future reference:

Using the sub query against the FTS table as suggested:

SELECT * FROM main
WHERE rowid IN (SELECT rowid FROM main WHERE main MATCH 'data')
ORDER BY udate

Still causes the USE TEMP B-TREE FOR ORDER BY.  As Dan pointed out, the FTS
"index" isn't a traditional SQL Index.  However, by using the suggested
content table to populate the FTS table, I was able to get the desired
result, with just a few more steps (and disk space) required:

CREATE TABLE IF NOT EXISTS content1 (id INTEGER PRIMARY KEY, udate, msg)
CREATE VIRTUAL TABLE IF NOT EXISTS main USING
fts4(content="content1",id,udate,msg)
CREATE TRIGGER IF NOT EXISTS content_trig1 AFTER INSERT ON content1 BEGIN
INSERT INTO main(docid,udate,msg) VALUES(new.rowid, new.udate,new.msg); END
CREATE INDEX IF NOT EXISTS content1_index on content1 (udate)

To query with ORDER BY, I had to also add the "INDEXED BY" to get the Query
Planner to use the Index, and not a temp B-tree.

SELECT * FROM content1 INDEXED BY content1_index WHERE rowid IN (SELECT
rowid FROM main WHERE main MATCH 'data to match') ORDER BY odate"

And I now have fast (and sorted) FTS queries again, with no temp B-tree!

Thanks.

Dave



On Thu, Aug 20, 2015 at 11:23 AM, Vladimir Vissoultchev <[hidden email]>
wrote:

> Try to rewrite the query like this
>
> SELECT * FROM main
> WHERE rowid IN (SELECT rowid FROM main WHERE main MATCH 'data')
> ORDER BY udate
>
> cheers,
> </wqw>
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of David
> Waters
> Sent: Thursday, August 20, 2015 4:47 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Is it possible to have query planner use FTS Index
> for
> ORDER BY
>
> Sorry.  I should have specified that that 'udate' is one of the indexed
> columns of the FTS4 table main.
>
> The goal is to do a FTS query and be able to ensure the results are ordered
> the same each time (via the ORDER BY).  It seemed at first to me that the
> FTS index contains what is needed for the ORDER BY, but that is likely
> incorrect.
>
> Thanks.
>
> DW
>
> On Thursday, August 20, 2015, Dan Kennedy <[hidden email]> wrote:
>
> > On 08/20/2015 12:38 PM, David Waters wrote:
> >
> >> I have a large FTS4 table (around 200 million rows and growing).  A
> >> simple query (SELECT * FROM main WHERE main MATCH 'data') returns in
> >> less than a second.  However, if an ORDER BY is added (SELECT * FROM
> >> main WHERE main MATCH 'data' ORDER BY udate) it will never return
> >> (after 20 mins, I canceled the query).
> >>
> >> EXPLAIN QUERY PLAN on the ORDER BY statement shows 'USE TEMP B-TREE
> >> FOR ORDER BY'.  Shouldn't it attempt to use the available FTS Index
> >> for ORDER BY?
> >>
> >
> > I don't see how it could. The FTS index is not a list of rows sorted
> > by udate.
> >
> > If not, Is there another method (or work around) to get the data back
> >> in order?
> >>
> >
> > You could use an external content FTS index. Then put a regular index
> > on the udate column of your external content table and query it
> > directly for non-fulltext queries.
> >
> >   https://www.sqlite.org/fts3.html#section_6_2_2
> >
> > Dan.
> >
> > _______________________________________________
> > sqlite-users mailing list
> > [hidden email]
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
>
>
> --
> Dave
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



--
Dave
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Is it possible to have query planner use FTS Index for ORDER BY

Vladimir Vissoultchev
Actually, doubt it's possible to use `content1_index` for ordering in your
case once `content` rows are filtered by the IN operator. Most probably the
index is just ignored here.

Do you get any measurable performance improvement with vs w/o it?

cheers,
</wqw>

-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of David
Waters
Sent: Thursday, August 20, 2015 8:58 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Is it possible to have query planner use FTS Index for
ORDER BY

Thanks Dan and Vladimir.  A combination of the two approaches is required.
I'll detail it here for future reference:

Using the sub query against the FTS table as suggested:

SELECT * FROM main
WHERE rowid IN (SELECT rowid FROM main WHERE main MATCH 'data') ORDER BY
udate

Still causes the USE TEMP B-TREE FOR ORDER BY.  As Dan pointed out, the FTS
"index" isn't a traditional SQL Index.  However, by using the suggested
content table to populate the FTS table, I was able to get the desired
result, with just a few more steps (and disk space) required:

CREATE TABLE IF NOT EXISTS content1 (id INTEGER PRIMARY KEY, udate, msg)
CREATE VIRTUAL TABLE IF NOT EXISTS main USING
fts4(content="content1",id,udate,msg)
CREATE TRIGGER IF NOT EXISTS content_trig1 AFTER INSERT ON content1 BEGIN
INSERT INTO main(docid,udate,msg) VALUES(new.rowid, new.udate,new.msg); END
CREATE INDEX IF NOT EXISTS content1_index on content1 (udate)

To query with ORDER BY, I had to also add the "INDEXED BY" to get the Query
Planner to use the Index, and not a temp B-tree.

SELECT * FROM content1 INDEXED BY content1_index WHERE rowid IN (SELECT
rowid FROM main WHERE main MATCH 'data to match') ORDER BY odate"

And I now have fast (and sorted) FTS queries again, with no temp B-tree!

Thanks.

Dave



On Thu, Aug 20, 2015 at 11:23 AM, Vladimir Vissoultchev <[hidden email]>
wrote:

> Try to rewrite the query like this
>
> SELECT * FROM main
> WHERE rowid IN (SELECT rowid FROM main WHERE main MATCH 'data') ORDER
> BY udate
>
> cheers,
> </wqw>
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of
> David Waters
> Sent: Thursday, August 20, 2015 4:47 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Is it possible to have query planner use FTS
> Index for ORDER BY
>
> Sorry.  I should have specified that that 'udate' is one of the
> indexed columns of the FTS4 table main.
>
> The goal is to do a FTS query and be able to ensure the results are
> ordered the same each time (via the ORDER BY).  It seemed at first to
> me that the FTS index contains what is needed for the ORDER BY, but
> that is likely incorrect.
>
> Thanks.
>
> DW
>
> On Thursday, August 20, 2015, Dan Kennedy <[hidden email]> wrote:
>
> > On 08/20/2015 12:38 PM, David Waters wrote:
> >
> >> I have a large FTS4 table (around 200 million rows and growing).  A
> >> simple query (SELECT * FROM main WHERE main MATCH 'data') returns
> >> in less than a second.  However, if an ORDER BY is added (SELECT *
> >> FROM main WHERE main MATCH 'data' ORDER BY udate) it will never
> >> return (after 20 mins, I canceled the query).
> >>
> >> EXPLAIN QUERY PLAN on the ORDER BY statement shows 'USE TEMP B-TREE
> >> FOR ORDER BY'.  Shouldn't it attempt to use the available FTS Index
> >> for ORDER BY?
> >>
> >
> > I don't see how it could. The FTS index is not a list of rows sorted
> > by udate.
> >
> > If not, Is there another method (or work around) to get the data
> > back
> >> in order?
> >>
> >
> > You could use an external content FTS index. Then put a regular
> > index on the udate column of your external content table and query
> > it directly for non-fulltext queries.
> >
> >   https://www.sqlite.org/fts3.html#section_6_2_2
> >
> > Dan.
> >
> > _______________________________________________
> > sqlite-users mailing list
> > [hidden email]
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
>
>
> --
> Dave
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



--
Dave
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Is it possible to have query planner use FTS Index for ORDER BY

Dave Waters
I did.  My last post was based off testing it live - it works beautifully.
Lighting quick and sorted without a temp B-TREE:

EXPLAIN QUERY PLAN is now:

SCAN TABLE content1 USING INDEX content1_index (~100000 rows)
EXECUTE LIST SUBQUERY 1
SCAN TABLE main VIRTUAL TABLE INDEX 6: (~0 rows)

Dave


On Thu, Aug 20, 2015 at 2:36 PM, Vladimir Vissoultchev <[hidden email]>
wrote:

> Actually, doubt it's possible to use `content1_index` for ordering in your
> case once `content` rows are filtered by the IN operator. Most probably the
> index is just ignored here.
>
> Do you get any measurable performance improvement with vs w/o it?
>
> cheers,
> </wqw>
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of David
> Waters
> Sent: Thursday, August 20, 2015 8:58 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Is it possible to have query planner use FTS Index
> for
> ORDER BY
>
> Thanks Dan and Vladimir.  A combination of the two approaches is required.
> I'll detail it here for future reference:
>
> Using the sub query against the FTS table as suggested:
>
> SELECT * FROM main
> WHERE rowid IN (SELECT rowid FROM main WHERE main MATCH 'data') ORDER BY
> udate
>
> Still causes the USE TEMP B-TREE FOR ORDER BY.  As Dan pointed out, the FTS
> "index" isn't a traditional SQL Index.  However, by using the suggested
> content table to populate the FTS table, I was able to get the desired
> result, with just a few more steps (and disk space) required:
>
> CREATE TABLE IF NOT EXISTS content1 (id INTEGER PRIMARY KEY, udate, msg)
> CREATE VIRTUAL TABLE IF NOT EXISTS main USING
> fts4(content="content1",id,udate,msg)
> CREATE TRIGGER IF NOT EXISTS content_trig1 AFTER INSERT ON content1 BEGIN
> INSERT INTO main(docid,udate,msg) VALUES(new.rowid, new.udate,new.msg); END
> CREATE INDEX IF NOT EXISTS content1_index on content1 (udate)
>
> To query with ORDER BY, I had to also add the "INDEXED BY" to get the Query
> Planner to use the Index, and not a temp B-tree.
>
> SELECT * FROM content1 INDEXED BY content1_index WHERE rowid IN (SELECT
> rowid FROM main WHERE main MATCH 'data to match') ORDER BY odate"
>
> And I now have fast (and sorted) FTS queries again, with no temp B-tree!
>
> Thanks.
>
> Dave
>
>
>
> On Thu, Aug 20, 2015 at 11:23 AM, Vladimir Vissoultchev <[hidden email]>
> wrote:
>
> > Try to rewrite the query like this
> >
> > SELECT * FROM main
> > WHERE rowid IN (SELECT rowid FROM main WHERE main MATCH 'data') ORDER
> > BY udate
> >
> > cheers,
> > </wqw>
> >
> > -----Original Message-----
> > From: [hidden email]
> > [mailto:[hidden email]] On Behalf Of
> > David Waters
> > Sent: Thursday, August 20, 2015 4:47 PM
> > To: General Discussion of SQLite Database
> > Subject: Re: [sqlite] Is it possible to have query planner use FTS
> > Index for ORDER BY
> >
> > Sorry.  I should have specified that that 'udate' is one of the
> > indexed columns of the FTS4 table main.
> >
> > The goal is to do a FTS query and be able to ensure the results are
> > ordered the same each time (via the ORDER BY).  It seemed at first to
> > me that the FTS index contains what is needed for the ORDER BY, but
> > that is likely incorrect.
> >
> > Thanks.
> >
> > DW
> >
> > On Thursday, August 20, 2015, Dan Kennedy <[hidden email]> wrote:
> >
> > > On 08/20/2015 12:38 PM, David Waters wrote:
> > >
> > >> I have a large FTS4 table (around 200 million rows and growing).  A
> > >> simple query (SELECT * FROM main WHERE main MATCH 'data') returns
> > >> in less than a second.  However, if an ORDER BY is added (SELECT *
> > >> FROM main WHERE main MATCH 'data' ORDER BY udate) it will never
> > >> return (after 20 mins, I canceled the query).
> > >>
> > >> EXPLAIN QUERY PLAN on the ORDER BY statement shows 'USE TEMP B-TREE
> > >> FOR ORDER BY'.  Shouldn't it attempt to use the available FTS Index
> > >> for ORDER BY?
> > >>
> > >
> > > I don't see how it could. The FTS index is not a list of rows sorted
> > > by udate.
> > >
> > > If not, Is there another method (or work around) to get the data
> > > back
> > >> in order?
> > >>
> > >
> > > You could use an external content FTS index. Then put a regular
> > > index on the udate column of your external content table and query
> > > it directly for non-fulltext queries.
> > >
> > >   https://www.sqlite.org/fts3.html#section_6_2_2
> > >
> > > Dan.
> > >
> > > _______________________________________________
> > > sqlite-users mailing list
> > > [hidden email]
> > > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> > >
> >
> >
> > --
> > Dave
> > _______________________________________________
> > sqlite-users mailing list
> > [hidden email]
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
> > _______________________________________________
> > sqlite-users mailing list
> > [hidden email]
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
>
>
>
> --
> Dave
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



--
Dave
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users