Intersecting multiple queries

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Intersecting multiple queries

metanym
Hi,

I am building a list of UUIDs from multiple queries of the form:

SELECT uuid FROM Data WHERE filter LIKE ?

with a different bound parameter each time.

In app-space code, I'm getting the results of these queries and
intersecting them, so that the final list contains only UUIDs returned by
all of the queries.

I'm then populating a temporary table so that a further query can return
only rows matching those UUIDs.

What I'm wondering is if there's a shortcut to avoid having to build the
UUID list in app code -- whether I can perform an intersection in a query
to build that temporary table without involving app-space code. Or even
avoid having to build the temporary table at all?

Many thanks,
Hamish
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Intersecting multiple queries

Simon Slavin-3
On 28 Feb 2020, at 11:02pm, Hamish Allan <[hidden email]> wrote:

> What I'm wondering is if there's a shortcut to avoid having to build the UUID list in app code

I would probably start by building a list of the search patterns then see what I could do with it:

(pattern1,pattern2,pattern3)

Would a Common Table Expression do it ?

<https://sqlite.org/lang_with.html>

WITH pattern in (pattern1,pattern2,pattern3)
 … UNION ALL …
   SELECT uuid FROM Data WHERE filter LIKE pattern

Unfortunately I've not used them and don't know the right way to do it.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Intersecting multiple queries

Keith Medcalf
In reply to this post by metanym

select stuff from data where uuid in (select uuid from data where ... INTERSECT select uuid from data where ... INTERSECT select uuid from data where ... <repeat until finished>);

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.

>-----Original Message-----
>From: sqlite-users <[hidden email]> On
>Behalf Of Hamish Allan
>Sent: Friday, 28 February, 2020 16:02
>To: SQLite mailing list <[hidden email]>
>Subject: [sqlite] Intersecting multiple queries
>
>Hi,
>
>I am building a list of UUIDs from multiple queries of the form:
>
>SELECT uuid FROM Data WHERE filter LIKE ?
>
>with a different bound parameter each time.
>
>In app-space code, I'm getting the results of these queries and
>intersecting them, so that the final list contains only UUIDs returned by
>all of the queries.
>
>I'm then populating a temporary table so that a further query can return
>only rows matching those UUIDs.
>
>What I'm wondering is if there's a shortcut to avoid having to build the
>UUID list in app code -- whether I can perform an intersection in a query
>to build that temporary table without involving app-space code. Or even
>avoid having to build the temporary table at all?
>
>Many thanks,
>Hamish
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Intersecting multiple queries

Keith Medcalf

Of course, you could just as well do:

select * from data where (...) AND (...) AND (...) <repeat until finished>;

and do away with all the extraneous stuff.

In other words, why would one want to do:

select * from data where uuid in (select uuid from data where twit == 1 INTERSECT select uuid from data where twat == 1 INTERSECT select uuid from data where lastname like 'cricket%' INTERSECT select uuid from data where not firstname like 'jimmy%')

when one could simply forgo all the crap and do it directly:

select * from data where (twit == 1) and (twat == 1) and (lastname like 'cricket%') and (not firstname like 'jimmy%');




--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.

>-----Original Message-----
>From: Keith Medcalf <[hidden email]>
>Sent: Friday, 28 February, 2020 17:37
>To: 'SQLite mailing list' <[hidden email]>
>Subject: RE: [sqlite] Intersecting multiple queries
>
>
>select stuff from data where uuid in (select uuid from data where ...
>INTERSECT select uuid from data where ... INTERSECT select uuid from data
>where ... <repeat until finished>);
>
>--
>The fact that there's a Highway to Hell but only a Stairway to Heaven
>says a lot about anticipated traffic volume.
>
>>-----Original Message-----
>>From: sqlite-users <[hidden email]> On
>>Behalf Of Hamish Allan
>>Sent: Friday, 28 February, 2020 16:02
>>To: SQLite mailing list <[hidden email]>
>>Subject: [sqlite] Intersecting multiple queries
>>
>>Hi,
>>
>>I am building a list of UUIDs from multiple queries of the form:
>>
>>SELECT uuid FROM Data WHERE filter LIKE ?
>>
>>with a different bound parameter each time.
>>
>>In app-space code, I'm getting the results of these queries and
>>intersecting them, so that the final list contains only UUIDs returned
>by
>>all of the queries.
>>
>>I'm then populating a temporary table so that a further query can return
>>only rows matching those UUIDs.
>>
>>What I'm wondering is if there's a shortcut to avoid having to build the
>>UUID list in app code -- whether I can perform an intersection in a
>query
>>to build that temporary table without involving app-space code. Or even
>>avoid having to build the temporary table at all?
>>
>>Many thanks,
>>Hamish
>>_______________________________________________
>>sqlite-users mailing list
>>[hidden email]
>>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Intersecting multiple queries

Simon Slavin-3
On 29 Feb 2020, at 12:45am, Keith Medcalf <[hidden email]> wrote:

> select * from data where (...) AND (...) AND (...) <repeat until finished>;

SELECT uuid FROM Data
   WHERE filter LIKE ?
      OR filter LIKE ?
      OR filter LIKE ?
      OR filter LIKE ? …

would probably be an efficient way to do it if you could construct your entire statement.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Intersecting multiple queries

metanym
In reply to this post by Keith Medcalf
On Sat, 29 Feb 2020 at 00:45, Keith Medcalf <[hidden email]> wrote:

>
> In other words, why would one want to do:
>
> select * from data where uuid in (select uuid from data where twit == 1
> INTERSECT select uuid from data where twat == 1 INTERSECT select uuid from
> data where lastname like 'cricket%' INTERSECT select uuid from data where
> not firstname like 'jimmy%')
>
> when one could simply forgo all the crap and do it directly:
>
> select * from data where (twit == 1) and (twat == 1) and (lastname like
> 'cricket%') and (not firstname like 'jimmy%');


Good question, and one that I should have thought to answer pre-emptively:

1. The results of these queries are displayed to a user in an app in
tabular format. The user can select filters and change sort order
independently. The sort table (actually a view) has unique uuids whereas
the filter table can have multiple entries for each uuid.

So it made sense to me to build a temporary table with the unique uuid
results of filtering, then JOIN that table with the sort table for
presentation. That way, the sort order can change without having to
re-query the filters. I may be over-thinking this.

2. This is for a mobile app and I’m using the C API. So I’ve been trying to
use prepared queries. There can be any number of LIKE filters applied
(including none) and I figured it would be more expensive to build and
parse a new statement each time than to apply a pre-compiled one several
times. Again, I may be making incorrect assumptions.

I suppose what I’m trying to do is essentially:

First filter:
INSERT INTO TemporaryTable (uuid) SELECT DISTINCT uuid FROM FilterData
WHERE filter LIKE ?

Subsequent filters:
DELETE FROM TemporaryTable WHERE uuid NOT IN (SELECT DISTINCT uuid FROM
FilterData WHERE filter LIKE ?)

The common case is 0 filters, but 1 or 2 filters is not uncommon. And I
should mention that at some point I’m going to be adding FTS, in case this
changes matters!

Are my assumptions faulty? All advice gratefully received.

Thanks,
Hamish


>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Intersecting multiple queries

Jens Alfke-2

> On Feb 28, 2020, at 11:49 PM, Hamish Allan <[hidden email]> wrote:
>
> Again, I may be making incorrect assumptions.

Remember the old Knuth quote about the danger of premature optimization. What’s the size of your data set? Have you tried making a dummy database of the same size and experimenting with queries?

Compiling a statement is very fast in my experience. Milliseconds. Doing it once in response to a user command probably won’t be noticeable.

It’s true that LIKE queries tend to be slow because they can’t, in general, be optimized. But there are two possibilities you haven’t mentioned:
1. If they all involve prefix matching (“x LIKE ‘prefix%’) then SQLite will use an index on column x to limit the search to rows where x starts with ‘prefix’.
2. You can create an index on a LIKE query with a specific pattern, I.e. “CREATE INDEX foo ON table (x LIKE ‘%something%’) where (x LIKE ‘%something%’)”. I haven’t tested, but this should speed up a query using that specific LIKE test.

FTS does make these types of queries faster, as long as you’re looking for whole words or word prefixes.

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Intersecting multiple queries

metanym
Thanks Jens and everyone. I'll try the approach of compiling statements on
the fly.

Best wishes,
Hamish


On Sat, 29 Feb 2020 at 23:13, Jens Alfke <[hidden email]> wrote:

>
> > On Feb 28, 2020, at 11:49 PM, Hamish Allan <[hidden email]> wrote:
> >
> > Again, I may be making incorrect assumptions.
>
> Remember the old Knuth quote about the danger of premature optimization.
> What’s the size of your data set? Have you tried making a dummy database of
> the same size and experimenting with queries?
>
> Compiling a statement is very fast in my experience. Milliseconds. Doing
> it once in response to a user command probably won’t be noticeable.
>
> It’s true that LIKE queries tend to be slow because they can’t, in
> general, be optimized. But there are two possibilities you haven’t
> mentioned:
> 1. If they all involve prefix matching (“x LIKE ‘prefix%’) then SQLite
> will use an index on column x to limit the search to rows where x starts
> with ‘prefix’.
> 2. You can create an index on a LIKE query with a specific pattern, I.e.
> “CREATE INDEX foo ON table (x LIKE ‘%something%’) where (x LIKE
> ‘%something%’)”. I haven’t tested, but this should speed up a query using
> that specific LIKE test.
>
> FTS does make these types of queries faster, as long as you’re looking for
> whole words or word prefixes.
>
> —Jens
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users