Common subexpression optimization of deterministic functions

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
44 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Common subexpression optimization of deterministic functions

Jens Alfke-2
SQLite 3.19 doesn’t seem to coalesce identical calls to a deterministic function. For example, in this query, where `fl_value` is a function I’ve registered as SQLITE_DETERMINISTIC:

SELECT key FROM kv_default
WHERE fl_value(body, 'contact.address.state') = 'CA'
   OR fl_value(body, 'contact.address.state') = 'WA'

fl_value gets called twice per row in the table, with the same inputs both times of course. As fl_value is not a cheap function — it’s similar to json_value — it would be a noticeable speedup if it were evaluated only once per row.

Is there a way I can restructure these (automatically generated) queries to do the refactoring explicitly? Sort of like assigning to a temporary variable in an imperative language? It looks like a WITH clause lets me do this syntactically, but I'm not sure if it'll make a difference at runtime.

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

Richard Hipp-3
On 9/12/17, Jens Alfke <[hidden email]> wrote:

> SQLite 3.19 doesn’t seem to coalesce identical calls to a deterministic
> function. For example, in this query, where `fl_value` is a function I’ve
> registered as SQLITE_DETERMINISTIC:
>
> SELECT key FROM kv_default
> WHERE fl_value(body, 'contact.address.state') = 'CA'
>    OR fl_value(body, 'contact.address.state') = 'WA'
>
> fl_value gets called twice per row in the table, with the same inputs both
> times of course. As fl_value is not a cheap function — it’s similar to
> json_value — it would be a noticeable speedup if it were evaluated only once
> per row.

   ... WHERE fl_value(body,'contact.address.state') IN ('CA','WA');

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

Jens Alfke-2


> On Sep 12, 2017, at 10:52 AM, Richard Hipp <[hidden email]> wrote:
>
>   ... WHERE fl_value(body,'contact.address.state') IN ('CA','WA');

This was just a simple example query. In general it's possible to have arbitrary queries that use arbitrary properties. For example it might be comparing a property to a specific value as well as checking the string's length.

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

Warren Young
In reply to this post by Jens Alfke-2
On Sep 12, 2017, at 11:25 AM, Jens Alfke <[hidden email]> wrote:
>
> SQLite 3.19 doesn’t seem to coalesce identical calls to a deterministic function. For example, in this query, where `fl_value` is a function I’ve registered as SQLITE_DETERMINISTIC:

From my reading of the docs, I don’t see that that is the purpose of SQLITE_DETERMINISTIC:

    https://www.sqlite.org/deterministic.html

The purpose is simply so that the SQLite internals know whether it is safe to use the user-defined function in certain query types.

Still, it *would* be a neat optimization to be able to mark a function as a “function” in the mathematical sense and have SQLite treat it as referentially transparent in contexts like this.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

Jens Alfke-2


> On Sep 12, 2017, at 11:09 AM, Warren Young <[hidden email]> wrote:
>
> From my reading of the docs, I don’t see that that is the purpose of SQLITE_DETERMINISTIC:
>
>    https://www.sqlite.org/deterministic.html <https://www.sqlite.org/deterministic.html>

Actually it is. "A deterministic function always gives the same answer when it has the same inputs." That is the definition of a mathematical (also called "pure") function. Such a function call can of course be factored out as a common subexpression.

> The purpose is simply so that the SQLite internals know whether it is safe to use the user-defined function in certain query types.

The reason SQLITE_DETERMINISTIC was added is to allow such functions to be used in indexes and then matched with the same function in a query. That allows indexing things like JSON properties (via the deterministic json_value function.)

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

Darren Duncan
On 2017-09-12 11:41 AM, Jens Alfke wrote:

>> On Sep 12, 2017, at 11:09 AM, Warren Young <[hidden email]> wrote:
>>
>> From my reading of the docs, I don’t see that that is the purpose of SQLITE_DETERMINISTIC:
>>
>>    https://www.sqlite.org/deterministic.html <https://www.sqlite.org/deterministic.html>
>
> Actually it is. "A deterministic function always gives the same answer when it has the same inputs." That is the definition of a mathematical (also called "pure") function. Such a function call can of course be factored out as a common subexpression.
>
>> The purpose is simply so that the SQLite internals know whether it is safe to use the user-defined function in certain query types.
>
> The reason SQLITE_DETERMINISTIC was added is to allow such functions to be used in indexes and then matched with the same function in a query. That allows indexing things like JSON properties (via the deterministic json_value function.)

Practically speaking any optimization to reduce actual calls to the
deterministic function would have to be at compile time to rewrite the query to
explicitly keep the result of the function and use it several times, which is
someone users can also do by writing the query differently.

The fact is, any runtime-level smarts to prevent multiple calls to the function
would have to involve creating and maintaining an index of inputs to outputs so
that the DBMS knows whether the function was already called with particular
inputs or not, so that would be an added complexity.

-- Darren Duncan

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

Jens Alfke-2


> On Sep 12, 2017, at 12:02 PM, Darren Duncan <[hidden email]> wrote:
>
> Practically speaking any optimization to reduce actual calls to the deterministic function would have to be at compile time to rewrite the query to explicitly keep the result of the function and use it several times,

Exactly.

> which is someone users can also do by writing the query differently.

Great — any advice on how to do it? I'm totally willing to do this :) but I'm not sure how. As I said, a WITH clause looks promising, but I don't know if that is purely syntactic sugar, like a macro. (And changing my query generator to factor common calls into WITH clauses would be a nontrivial amount of work, so I would like to get some assurance that it might help, before I try it.)

The CSE optimization has long been standard in traditional compilers, even though the programmer could get the same result by changing their code. (The same is true of many other optimizations.) The benefit is that it lets the developer write simpler, clearer code with less effort.

I realize SQLite doesn't have the kind of industrial-strength query optimizers that other SQL databases have, but (from an outside perspective) this seems like a fairly straightforward optimization. SQLite is already doing some similar tricks to recognize matching sub-expressions when it applies an expression-based index to a query, for example.

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

Richard Hipp-3
On 9/12/17, Jens Alfke <[hidden email]> wrote:
>
> I realize SQLite doesn't have the kind of industrial-strength query
> optimizers that other SQL databases have, but (from an outside perspective)
> this seems like a fairly straightforward optimization. SQLite is already
> doing some similar tricks to recognize matching sub-expressions when it
> applies an expression-based index to a query, for example.
>

The query planner in SQLite will hold its own against most others.

But we deliberately omit common subexpression elimination (CSE).  This
is because our research shows that out of the millions of queries that
SQLite compiles every second, only a very tiny fraction would actually
benefit from CSE, but checking for CSE is expensive in both memory and
CPU cycles, and all queries would have to pay the extra checking
overhead whether they benefit or not.

A traditional compiler like GCC is free to use as much memory, time,
and code space to implement esoteric optimizations as it wants,
because compilation happens separately from the application and the
build product will be reused many times and so the cost of compilation
is amortized over many executions.  But a query planner in an RDBMS
(which is really just a compiler that translates the SQL programming
language into some low-level representation - byte code in the case of
SQLite) is more constrained because the compilation happens at
application run-time and the number of uses of the build product is
approximately 1.  And so when writing a query planner, one must be
careful in the use of memory, time, and code space devoted to
optimizations.  This is particularly so for SQLite which is an
embedded RDBMS.

Our belief is that CSE would be not worth the extra memory, CPU, and
code space required to implement it since CSE simply does not come up
that often in SQL statements.  If, in the future, we find that people
begin coding more complex SQL statements which will more often benefit
from CSE, then we might revisit this decision.

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

Warren Young
In reply to this post by Jens Alfke-2
On Sep 12, 2017, at 12:41 PM, Jens Alfke <[hidden email]> wrote:
>
>> On Sep 12, 2017, at 11:09 AM, Warren Young <[hidden email]> wrote:
>>
>> From my reading of the docs, I don’t see that that is the purpose of SQLITE_DETERMINISTIC:
>>
>>   https://www.sqlite.org/deterministic.html <https://www.sqlite.org/deterministic.html>
>
> Actually it is. "A deterministic function always gives the same answer when it has the same inputs." That is the definition of a mathematical (also called "pure") function. Such a function call can of course be factored out as a common subexpression.

All true, but irrelevant is the *purpose* of this constant is not what you want it to be.

As I read the docs, the only purpose of this constant is so that SQLite can give an error if you use a non-deterministic function anywhere that would cause DB corruption.  That is all.

Could it be *extended* to mean what you want?  Of course, but that means you’re asking for a feature, not reporting a bug.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

Jens Alfke-2
In reply to this post by Richard Hipp-3


> On Sep 12, 2017, at 12:55 PM, Richard Hipp <[hidden email]> wrote:
>
> But we deliberately omit common subexpression elimination (CSE).  This
> is because our research shows that out of the millions of queries that
> SQLite compiles every second, only a very tiny fraction would actually
> benefit from CSE

OMG, I can see the BuzzFeed headline: "SQLite database engine is secretly sending your queries back to their server for analytics!" ;-)

Seriously, I see your point. But JSON support may have created a new class of queries that do benefit, because every reference to a JSON property is through a function call. So while a traditional query can use the same column name multiple times without incurring overhead, the JSON equivalent repeats a json_extract call multiple times, which does have overhead.

Traditional:
        SELECT * FROM stuff WHERE length(name) = 5 OR name = 'five';
JSON:
        SELECT * FROM stuff WHERE length(json_extract(jsn,'$.name')) = 5 OR json_extract(jsn,'$.name') = 'five';

If CSE optimization is too expensive to do by default, maybe it could be enabled by a flag when preparing the statement?

Unfortunately using a WITH clause doesn't seem to help; as I suspected, it's basically used as a macro that gets expanded. Here's one I tried:
        WITH docs AS (SELECT rowid as id, json_extract(jsn,'$.name') AS name FROM stuff)
        SELECT id FROM docs WHERE length(name) <= 5 OR name = 'five';
The EXPLAIN command shows that this query still makes two identical calls to json_extract per row.

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

Jens Alfke-2
In reply to this post by Warren Young


> On Sep 12, 2017, at 12:58 PM, Warren Young <[hidden email]> wrote:
>
> Could it be *extended* to mean what you want?  Of course, but that means you’re asking for a feature, not reporting a bug.

I never claimed to be reporting a bug! The subject line refers to this as an "optimization", and in the initial message I said "it would be a noticeable speedup".

I pointed out the SQLITE_DETERMINISTIC flag because it already gives the query optimizer the information it would need to detect that a function call is pure and can safely be factored out.

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

Darko Volaric
In reply to this post by Jens Alfke-2
You can implement this by using user defined functions to implement row "local variables" or "registers". They're single assignment storage that keeps intermediate results, namely the common subexpressions.

You'd define two functions, something like Get(rowid, name) and Set(rowid, name, value). You call Set with the subexpressions as the last parameter. It doesn't return any value and just stores the value. The Get function returns a previously set value with the given name and is used in the expressions where that subexpression would otherwise appear. Note that you would need to order these so they are evaluated in dependency order, i.e. ensure each name is set before it is got. The rowid parameter is used to detect when the row changes and the local variables are all cleared in readiness for the next row.



> On Sep 12, 2017, at 9:22 PM, Jens Alfke <[hidden email]> wrote:
>
>
>
>> On Sep 12, 2017, at 12:02 PM, Darren Duncan <[hidden email]> wrote:
>>
>> Practically speaking any optimization to reduce actual calls to the deterministic function would have to be at compile time to rewrite the query to explicitly keep the result of the function and use it several times,
>
> Exactly.
>
>> which is someone users can also do by writing the query differently.
>
> Great — any advice on how to do it? I'm totally willing to do this :) but I'm not sure how. As I said, a WITH clause looks promising, but I don't know if that is purely syntactic sugar, like a macro. (And changing my query generator to factor common calls into WITH clauses would be a nontrivial amount of work, so I would like to get some assurance that it might help, before I try it.)
>
> The CSE optimization has long been standard in traditional compilers, even though the programmer could get the same result by changing their code. (The same is true of many other optimizations.) The benefit is that it lets the developer write simpler, clearer code with less effort.
>
> I realize SQLite doesn't have the kind of industrial-strength query optimizers that other SQL databases have, but (from an outside perspective) this seems like a fairly straightforward optimization. SQLite is already doing some similar tricks to recognize matching sub-expressions when it applies an expression-based index to a query, for example.
>
> —Jens
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

Jens Alfke-2

> On Sep 12, 2017, at 1:41 PM, Darko Volaric <[hidden email]> wrote:
>
> You can implement this by using user defined functions to implement row "local variables" or "registers". They're single assignment storage that keeps intermediate results, namely the common subexpressions.

Thanks! That's a very interesting technique.

> Note that you would need to order these so they are evaluated in dependency order, i.e. ensure each name is set before it is got.

What's a good way to do that? Since SQL is a non-imperative language there isn't much notion of order of operations. I can imagine using a short-circuiting AND operator, but can I be guaranteed that this will always work?

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

raypoker79
In reply to this post by Jens Alfke-2
This is a scam please take me off this email the card has been reported stolen 


Sent via the Samsung Galaxy S7, an AT&T 4G LTE smartphone
-------- Original message --------From: Darko Volaric <[hidden email]> Date: 9/12/17  4:41 PM  (GMT-05:00) To: SQLite mailing list <[hidden email]>, Jens Alfke <[hidden email]> Subject: Re: [sqlite] Common subexpression optimization of deterministic
  functions
You can implement this by using user defined functions to implement row "local variables" or "registers". They're single assignment storage that keeps intermediate results, namely the common subexpressions.

You'd define two functions, something like Get(rowid, name) and Set(rowid, name, value). You call Set with the subexpressions as the last parameter. It doesn't return any value and just stores the value. The Get function returns a previously set value with the given name and is used in the expressions where that subexpression would otherwise appear. Note that you would need to order these so they are evaluated in dependency order, i.e. ensure each name is set before it is got. The rowid parameter is used to detect when the row changes and the local variables are all cleared in readiness for the next row.



> On Sep 12, 2017, at 9:22 PM, Jens Alfke <[hidden email]> wrote:
>
>
>
>> On Sep 12, 2017, at 12:02 PM, Darren Duncan <[hidden email]> wrote:
>>
>> Practically speaking any optimization to reduce actual calls to the deterministic function would have to be at compile time to rewrite the query to explicitly keep the result of the function and use it several times,
>
> Exactly.
>
>> which is someone users can also do by writing the query differently.
>
> Great — any advice on how to do it? I'm totally willing to do this :) but I'm not sure how. As I said, a WITH clause looks promising, but I don't know if that is purely syntactic sugar, like a macro. (And changing my query generator to factor common calls into WITH clauses would be a nontrivial amount of work, so I would like to get some assurance that it might help, before I try it.)
>
> The CSE optimization has long been standard in traditional compilers, even though the programmer could get the same result by changing their code. (The same is true of many other optimizations.) The benefit is that it lets the developer write simpler, clearer code with less effort.
>
> I realize SQLite doesn't have the kind of industrial-strength query optimizers that other SQL databases have, but (from an outside perspective) this seems like a fairly straightforward optimization. SQLite is already doing some similar tricks to recognize matching sub-expressions when it applies an expression-based index to a query, for example.
>
> —Jens
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

Darko Volaric
In reply to this post by Jens Alfke-2
Yeah that is a tricky bit, especially since the query optimizer might evaluate join expressions in an arbitrary order. A possible approach to this is to work out how to always get a particular expression evaluated first (that may well just be the left-most expression in the WHERE clause) then create a trivial function that takes any number of parameters and always returns true and pass all the Set functions as parameters in left to right dependency order. This should work becuase I believe function parameters are evaluated left to right before the function is called.


> On Sep 12, 2017, at 11:27 PM, Jens Alfke <[hidden email]> wrote:
>
>
>> On Sep 12, 2017, at 1:41 PM, Darko Volaric <[hidden email] <mailto:[hidden email]>> wrote:
>>
>> You can implement this by using user defined functions to implement row "local variables" or "registers". They're single assignment storage that keeps intermediate results, namely the common subexpressions.
>
> Thanks! That's a very interesting technique.
>
>> Note that you would need to order these so they are evaluated in dependency order, i.e. ensure each name is set before it is got.
>
> What's a good way to do that? Since SQL is a non-imperative language there isn't much notion of order of operations. I can imagine using a short-circuiting AND operator, but can I be guaranteed that this will always work?
>
> —Jens

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

Warren Young
In reply to this post by Jens Alfke-2
On Sep 12, 2017, at 2:36 PM, Jens Alfke <[hidden email]> wrote:
>
> On Sep 12, 2017, at 12:58 PM, Warren Young <[hidden email]> wrote:
>>
>> Could it be *extended* to mean what you want?  Of course, but that means you’re asking for a feature, not reporting a bug.
>
> I never claimed to be reporting a bug!

I read your reference to an older version as implying that you thought this was a regression in functionality.  That is, that you thought this change first appeared in 3.19.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Common subexpression optimization of deterministic functions

raypoker79
In reply to this post by Jens Alfke-2
Stop,,I'm not subscribed 


Sent via the Samsung Galaxy S7, an AT&T 4G LTE smartphone
-------- Original message --------From: Warren Young <[hidden email]> Date: 9/12/17  6:19 PM  (GMT-05:00) To: SQLite mailing list <[hidden email]> Subject: Re: [sqlite] Common subexpression optimization of deterministic
  functions
On Sep 12, 2017, at 2:36 PM, Jens Alfke <[hidden email]> wrote:
>
> On Sep 12, 2017, at 12:58 PM, Warren Young <[hidden email]> wrote:
>>
>> Could it be *extended* to mean what you want?  Of course, but that means you’re asking for a feature, not reporting a bug.
>
> I never claimed to be reporting a bug!

I read your reference to an older version as implying that you thought this was a regression in functionality.  That is, that you thought this change first appeared in 3.19.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] Common subexpression optimization of deterministic functions

Hick Gunter
In reply to this post by Jens Alfke-2
Try fl_value(...) IN (<list>)

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:[hidden email]] Im Auftrag von Jens Alfke
Gesendet: Dienstag, 12. September 2017 19:26
An: SQLite mailing list <[hidden email]>
Betreff: [EXTERNAL] [sqlite] Common subexpression optimization of deterministic functions

SQLite 3.19 doesn’t seem to coalesce identical calls to a deterministic function. For example, in this query, where `fl_value` is a function I’ve registered as SQLITE_DETERMINISTIC:

SELECT key FROM kv_default
WHERE fl_value(body, 'contact.address.state') = 'CA'
   OR fl_value(body, 'contact.address.state') = 'WA'

fl_value gets called twice per row in the table, with the same inputs both times of course. As fl_value is not a cheap function — it’s similar to json_value — it would be a noticeable speedup if it were evaluated only once per row.

Is there a way I can restructure these (automatically generated) queries to do the refactoring explicitly? Sort of like assigning to a temporary variable in an imperative language? It looks like a WITH clause lets me do this syntactically, but I'm not sure if it'll make a difference at runtime.

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: [hidden email]

This communication (including any attachments) is intended for the use of the intended recipient(s) only and may contain information that is confidential, privileged or legally protected. Any unauthorized use or dissemination of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the sender by return e-mail message and delete all copies of the original communication. Thank you for your cooperation.


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] Common subexpression optimization of deterministic functions

wmertens
Isn't that what cross join is for? Do a select on a virtual table to
calculate the value and then use that value in the real where clause?

On Wed, Sep 13, 2017, 9:10 AM Hick Gunter <[hidden email]> wrote:

> Try fl_value(...) IN (<list>)
>
> -----Ursprüngliche Nachricht-----
> Von: sqlite-users [mailto:[hidden email]]
> Im Auftrag von Jens Alfke
> Gesendet: Dienstag, 12. September 2017 19:26
> An: SQLite mailing list <[hidden email]>
> Betreff: [EXTERNAL] [sqlite] Common subexpression optimization of
> deterministic functions
>
> SQLite 3.19 doesn’t seem to coalesce identical calls to a deterministic
> function. For example, in this query, where `fl_value` is a function I’ve
> registered as SQLITE_DETERMINISTIC:
>
> SELECT key FROM kv_default
> WHERE fl_value(body, 'contact.address.state') = 'CA'
>    OR fl_value(body, 'contact.address.state') = 'WA'
>
> fl_value gets called twice per row in the table, with the same inputs both
> times of course. As fl_value is not a cheap function — it’s similar to
> json_value — it would be a noticeable speedup if it were evaluated only
> once per row.
>
> Is there a way I can restructure these (automatically generated) queries
> to do the refactoring explicitly? Sort of like assigning to a temporary
> variable in an imperative language? It looks like a WITH clause lets me do
> this syntactically, but I'm not sure if it'll make a difference at runtime.
>
> —Jens
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
> ___________________________________________
>  Gunter Hick
> Software Engineer
> Scientific Games International GmbH
> FN 157284 a, HG Wien
> Klitschgasse 2-4, A-1130 Vienna, Austria
> Tel: +43 1 80100 0
> E-Mail: [hidden email]
>
> This communication (including any attachments) is intended for the use of
> the intended recipient(s) only and may contain information that is
> confidential, privileged or legally protected. Any unauthorized use or
> dissemination of this communication is strictly prohibited. If you have
> received this communication in error, please immediately notify the sender
> by return e-mail message and delete all copies of the original
> communication. Thank you for your cooperation.
>
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] Common subexpression optimization of deterministic functions

Darko Volaric
I think people are missing the point, probably becuase it's not a great example. Consider the following statement:

SELECT funca(slow(10)), funkb(slow(10))

and lets say slow(10) takes an hour to compute, and funka and funkb take almost no time to execute. With common subexpression optimization the statement would take one hour, instead of two, to compute becuase the value of slow(10) would only be calculated once.


> On Sep 14, 2017, at 8:31 AM, Wout Mertens <[hidden email]> wrote:
>
> Isn't that what cross join is for? Do a select on a virtual table to
> calculate the value and then use that value in the real where clause?
>
> On Wed, Sep 13, 2017, 9:10 AM Hick Gunter <[hidden email]> wrote:
>
>> Try fl_value(...) IN (<list>)
>>
>> -----Ursprüngliche Nachricht-----
>> Von: sqlite-users [mailto:[hidden email]]
>> Im Auftrag von Jens Alfke
>> Gesendet: Dienstag, 12. September 2017 19:26
>> An: SQLite mailing list <[hidden email]>
>> Betreff: [EXTERNAL] [sqlite] Common subexpression optimization of
>> deterministic functions
>>
>> SQLite 3.19 doesn’t seem to coalesce identical calls to a deterministic
>> function. For example, in this query, where `fl_value` is a function I’ve
>> registered as SQLITE_DETERMINISTIC:
>>
>> SELECT key FROM kv_default
>> WHERE fl_value(body, 'contact.address.state') = 'CA'
>>   OR fl_value(body, 'contact.address.state') = 'WA'
>>
>> fl_value gets called twice per row in the table, with the same inputs both
>> times of course. As fl_value is not a cheap function — it’s similar to
>> json_value — it would be a noticeable speedup if it were evaluated only
>> once per row.
>>
>> Is there a way I can restructure these (automatically generated) queries
>> to do the refactoring explicitly? Sort of like assigning to a temporary
>> variable in an imperative language? It looks like a WITH clause lets me do
>> this syntactically, but I'm not sure if it'll make a difference at runtime.
>>
>> —Jens
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>>
>> ___________________________________________
>> Gunter Hick
>> Software Engineer
>> Scientific Games International GmbH
>> FN 157284 a, HG Wien
>> Klitschgasse 2-4, A-1130 Vienna, Austria
>> Tel: +43 1 80100 0
>> E-Mail: [hidden email]
>>
>> This communication (including any attachments) is intended for the use of
>> the intended recipient(s) only and may contain information that is
>> confidential, privileged or legally protected. Any unauthorized use or
>> dissemination of this communication is strictly prohibited. If you have
>> received this communication in error, please immediately notify the sender
>> by return e-mail message and delete all copies of the original
>> communication. Thank you for your cooperation.
>>
>>
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
123