returning multiple rows from custom functions

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

returning multiple rows from custom functions

Andrew McDermott

Hi,

Is it possible to get an aggregate function to return more than one row?

For example, I'm currently computing a histogram in application code for
a moderately sized table (7+ million rows), but I'm wondering whether it
would be quicker to get the results computed via a custom SQLite
function.  I'm expecting it to be quicker because the current
implementation traverses the JNI boundary for each row in the result set
whereas a custom function wouldn't need to do this.

--
andy

Reply | Threaded
Open this post in threaded view
|

Re: returning multiple rows from custom functions

Nathan Kurz
On Thu, Dec 15, 2005 at 09:17:48PM +0000, Andrew McDermott wrote:
> For example, I'm currently computing a histogram in application code for
> a moderately sized table (7+ million rows), but I'm wondering whether it
> would be quicker to get the results computed via a custom SQLite
> function.  I'm expecting it to be quicker because the current
> implementation traverses the JNI boundary for each row in the result set
> whereas a custom function wouldn't need to do this.

No, it is not currently possible.  The way the VDBE is set up, all
functions return only a single value.  But do you need multiple rows,
or multiple values?  For a histogram, would multiple values suffice?

If so you can fake it.  With aggregate functions it's possible to
compute the histogram, and then return it in some special form.  You
could for example define a histogram function that returns a comma
separated list of text, or you could have it return an array as a blob.

> select histogram(column) from table;
> 1,4,1

Or you could split the histogram function into pieces, one for each
bin, and write a simple aggregate function that just counts the
occurences of one particular value:

> select number(column,1), number(column,2), number(column,3) from table;
> 1,4,1

Both of these would be quite easy to do.  That said, I'd really love
if it were possible to return either multiple values or multiple rows
from a user defined function.  It would make my life much easier,
since then I could work with the results at the SQL level rather than
the application level.  As it is, I'm currently I'm doing stuff like
writing functions that cache the results and return a handle, then
writing accessor functions that lookup the result via that handle.

My impression is that multiple values would be difficult, as the VDBE
parser would have to know in advance what columns would be returned by
the function to be able to use the results internally.  Multiple rows
(of a single value) seem like they would be a fairly simple addition,
as one would only need to specify that a function might return
multiple rows, and not how many or their type.

--nate
Reply | Threaded
Open this post in threaded view
|

Re: returning multiple rows from custom functions

Andrew McDermott

Nathan Kurz <[hidden email]> writes:

> On Thu, Dec 15, 2005 at 09:17:48PM +0000, Andrew McDermott wrote:
>> For example, I'm currently computing a histogram in application code for
>> a moderately sized table (7+ million rows), but I'm wondering whether it
>> would be quicker to get the results computed via a custom SQLite
>> function.  I'm expecting it to be quicker because the current
>> implementation traverses the JNI boundary for each row in the result set
>> whereas a custom function wouldn't need to do this.
>
> No, it is not currently possible.  The way the VDBE is set up, all
> functions return only a single value.  But do you need multiple rows,
> or multiple values?  For a histogram, would multiple values suffice?

Thanks for this suggestion.  I have done this and it works just fine.

>
> If so you can fake it.  With aggregate functions it's possible to
> compute the histogram, and then return it in some special form.  You
> could for example define a histogram function that returns a comma
> separated list of text, or you could have it return an array as a blob.
>
>> select histogram(column) from table;
>> 1,4,1
>
> Or you could split the histogram function into pieces, one for each
> bin, and write a simple aggregate function that just counts the
> occurences of one particular value:
>
>> select number(column,1), number(column,2), number(column,3) from table;
>> 1,4,1
>
> Both of these would be quite easy to do.  That said, I'd really love
> if it were possible to return either multiple values or multiple rows
> from a user defined function.  It would make my life much easier,
> since then I could work with the results at the SQL level rather than
> the application level.  As it is, I'm currently I'm doing stuff like
> writing functions that cache the results and return a handle, then
> writing accessor functions that lookup the result via that handle.
>
> My impression is that multiple values would be difficult, as the VDBE
> parser would have to know in advance what columns would be returned by
> the function to be able to use the results internally.  Multiple rows
> (of a single value) seem like they would be a fairly simple addition,
> as one would only need to specify that a function might return
> multiple rows, and not how many or their type.
>
> --nate
>

--
andy