json_group_array() and sorting

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

json_group_array() and sorting

Eric Grange-3
Hi,

Is json_group_array() supposed to honor a sorting clause or not ? (and
concatenation aggregates in general)

I have a query like

select json_group_array(json_object(
   'id', st.id,
   'num', st.numeric_field,
   ...bunch of fields here...
))
from some_table st
...bunch of joins here...
where ...bunch of conditions...
order by st.numeric_field desc
limit 50


but the resulting JSON array is not ordered according to the "order by",
but AFAICT by the st.id field (a primary key)
When not aggregating, the records are in the correct order.

Is it a bug or something expected ?

I can get the proper order when I use a subquery for the joins & filters,
and aggregate in a top level query, but that is
rather more verbose, and I am not sure the ordering being preserved in that
case is not just 'circumstancial' and could
be affected by future SQLite query optimizations.

Thanks!

Eric
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] json_group_array() and sorting

Hick Gunter
I don't recall that any (aggregate) function is concerned at all about the order in which rows are visited. The effect is only visible in non-commutative aggregates (e.g. concatenation).

If you want the arguments presented to an aggregate function in a specific order, then you need to enforce that order, with an order by clause in a subselect if necessary.

If you have an order by clause which is already fulfilled by the visitation order, SQLite will not sort again.

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:[hidden email]] Im Auftrag von Eric Grange
Gesendet: Dienstag, 08. Jänner 2019 09:17
An: General Discussion of SQLite Database <[hidden email]>
Betreff: [EXTERNAL] [sqlite] json_group_array() and sorting

Hi,

Is json_group_array() supposed to honor a sorting clause or not ? (and concatenation aggregates in general)

I have a query like

select json_group_array(json_object(
   'id', st.id,
   'num', st.numeric_field,
   ...bunch of fields here...
))
from some_table st
...bunch of joins here...
where ...bunch of conditions...
order by st.numeric_field desc
limit 50


but the resulting JSON array is not ordered according to the "order by", but AFAICT by the st.id field (a primary key) When not aggregating, the records are in the correct order.

Is it a bug or something expected ?

I can get the proper order when I use a subquery for the joins & filters, and aggregate in a top level query, but that is rather more verbose, and I am not sure the ordering being preserved in that case is not just 'circumstancial' and could be affected by future SQLite query optimizations.

Thanks!

Eric
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick | Software Engineer | Scientific Games International GmbH | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] json_group_array() and sorting

Eric Grange
Thanks.

I think I may have encountered a "real" bug while ordering in a subquery.
I have simplified it in the following exemples:

select json_group_array(o) from (
   select json_object(
      'id', sb.id
   ) o
   from (
      select 1 id, 2 field
      ) sb
)


the json_group_array returns an array of JSON object, ie. [{"id":1}] while
in

select json_group_array(o) from (
   select json_object(
      'id', sb.id
   ) o
   from (
      select 1 id, 2 field
      ) sb
   order by sb.field desc
)

so with an added order by in the subquery, it returns and array of JSON
strings, ie. ["{\"id\":1}"]

In my particular case, I can work around the issue by using group_concat()
rather than json_group_array()

Can someone confirm whether this is a bug ?

Thanks!


Le mar. 8 janv. 2019 à 10:18, Hick Gunter <[hidden email]> a écrit :

> I don't recall that any (aggregate) function is concerned at all about the
> order in which rows are visited. The effect is only visible in
> non-commutative aggregates (e.g. concatenation).
>
> If you want the arguments presented to an aggregate function in a specific
> order, then you need to enforce that order, with an order by clause in a
> subselect if necessary.
>
> If you have an order by clause which is already fulfilled by the
> visitation order, SQLite will not sort again.
>
> -----Ursprüngliche Nachricht-----
> Von: sqlite-users [mailto:[hidden email]]
> Im Auftrag von Eric Grange
> Gesendet: Dienstag, 08. Jänner 2019 09:17
> An: General Discussion of SQLite Database <
> [hidden email]>
> Betreff: [EXTERNAL] [sqlite] json_group_array() and sorting
>
> Hi,
>
> Is json_group_array() supposed to honor a sorting clause or not ? (and
> concatenation aggregates in general)
>
> I have a query like
>
> select json_group_array(json_object(
>    'id', st.id,
>    'num', st.numeric_field,
>    ...bunch of fields here...
> ))
> from some_table st
> ...bunch of joins here...
> where ...bunch of conditions...
> order by st.numeric_field desc
> limit 50
>
>
> but the resulting JSON array is not ordered according to the "order by",
> but AFAICT by the st.id field (a primary key) When not aggregating, the
> records are in the correct order.
>
> Is it a bug or something expected ?
>
> I can get the proper order when I use a subquery for the joins & filters,
> and aggregate in a top level query, but that is rather more verbose, and I
> am not sure the ordering being preserved in that case is not just
> 'circumstancial' and could be affected by future SQLite query optimizations.
>
> Thanks!
>
> Eric
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
> ___________________________________________
>  Gunter Hick | Software Engineer | Scientific Games International GmbH |
> Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O)
> +43 1 80100 - 0
>
> May be privileged. May be confidential. Please delete if not the addressee.
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] json_group_array() and sorting

Dominique Devienne
On Tue, Jan 8, 2019 at 10:50 AM Eric Grange <[hidden email]> wrote:

> Thanks.
>
> I think I may have encountered a "real" bug while ordering in a subquery.
> I have simplified it in the following exemples:
>
> select json_group_array(o) from (
>    select json_object(
>       'id', sb.id
>    ) o
>    from (
>       select 1 id, 2 field
>       ) sb
> )
>
>
> the json_group_array returns an array of JSON object, ie. [{"id":1}] while
> in
>
> select json_group_array(o) from (
>    select json_object(
>       'id', sb.id
>    ) o
>    from (
>       select 1 id, 2 field
>       ) sb
>    order by sb.field desc
> )
>
> so with an added order by in the subquery, it returns and array of JSON
> strings, ie. ["{\"id\":1}"]
>
> In my particular case, I can work around the issue by using group_concat()
> rather than json_group_array()
>
> Can someone confirm whether this is a bug ?


My guess is that it works as "designed", even if this is surprising...

I believe that JSON1 leverages value "sub-types" [1], which allow chained
JSON
"documents" to be processed in an optimized fashion (to avoid
internal-representation
to text, and back conversions across JSON1 calls).

But when you add sorting to the mix, SQLite probably decides to "lose" the
subtype
and convert to string for some reasons.

Look at the plans. SQLite may "flatten" the 1st query, so that the JSON1
functions "chain"
correctly, preserving the subtype, while in the 2nd query's plan, the two
functions do not
"chain" anymore, "losing" the subtype.

This is a side-effect of subtypes being a bit of "wart" and not really part
of the type-system proper.
So they are easily lost along the way, in ways which depend on how the
planner "rewrites" the
query, as in your case. Subtypes are still very useful, but more of a
"pragmatic" solution, than
an elegant design, for once in SQLite. IMHO :).

This is just a guess though. DRH will likely shed more light on this.
Thanks, --DD

[1] https://www.sqlite.org/c3ref/value_subtype.html
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] json_group_array() and sorting

Shawn Wagner
In reply to this post by Eric Grange
Try using json_group_array(json(o))

On Tue, Jan 8, 2019, 1:50 AM Eric Grange <[hidden email] wrote:

> Thanks.
>
> I think I may have encountered a "real" bug while ordering in a subquery.
> I have simplified it in the following exemples:
>
> select json_group_array(o) from (
>    select json_object(
>       'id', sb.id
>    ) o
>    from (
>       select 1 id, 2 field
>       ) sb
> )
>
>
> the json_group_array returns an array of JSON object, ie. [{"id":1}] while
> in
>
> select json_group_array(o) from (
>    select json_object(
>       'id', sb.id
>    ) o
>    from (
>       select 1 id, 2 field
>       ) sb
>    order by sb.field desc
> )
>
> so with an added order by in the subquery, it returns and array of JSON
> strings, ie. ["{\"id\":1}"]
>
> In my particular case, I can work around the issue by using group_concat()
> rather than json_group_array()
>
> Can someone confirm whether this is a bug ?
>
> Thanks!
>
>
> Le mar. 8 janv. 2019 à 10:18, Hick Gunter <[hidden email]> a écrit :
>
> > I don't recall that any (aggregate) function is concerned at all about
> the
> > order in which rows are visited. The effect is only visible in
> > non-commutative aggregates (e.g. concatenation).
> >
> > If you want the arguments presented to an aggregate function in a
> specific
> > order, then you need to enforce that order, with an order by clause in a
> > subselect if necessary.
> >
> > If you have an order by clause which is already fulfilled by the
> > visitation order, SQLite will not sort again.
> >
> > -----Ursprüngliche Nachricht-----
> > Von: sqlite-users [mailto:[hidden email]]
> > Im Auftrag von Eric Grange
> > Gesendet: Dienstag, 08. Jänner 2019 09:17
> > An: General Discussion of SQLite Database <
> > [hidden email]>
> > Betreff: [EXTERNAL] [sqlite] json_group_array() and sorting
> >
> > Hi,
> >
> > Is json_group_array() supposed to honor a sorting clause or not ? (and
> > concatenation aggregates in general)
> >
> > I have a query like
> >
> > select json_group_array(json_object(
> >    'id', st.id,
> >    'num', st.numeric_field,
> >    ...bunch of fields here...
> > ))
> > from some_table st
> > ...bunch of joins here...
> > where ...bunch of conditions...
> > order by st.numeric_field desc
> > limit 50
> >
> >
> > but the resulting JSON array is not ordered according to the "order by",
> > but AFAICT by the st.id field (a primary key) When not aggregating, the
> > records are in the correct order.
> >
> > Is it a bug or something expected ?
> >
> > I can get the proper order when I use a subquery for the joins & filters,
> > and aggregate in a top level query, but that is rather more verbose, and
> I
> > am not sure the ordering being preserved in that case is not just
> > 'circumstancial' and could be affected by future SQLite query
> optimizations.
> >
> > Thanks!
> >
> > Eric
> > _______________________________________________
> > sqlite-users mailing list
> > [hidden email]
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
> >
> > ___________________________________________
> >  Gunter Hick | Software Engineer | Scientific Games International GmbH |
> > Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 |
> (O)
> > +43 1 80100 - 0
> >
> > May be privileged. May be confidential. Please delete if not the
> addressee.
> > _______________________________________________
> > sqlite-users mailing list
> > [hidden email]
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] json_group_array() and sorting

Dominique Devienne
In reply to this post by Dominique Devienne
On Tue, Jan 8, 2019 at 11:04 AM Dominique Devienne <[hidden email]>
wrote:
>
> On Tue, Jan 8, 2019 at 10:50 AM Eric Grange <[hidden email]> wrote:
>>
>> Can someone confirm whether this is a bug ?
>
>
> My guess is that it works as "designed", even if this is surprising...
>
> I believe that JSON1 leverages value "sub-types" [1], which allow chained
JSON
> "documents" to be processed in an optimized fashion (to avoid
internal-representation
> to text, and back conversions across JSON1 calls).
>
> But when you add sorting to the mix, SQLite probably decides to "lose"
the subtype
> and convert to string for some reasons.
>
> Look at the plans. SQLite may "flatten" the 1st query, so that the JSON1
functions "chain"
> correctly, preserving the subtype, while in the 2nd query's plan, the two
functions do not
> "chain" anymore, "losing" the subtype.
>
> This is a side-effect of subtypes being a bit of "wart" and not really
part of the type-system proper.
> So they are easily lost along the way, in ways which depend on how the
planner "rewrites" the
> query, as in your case. Subtypes are still very useful, but more of a
"pragmatic" solution, than
> an elegant design, for once in SQLite. IMHO :).
>
> This is just a guess though. DRH will likely shed more light on this.
Thanks, --DD
>
> [1] https://www.sqlite.org/c3ref/value_subtype.html

See also [2], which states "values [...] are transient and ephemeral.
[...]. The pointers will not survive sorting".

The pointer-passing APIs are different from the subtype one, but as [3]
states, they
both addressed the same issue, and likely obey similar rules, linked to
sqlite3_value in general.

This is IMHO what's going on. And I think Shawn's work-around will work :).
--DD

[2]
https://www.sqlite.org/bindptr.html#restrictions_on_the_use_of_pointer_values
[3] https://www.sqlite.org/bindptr.html#preventing_forged_pointers
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] json_group_array() and sorting

Eric Grange
Shawn's json_group_array(json(o)) works indeed, but it's also 30% slower in
my case than using

'[' || ifnull(group_concat(o, ','), '') || ']'


which is however more case specific and less obvious.

Would be nice to see the subtype passing be improved, as otherwise query
planner improvements
could end up wrecking existing queries where the subtype currently survives.

Eric


Le mar. 8 janv. 2019 à 11:41, Dominique Devienne <[hidden email]> a
écrit :

> On Tue, Jan 8, 2019 at 11:04 AM Dominique Devienne <[hidden email]>
> wrote:
> >
> > On Tue, Jan 8, 2019 at 10:50 AM Eric Grange <[hidden email]> wrote:
> >>
> >> Can someone confirm whether this is a bug ?
> >
> >
> > My guess is that it works as "designed", even if this is surprising...
> >
> > I believe that JSON1 leverages value "sub-types" [1], which allow chained
> JSON
> > "documents" to be processed in an optimized fashion (to avoid
> internal-representation
> > to text, and back conversions across JSON1 calls).
> >
> > But when you add sorting to the mix, SQLite probably decides to "lose"
> the subtype
> > and convert to string for some reasons.
> >
> > Look at the plans. SQLite may "flatten" the 1st query, so that the JSON1
> functions "chain"
> > correctly, preserving the subtype, while in the 2nd query's plan, the two
> functions do not
> > "chain" anymore, "losing" the subtype.
> >
> > This is a side-effect of subtypes being a bit of "wart" and not really
> part of the type-system proper.
> > So they are easily lost along the way, in ways which depend on how the
> planner "rewrites" the
> > query, as in your case. Subtypes are still very useful, but more of a
> "pragmatic" solution, than
> > an elegant design, for once in SQLite. IMHO :).
> >
> > This is just a guess though. DRH will likely shed more light on this.
> Thanks, --DD
> >
> > [1] https://www.sqlite.org/c3ref/value_subtype.html
>
> See also [2], which states "values [...] are transient and ephemeral.
> [...]. The pointers will not survive sorting".
>
> The pointer-passing APIs are different from the subtype one, but as [3]
> states, they
> both addressed the same issue, and likely obey similar rules, linked to
> sqlite3_value in general.
>
> This is IMHO what's going on. And I think Shawn's work-around will work :).
> --DD
>
> [2]
>
> https://www.sqlite.org/bindptr.html#restrictions_on_the_use_of_pointer_values
> [3] https://www.sqlite.org/bindptr.html#preventing_forged_pointers
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users