Quantcast

json() number value parsing

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

json() number value parsing

Rolf Ade


> ./sqlite3
SQLite version 3.19.0 2017-04-07 20:20:08
[...]
sqlite> select json(' { "this" : 000.23 } ');
{"this":000.23}

If I read RFC 7159 (http://www.rfc-editor.org/rfc/rfc7159.txt) correct
this should return: "Error: malformed JSON".

http://sqlite.org/json1.html say, the json1 support "is disabled by
default". If I build from check-out just with ./configure it is
available, I think.

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: json() number value parsing

Jens Alfke-2

> On Apr 7, 2017, at 5:26 PM, Rolf Ade <[hidden email]> wrote:
>
> If I read RFC 7159 (http://www.rfc-editor.org/rfc/rfc7159.txt <http://www.rfc-editor.org/rfc/rfc7159.txt>) correct
> this should return: "Error: malformed JSON".

In this case I would go with Postel’s Law, paraphrased as “Be strict in what you write, but lenient in what you read.” I don’t see a point in disallowing something as trivial as redundant leading zeroes.

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: json() number value parsing

Olivier Mascia
> Le 9 avr. 2017 à 03:08, Jens Alfke <[hidden email]> a écrit :
>
>> On Apr 7, 2017, at 5:26 PM, Rolf Ade <[hidden email]> wrote:
>> ./sqlite3
>> SQLite version 3.19.0 2017-04-07 20:20:08
>> [...]
>> sqlite> select json(' { "this" : 000.23 } ');
>> {"this":000.23}
>> If I read RFC 7159 (http://www.rfc-editor.org/rfc/rfc7159.txt <http://www.rfc-editor.org/rfc/rfc7159.txt>) correct
>> this should return: "Error: malformed JSON".
>
> In this case I would go with Postel’s Law, paraphrased as “Be strict in what you write, but lenient in what you read.” I don’t see a point in disallowing something as trivial as redundant leading zeroes.

If you'd go with Postal's Law, you would make it so:

sqlite> select json(' { "this" : 000.23 } '); // be lenient in what you read
{"this":0.23} // be strict in what you write

:)

--
Best Regards, Meilleures salutations, Met vriendelijke groeten,
Olivier Mascia, http://integral.software


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: json() number value parsing

Dominique Devienne
On Sun, Apr 9, 2017 at 10:34 AM, Olivier Mascia <[hidden email]> wrote:

> > Le 9 avr. 2017 à 03:08, Jens Alfke <[hidden email]> a écrit :
> >
> >> On Apr 7, 2017, at 5:26 PM, Rolf Ade <[hidden email]> wrote:
> >> ./sqlite3
> >> SQLite version 3.19.0 2017-04-07 20:20:08
> >> [...]
> >> sqlite> select json(' { "this" : 000.23 } ');
> >> {"this":000.23}
> >> If I read RFC 7159 (http://www.rfc-editor.org/rfc/rfc7159.txt <
> http://www.rfc-editor.org/rfc/rfc7159.txt>) correct
> >> this should return: "Error: malformed JSON".
> >
> > In this case I would go with Postel’s Law, paraphrased as “Be strict in
> what you write, but lenient in what you read.” I don’t see a point in
> disallowing something as trivial as redundant leading zeroes.
>
> If you'd go with Postal's Law, you would make it so:
>
> sqlite> select json(' { "this" : 000.23 } ');   // be lenient in what you
> read
> {"this":0.23}                                   // be strict in what you
> write
>

I disagree. There's a spec. it should be followed, by default.

A separate, explicitly enabled "lenient mode" could be added,
for leading zeros, NaNs, C-comments, binary strings, etc...
but it should not be the default.

Otherwise you end up forever having to support non-conformity,
possibly simply because of an oversights.

Plus json_valid() is then lying about checking well-formed'ness. --DD

sqlite> select json_valid('');
0
sqlite> select json_valid('00.1');
1
sqlite> select json_valid('0.1');
1
sqlite> select json_valid('[]');
1
sqlite> select json_valid('[00]');
1
sqlite> select json_valid('[00.00]');
1
sqlite>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: json() number value parsing

Dominique Pellé
Dominique Devienne <[hidden email]> wrote:

> On Sun, Apr 9, 2017 at 10:34 AM, Olivier Mascia <[hidden email]> wrote:
>
>> > Le 9 avr. 2017 à 03:08, Jens Alfke <[hidden email]> a écrit :
>> >
>> >> On Apr 7, 2017, at 5:26 PM, Rolf Ade <[hidden email]> wrote:
>> >> ./sqlite3
>> >> SQLite version 3.19.0 2017-04-07 20:20:08
>> >> [...]
>> >> sqlite> select json(' { "this" : 000.23 } ');
>> >> {"this":000.23}
>> >> If I read RFC 7159 (http://www.rfc-editor.org/rfc/rfc7159.txt <
>> http://www.rfc-editor.org/rfc/rfc7159.txt>) correct
>> >> this should return: "Error: malformed JSON".
>> >
>> > In this case I would go with Postel’s Law, paraphrased as “Be strict in
>> what you write, but lenient in what you read.” I don’t see a point in
>> disallowing something as trivial as redundant leading zeroes.
>>
>> If you'd go with Postal's Law, you would make it so:
>>
>> sqlite> select json(' { "this" : 000.23 } ');   // be lenient in what you
>> read
>> {"this":0.23}                                   // be strict in what you
>> write
>>
>
> I disagree. There's a spec. it should be followed, by default.
>
> A separate, explicitly enabled "lenient mode" could be added,
> for leading zeros, NaNs, C-comments, binary strings, etc...
> but it should not be the default.
>
> Otherwise you end up forever having to support non-conformity,
> possibly simply because of an oversights.
>
> Plus json_valid() is then lying about checking well-formed'ness. --DD
>
> sqlite> select json_valid('');
> 0
> sqlite> select json_valid('00.1');
> 1
> sqlite> select json_valid('0.1');
> 1
> sqlite> select json_valid('[]');
> 1
> sqlite> select json_valid('[00]');
> 1
> sqlite> select json_valid('[00.00]');
> 1


SQLite json could be added to https://github.com/nst/JSONTestSuite
(in the parsers/ directory).

This JSON test suite checks many kinds of invalid and valid JSON
inputs to make sure that they are rejected or accepted as expected
by RFC 7159. It surprisingly finds issues in many JSON parsers.

Dominique
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: json() number value parsing

Richard Hipp-3
On 4/10/17, Dominique Pellé <[hidden email]> wrote:
>
> SQLite json could be added to https://github.com/nst/JSONTestSuite
> (in the parsers/ directory).
>

Thanks for the pointer.  I did not know about that project.

SQLite returns true from json_valid() for the following cases which
should allegedly be false:

  n_multidigit_number_then_00.json
  n_string_unescaped_newline.json
  n_string_unescaped_tab.json

This is the script that I ran to produce the results above:

CREATE TABLE files(name);
.mode csv
.import '| ls *.json' files

-- mal-formed JSON that json_valid shows as correct.
SELECT name FROM files
 WHERE name LIKE 'n_%'
  AND json_valid(readfile(name));

-- well-formed JSON that json_valid shows as incorrect.
SELECT name FROM files
 WHERE name LIKE 'y_%'
  AND NOT json_valid(readfile(name));

-- implementation-defined JSON does not cause crashes
SELECT name FROM files
 WHERE name LIKE 'i_%'
   AND json_valid(readfile(name)) NOT IN (0,1);

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: json() number value parsing

Richard Hipp-3
On 4/10/17, Richard Hipp <[hidden email]> wrote:
> SQLite returns true from json_valid() for the following cases which
> should allegedly be false:
>
>   n_multidigit_number_then_00.json
>   n_string_unescaped_newline.json
>   n_string_unescaped_tab.json
>

The second and third are now fixed on trunk.

The first test case consists of a valid JSON with an extra 0x00
terminator.  As SQLite interprets the 0x00 as an end-of-string marker,
I do not see this one as an error.

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: json() number value parsing

Rolf Ade
In reply to this post by Olivier Mascia
Am 04/09/2017 10:34 AM, Olivier Mascia wrote:

>> Le 9 avr. 2017 à 03:08, Jens Alfke <[hidden email]> a écrit :
>>
>>> On Apr 7, 2017, at 5:26 PM, Rolf Ade <[hidden email]> wrote:
>>> ./sqlite3
>>> SQLite version 3.19.0 2017-04-07 20:20:08
>>> [...]
>>> sqlite> select json(' { "this" : 000.23 } ');
>>> {"this":000.23}
>>>
>>> If I read RFC 7159 (http://www.rfc-editor.org/rfc/rfc7159.txt) correct
>>> this should return: "Error: malformed JSON".
>>
>> In this case I would go with Postel’s Law, paraphrased as “Be
>> strict in what you write, but lenient in what you read.” I don’t see
>> a point in disallowing something as trivial as redundant leading
>> zeroes.

Mr. Hipp has already fixed this:
https://www.sqlite.org/src/info/204e72f0080e8f08

If you think, that Postel's law should applied here, then Olivier
already pointed out rightfully:

> If you'd go with Postal's Law, you would make it so:
>
> sqlite> select json(' { "this" : 000.23 } '); // be lenient in what you read
> {"this":0.23} // be strict in what you write

I think, you do your users no good on the long run, if you accept not
recommendation compliant input (without explict request to do that by
the user). After all, JSON isn't a very complex standard and the
specification does not let much room (if ever) to argue if a certain
input string is valid or not. And JSON isn't an internet protocol, but
a data interchange format.

That all said I'm far from being religious about this. Even a "won't
fix" or a "works as designed" would have been OK with me.

This even wasn't a case I god bitten by this in the wild. For another
project I'm currently writing my 'own' JSON parser. To do that, I took
a look at the sqlite JSON parser implementation (just because I knew
it's on my hard disk and Mr. Hipps code is high quality). I stumbled
over this just by studying the sqlite json1.c code and wanted to make
sure, this implemenation detail is known and decided deliberately.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: json() number value parsing

Dominique Devienne
In reply to this post by Richard Hipp-3
On Mon, Apr 10, 2017 at 2:27 PM, Richard Hipp <[hidden email]> wrote:

> On 4/10/17, Richard Hipp <[hidden email]> wrote:
> > SQLite returns true from json_valid() for the following cases which
> > should allegedly be false:
> >
> >   n_multidigit_number_then_00.json
> >   n_string_unescaped_newline.json
> >   n_string_unescaped_tab.json
> >
>
> The second and third are now fixed on trunk.
>
> The first test case consists of a valid JSON with an extra 0x00
> terminator.  As SQLite interprets the 0x00 as an end-of-string marker,
> I do not see this one as an error.


Great! Thanks Richard.

FWIW, looking for whether trailing zeros for the fractional part of reals
was OK,
(seems like they are), I ran into http://json5.org/ which is an interesting
"for-humans" extension to JSON. --DD
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: json() number value parsing

Jens Alfke-2

> On Apr 10, 2017, at 6:37 AM, Dominique Devienne <[hidden email]> wrote:
>
> I ran into http://json5.org/ <http://json5.org/> which is an interesting "for-humans" extension to JSON.

I love JSON5. It’s an extension of JSON that adds extra syntax from JavaScript, like single quotes, trailing commas, and unquoted alphanumeric keys. I find it much more readable and much easier to write. A key benefit is that, by using single quotes, you can embed JSON5 in a C/C++/Swift/etc. string literal without having to escape all those pesky double-quotes:

        const char *json  = "{\"foo\": \"bar\"}";
        const char *json5 = "{foo: 'bar'}";

FYI, I wrote a converter in C++ that translates JSON5 to JSON. I use it a lot in unit tests in my code. It's also useful when parsing configuration files that are likely to be human-written or -edited.
        https://github.com/couchbaselabs/fleece/blob/master/Fleece/JSON5.hh
        https://github.com/couchbaselabs/fleece/blob/master/Fleece/JSON5.cc

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: json_valid() crash on Windows - was: json() number value parsing

Ralf Junker
In reply to this post by Richard Hipp-3
SQLite on Windows crashes when running this test:

   n_structure_100000_opening_arrays.json

The crash results from a stack overflow because json_valid() is
implemented using a recursive parser. All versions of json1.c up to
current trunk are affected.

Here is a small SQL snippet that also triggers the overflow:

   SELECT json_valid(json) FROM (
     WITH RECURSIVE
       cnt(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM cnt
         WHERE x < 40000) -- Eventually increase this!
     SELECT group_concat('[', '') AS json FROM cnt);

Depending on compiler and available stack, you may need to increase
40000 to a larger number to reproduce the problem. sqlite3.exe 3.18.0 on
Windows 7 ran out of stack at around 35000.

The problem might escape Linux testing because it usually has a much
larger default stack size than Windows.

One solution would be to limit the parser's nesting depth as RFC 7159
allows:

   https://tools.ietf.org/html/rfc7159#section-9

Ralf

On 10.04.2017 13:54, Richard Hipp wrote:

 > SQLite returns true from json_valid() for the following cases which
 > should allegedly be false:
 >
 >    n_multidigit_number_then_00.json
 >    n_string_unescaped_newline.json
 >    n_string_unescaped_tab.json
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: json_valid() crash on Windows - was: json() number value parsing

Dominique Pellé
Ralf Junker <[hidden email]> wrote:

> SQLite on Windows crashes when running this test:
>
>   n_structure_100000_opening_arrays.json
>
> The crash results from a stack overflow because json_valid() is implemented
> using a recursive parser. All versions of json1.c up to current trunk are
> affected.
>
> Here is a small SQL snippet that also triggers the overflow:
>
>   SELECT json_valid(json) FROM (
>     WITH RECURSIVE
>       cnt(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM cnt
>         WHERE x < 40000) -- Eventually increase this!
>     SELECT group_concat('[', '') AS json FROM cnt);
>
> Depending on compiler and available stack, you may need to increase 40000 to
> a larger number to reproduce the problem. sqlite3.exe 3.18.0 on Windows 7
> ran out of stack at around 35000.
>
> The problem might escape Linux testing because it usually has a much larger
> default stack size than Windows.
>
> One solution would be to limit the parser's nesting depth as RFC 7159
> allows:
>
>   https://tools.ietf.org/html/rfc7159#section-9
>
> Ralf

A better solution is to avoid recursion when parsing JSON.

JSON parsing in Vim had the same kind of problem and it was
reworked in vim-8.0.169 to avoid recursion to be able to
parse n_structure_100000_opening_arrays.json.
It can be a big change though.

Dominique
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Loading...