Re: Schema-less JSON SQLite DB?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
20 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Hayden Livingston
 Is there a concept of a schema-less JSON SQLite DB?

 My reason is simple: versioning. We have lot of business metrics that
 get updated let's say once a month, and we need to be agile to get
 them. Right now, we just put the version in the SQLite file, and then
 make sure no queries cross the boundaries.

 Secondly, we have requirements for slightly hierarchal data, i.e.
 mostly row form, but then some guy wants to put an object.

 What's the SQLite community heading towards if at all?

 Do others have experiences and requirements similar to this?
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Roger Binns
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 07/13/2015 05:43 PM, Hayden Livingston wrote:
> Is there a concept of a schema-less JSON SQLite DB?

What exactly is it you want?  A schema-less database?  A JSON
database? Using SQLite to store JSON?  (It is unclear if you mean
something like SQLite, or SQLite itself).  Something small and light?
 A library not a server?

Be aware that you can dynamically update the SQLite schema, and that
it takes no time to do so.  Another database I won't name locks the
database and rewrites each record.  This can take hours or even days.
 SQLite just updates the schema table, and fills in defaults for
records written before the schema change, on reads.  I wrote some code
the other day that takes JSON and does the schema manipulation,
although it is annoying.  (It would be really great if SQLite
supported arrays like Postgres does.)

You can also store arbitrary JSON in a column.  You won't be able to
query it effectively, but you can duplicate values into columns you
can do queries and indexes on.

> My reason is simple: versioning. We have lot of business metrics
> that get updated let's say once a month, and we need to be agile to
> get them. Right now, we just put the version in the SQLite file,
> and then make sure no queries cross the boundaries.

You can also use multiple databases, and attach them.  For example if
you operate on a month by month basis, then you can put each month's
data in a separate SQLite file, then attach last month's as
'lastmonth', and use in queries (eg SELECT blah blah WHERE bar.foo >
lastmonth.bar.foo)

> Do others have experiences and requirements similar to this?

All the time.  I use SQLite when I don't need networked access, a
running database server, and need less things that can go wrong.
Mapping JSON into this is painful but possible.

When I can use a database server, I prefer Mongodb as it is very good
at arbitrary JSON in, the same arbitrary JSON back out.  It is
especially pleasant that the query syntax has the same shape as the
underlying JSON data.  Also JSON maps trivially to Python which I use
the most.  (Note however that Mongodb does have some issues, but so
does much other software out there.  Production use does require
effort & planning as does other software.)

In the longer term Postgres is getting increasingly better JSON
support.  Hopefully it will give Mongodb a run for its money soon.

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iEYEARECAAYFAlWkbfsACgkQmOOfHg372QTFiwCgzzf5fEzfJdWn84bk0e1fusGE
qeAAnAhR+sHb39Gsha2Owq3SXsdZVRex
=9jcT
-----END PGP SIGNATURE-----
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Hayden Livingston
We use SQLite as a library and we don't want to migrate to a server situation.

So it seems using the month-to-month approach you outline is
reasonable and not outrageous. Good to know.

Thanks for the tip on schema modifications which get automatically
upgraded on read. This is a great feature. I think I might try that
out for my application.

Does your code also map object hierarchies in json? What general
format do you use? Each object type gets stored in a separate table?

On Mon, Jul 13, 2015 at 7:03 PM, Roger Binns <[hidden email]> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 07/13/2015 05:43 PM, Hayden Livingston wrote:
>> Is there a concept of a schema-less JSON SQLite DB?
>
> What exactly is it you want?  A schema-less database?  A JSON
> database? Using SQLite to store JSON?  (It is unclear if you mean
> something like SQLite, or SQLite itself).  Something small and light?
>  A library not a server?
>
> Be aware that you can dynamically update the SQLite schema, and that
> it takes no time to do so.  Another database I won't name locks the
> database and rewrites each record.  This can take hours or even days.
>  SQLite just updates the schema table, and fills in defaults for
> records written before the schema change, on reads.  I wrote some code
> the other day that takes JSON and does the schema manipulation,
> although it is annoying.  (It would be really great if SQLite
> supported arrays like Postgres does.)
>
> You can also store arbitrary JSON in a column.  You won't be able to
> query it effectively, but you can duplicate values into columns you
> can do queries and indexes on.
>
>> My reason is simple: versioning. We have lot of business metrics
>> that get updated let's say once a month, and we need to be agile to
>> get them. Right now, we just put the version in the SQLite file,
>> and then make sure no queries cross the boundaries.
>
> You can also use multiple databases, and attach them.  For example if
> you operate on a month by month basis, then you can put each month's
> data in a separate SQLite file, then attach last month's as
> 'lastmonth', and use in queries (eg SELECT blah blah WHERE bar.foo >
> lastmonth.bar.foo)
>
>> Do others have experiences and requirements similar to this?
>
> All the time.  I use SQLite when I don't need networked access, a
> running database server, and need less things that can go wrong.
> Mapping JSON into this is painful but possible.
>
> When I can use a database server, I prefer Mongodb as it is very good
> at arbitrary JSON in, the same arbitrary JSON back out.  It is
> especially pleasant that the query syntax has the same shape as the
> underlying JSON data.  Also JSON maps trivially to Python which I use
> the most.  (Note however that Mongodb does have some issues, but so
> does much other software out there.  Production use does require
> effort & planning as does other software.)
>
> In the longer term Postgres is getting increasingly better JSON
> support.  Hopefully it will give Mongodb a run for its money soon.
>
> Roger
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1
>
> iEYEARECAAYFAlWkbfsACgkQmOOfHg372QTFiwCgzzf5fEzfJdWn84bk0e1fusGE
> qeAAnAhR+sHb39Gsha2Owq3SXsdZVRex
> =9jcT
> -----END PGP SIGNATURE-----
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Roger Binns
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 07/13/2015 08:00 PM, Hayden Livingston wrote:
> Does your code also map object hierarchies in json?

Yes, but thankfully I don't have much of them.  Essentially the top
level of the object has a unique id (SQLite allocated), and then other
tables are used to join zero or more child objects to the top level.

> What general format do you use?

Ultimately I use Python dictionaries which are supersets of JSON
objects.  Some (ugly) code can convert both ways

> Each object type gets stored in a separate table?

In my case yes but that is because the underlying data represents
known entities and was actually originally in Postgres and then
exported to hideous inconsistent XML which I then convert/denormalise
back into JSON.

Do remember that SQLite does not require you to specify a type for
each column, nor does it care about the values in a column being
different types between rows.  That means I don't have to worry about
types, only the big picture top level of something being an object, a
list, or a scalar.

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iEYEARECAAYFAlWkg0MACgkQmOOfHg372QRTbQCfRobbh4uU1Eqf0SD9LJxABFYj
iv8AoKUvXNQEgGVvmPiZ/tQgHyU7Q0yL
=S7AM
-----END PGP SIGNATURE-----
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Darren Duncan
In reply to this post by Hayden Livingston
Have a look at PostgreSQL 9.4 as well and its new JSONB data type.  Gives you
the goods of relational and hierarchical databases in one place, including the
querying and indexing. -- Darren Duncan

On 2015-07-13 5:43 PM, Hayden Livingston wrote:

>   Is there a concept of a schema-less JSON SQLite DB?
>
>   My reason is simple: versioning. We have lot of business metrics that
>   get updated let's say once a month, and we need to be agile to get
>   them. Right now, we just put the version in the SQLite file, and then
>   make sure no queries cross the boundaries.
>
>   Secondly, we have requirements for slightly hierarchal data, i.e.
>   mostly row form, but then some guy wants to put an object.
>
>   What's the SQLite community heading towards if at all?
>
>   Do others have experiences and requirements similar to this?

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Hayden Livingston
In reply to this post by Roger Binns
Wow, this is the killer feature no-one told me about. I can get rid of
types? How does this work under-the-hood?

SELECT * FROM TABLE WHERE FooColumn > 50

And I've stored "Something" in that column in some row. What's the behavior?

I don't want to take up your time, so if there's a document I can read
about it'd be great.

On Mon, Jul 13, 2015 at 8:34 PM, Roger Binns <[hidden email]> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 07/13/2015 08:00 PM, Hayden Livingston wrote:
>> Does your code also map object hierarchies in json?
>
> Yes, but thankfully I don't have much of them.  Essentially the top
> level of the object has a unique id (SQLite allocated), and then other
> tables are used to join zero or more child objects to the top level.
>
>> What general format do you use?
>
> Ultimately I use Python dictionaries which are supersets of JSON
> objects.  Some (ugly) code can convert both ways
>
>> Each object type gets stored in a separate table?
>
> In my case yes but that is because the underlying data represents
> known entities and was actually originally in Postgres and then
> exported to hideous inconsistent XML which I then convert/denormalise
> back into JSON.
>
> Do remember that SQLite does not require you to specify a type for
> each column, nor does it care about the values in a column being
> different types between rows.  That means I don't have to worry about
> types, only the big picture top level of something being an object, a
> list, or a scalar.
>
> Roger
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1
>
> iEYEARECAAYFAlWkg0MACgkQmOOfHg372QRTbQCfRobbh4uU1Eqf0SD9LJxABFYj
> iv8AoKUvXNQEgGVvmPiZ/tQgHyU7Q0yL
> =S7AM
> -----END PGP SIGNATURE-----
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Hayden Livingston
In reply to this post by Darren Duncan
that does indeed look interesting. Embedded DB is the only problem here.

On Mon, Jul 13, 2015 at 11:01 PM, Darren Duncan <[hidden email]> wrote:

> Have a look at PostgreSQL 9.4 as well and its new JSONB data type.  Gives
> you the goods of relational and hierarchical databases in one place,
> including the querying and indexing. -- Darren Duncan
>
>
> On 2015-07-13 5:43 PM, Hayden Livingston wrote:
>>
>>   Is there a concept of a schema-less JSON SQLite DB?
>>
>>   My reason is simple: versioning. We have lot of business metrics that
>>   get updated let's say once a month, and we need to be agile to get
>>   them. Right now, we just put the version in the SQLite file, and then
>>   make sure no queries cross the boundaries.
>>
>>   Secondly, we have requirements for slightly hierarchal data, i.e.
>>   mostly row form, but then some guy wants to put an object.
>>
>>   What's the SQLite community heading towards if at all?
>>
>>   Do others have experiences and requirements similar to this?
>
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

RE: Schema-less JSON SQLite DB?

Deon Brewis
This post has NOT been accepted by the mailing list yet.
In reply to this post by Hayden Livingston

From the SQLITE documentation:

 

3.0 Comparison Expressions:

·         An INTEGER or REAL value is less than any TEXT or BLOB value. When an INTEGER or REAL is compared to another INTEGER or REAL, a numerical comparison is performed.

- Deon

 

From: Hayden Livingston [via SQLite] [mailto:[hidden email]]
Sent: Tuesday, July 14, 2015 10:22 AM
To: Deon Brewis
Subject: Re: Schema-less JSON SQLite DB?

 

Wow, this is the killer feature no-one told me about. I can get rid of
types? How does this work under-the-hood?

SELECT * FROM TABLE WHERE FooColumn > 50

And I've stored "Something" in that column in some row. What's the behavior?

I don't want to take up your time, so if there's a document I can read
about it'd be great.

On Mon, Jul 13, 2015 at 8:34 PM, Roger Binns <[hidden email]> wrote:


> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 07/13/2015 08:00 PM, Hayden Livingston wrote:
>> Does your code also map object hierarchies in json?
>
> Yes, but thankfully I don't have much of them.  Essentially the top
> level of the object has a unique id (SQLite allocated), and then other
> tables are used to join zero or more child objects to the top level.
>
>> What general format do you use?
>
> Ultimately I use Python dictionaries which are supersets of JSON
> objects.  Some (ugly) code can convert both ways
>
>> Each object type gets stored in a separate table?
>
> In my case yes but that is because the underlying data represents
> known entities and was actually originally in Postgres and then
> exported to hideous inconsistent XML which I then convert/denormalise
> back into JSON.
>
> Do remember that SQLite does not require you to specify a type for
> each column, nor does it care about the values in a column being
> different types between rows.  That means I don't have to worry about
> types, only the big picture top level of something being an object, a
> list, or a scalar.
>
> Roger
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1
>
> iEYEARECAAYFAlWkg0MACgkQmOOfHg372QRTbQCfRobbh4uU1Eqf0SD9LJxABFYj
> iv8AoKUvXNQEgGVvmPiZ/tQgHyU7Q0yL
> =S7AM
> -----END PGP SIGNATURE-----
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________

sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


If you reply to this email, your message will be added to the discussion below:

http://sqlite.1065341.n5.nabble.com/Re-Schema-less-JSON-SQLite-DB-tp83095p83106.html

To start a new topic under SQLite, email [hidden email]
To unsubscribe from SQLite, click here.
NAML

Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Simon Slavin-3
In reply to this post by Hayden Livingston

On 14 Jul 2015, at 6:22pm, Hayden Livingston <[hidden email]> wrote:

> Wow, this is the killer feature no-one told me about. I can get rid of
> types? How does this work under-the-hood?
>
> SELECT * FROM TABLE WHERE FooColumn > 50
>
> And I've stored "Something" in that column in some row. What's the behavior?

SQLite uses column affinities not column types.  You can declare a column with a type, but store any type of value in any row of that column. Once stored you can discover what type was stored using something like

SELECT typeof(foo),foo FROM myTable

If you insist on sorting column foo when there are different types of value stored in it, you will get a consistent but not especially useful sort order.

There is a particular restriction which is the result of SQLite trying to be useful.  As an example, if you define a column as REAL and store the string '123.456' in it, SQLite will automatically convert it to a number before storing it, on the assumption that you will want '123.456' to be sorted after '12.345'.  So if you intend to save many different types in a column, it may be best to define it as having an affinity of BLOB.

For more information read about affinity in section 2.0 of

<https://www.sqlite.org/datatype3.html>

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Kees Nuyt
In reply to this post by Hayden Livingston
On Tue, 14 Jul 2015 10:22:02 -0700, Hayden Livingston
<[hidden email]> wrote:

> Wow, this is the killer feature no-one told me about.
> I can get rid of types? How does this work under-the-hood?
>
> SELECT * FROM TABLE WHERE FooColumn > 50
>
> And I've stored "Something" in that column in some row.
> What's the behavior?

That's easy to test by yourself.
By the way, you can enforce datatype to some extend with
constraints:
CREATE TABLE T1 (
  id INTEGER PRIMARY KEY NOT NULL
, anint INTEGER
  CONSTRAINT int_anint CHECK (typeof(anint) == 'integer')
);
INSERT INTO T1 (anint) VALUES (1);
INSERT INTO T1 (anint) VALUES ('two');


> I don't want to take up your time, so if there's a document I can read
> about it'd be great.

https://sqlite.org/datatype3.html 

All docs:
https://sqlite.org/docs.html


--
Regards, Cordialement, Groet,

Kees Nuyt

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Hick Gunter
In reply to this post by Hayden Livingston
The record sort order is:

NULLs
Numeric by value
Text by collating function
Blob by memcmp order

So your result set will contain all rows having a FooColumn with numeric value greater than 50, a text value or a blob value.

-----Ursprüngliche Nachricht-----
Von: Hayden Livingston [mailto:[hidden email]]
Gesendet: Dienstag, 14. Juli 2015 19:22
An: General Discussion of SQLite Database
Betreff: Re: [sqlite] Schema-less JSON SQLite DB?

Wow, this is the killer feature no-one told me about. I can get rid of types? How does this work under-the-hood?

SELECT * FROM TABLE WHERE FooColumn > 50

And I've stored "Something" in that column in some row. What's the behavior?

I don't want to take up your time, so if there's a document I can read about it'd be great.

On Mon, Jul 13, 2015 at 8:34 PM, Roger Binns <[hidden email]> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 07/13/2015 08:00 PM, Hayden Livingston wrote:
>> Does your code also map object hierarchies in json?
>
> Yes, but thankfully I don't have much of them.  Essentially the top
> level of the object has a unique id (SQLite allocated), and then other
> tables are used to join zero or more child objects to the top level.
>
>> What general format do you use?
>
> Ultimately I use Python dictionaries which are supersets of JSON
> objects.  Some (ugly) code can convert both ways
>
>> Each object type gets stored in a separate table?
>
> In my case yes but that is because the underlying data represents
> known entities and was actually originally in Postgres and then
> exported to hideous inconsistent XML which I then convert/denormalise
> back into JSON.
>
> Do remember that SQLite does not require you to specify a type for
> each column, nor does it care about the values in a column being
> different types between rows.  That means I don't have to worry about
> types, only the big picture top level of something being an object, a
> list, or a scalar.
>
> Roger
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1
>
> iEYEARECAAYFAlWkg0MACgkQmOOfHg372QRTbQCfRobbh4uU1Eqf0SD9LJxABFYj
> iv8AoKUvXNQEgGVvmPiZ/tQgHyU7Q0yL
> =S7AM
> -----END PGP SIGNATURE-----
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: [hidden email]

This communication (including any attachments) is intended for the use of the intended recipient(s) only and may contain information that is confidential, privileged or legally protected. Any unauthorized use or dissemination of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the sender by return e-mail message and delete all copies of the original communication. Thank you for your cooperation.


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

decalek
In reply to this post by Hayden Livingston
Hi Hayden,

On 14.07.2015 03:43, Hayden Livingston wrote:

>   Is there a concept of a schema-less JSON SQLite DB?
>
>   My reason is simple: versioning. We have lot of business metrics that
>   get updated let's say once a month, and we need to be agile to get
>   them. Right now, we just put the version in the SQLite file, and then
>   make sure no queries cross the boundaries.
>
>   Secondly, we have requirements for slightly hierarchal data, i.e.
>   mostly row form, but then some guy wants to put an object.
>
>   What's the SQLite community heading towards if at all?

Given the number of the threads in the list, it seems a
lot of people want hierarchical data in SQLite :-)

>
>   Do others have experiences and requirements similar to this?

My experience:

Few years ago, during small project about translating set of
procedures from one to another SQL dialect, we initially
tried to use XML database for the task, but finally become
to solution to manipulate the parsed procedures in SQLite
using a schema with following simplified core:

doc(d, name)
   node(dn, d, x, y, type, value)
     attribute(dn, type, value)
     reference(dn, type, ref_dn)

[Where (x, y) are equal to (row, column) when one dump the
tree to text file with indent = 1]

This schema is a variation of the BaseX encoding schema:

http://docs.basex.org/wiki/Node_Storage

where:
   - we used y (the depth of the node) instead of DIS
     (distance to the parent)
   - we omitted SIZ (size of the subtree) - it is easily
     calculable (next(on same y).x - x).
   - we used single integer .type as replacement of NS, KIND, Tag
     name + expected scalar type
   - we split the scalar attributes (XML attributes) in
     separate table - attribute and moved the first (only
     in our case) text() value to node.value.
   - we added reference (links between nodes)

Actually there were other modifications like e.g.
node_range_offset(dn_from, dn_to, offset) table, which we
applied for faking insertions and deletions of whole subtrees
(logical node.x was function of that table and stored
node.x).

Sometime next year I am planning to revive that experiment,
this time for storing/generating configuration data (for
Linux configuration parsed with the augeas tool). Now with the
presence of the powerful CTE feature in SQLite the things
may be a way more easier.

Meantime (given that you have shared with the list that your
project is a business oriented one), I would like to propose
to you, to initiate a sponsorship and a team including
interested list members for digging the subject further.

The team could consolidate the different approach ideas,
existing open source components and discuss what will be
most appropriate and light as implementation resources for
use cases like yours + eventually report to the the SQLite
core team which additional APIs would make the implementation
more efficient.

Kind regards,
Alek

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

J Decker
On Wed, Jul 15, 2015 at 2:14 AM, Alek Paunov <[hidden email]> wrote:

> Hi Hayden,
>
> On 14.07.2015 03:43, Hayden Livingston wrote:
>
>>   Is there a concept of a schema-less JSON SQLite DB?
>>
>>   My reason is simple: versioning. We have lot of business metrics that
>>   get updated let's say once a month, and we need to be agile to get
>>   them. Right now, we just put the version in the SQLite file, and then
>>   make sure no queries cross the boundaries.
>>
>>   Secondly, we have requirements for slightly hierarchal data, i.e.
>>   mostly row form, but then some guy wants to put an object.
>>
>>   What's the SQLite community heading towards if at all?
>>
>
> Given the number of the threads in the list, it seems a
> lot of people want hierarchical data in SQLite :-)
>
>
>>   Do others have experiences and requirements similar to this?
>>
>
> My experience:
>
>
I have an option map stored hierarchical.  I would tackle storing something
like XML similarly I suppose where I'd have...

    node( n, d, parent_node, node_order, name )
       value( n, value )

(for XML...  doc( d, name, n ) )
    element( n, parent_element, node_order, name )
       attributes( n, name, value )
(where d, n are primary keys)

I didn't have separate maps (like separate xml docs), I have a 'root' which
is ID 0 (GUID 0), and for each value under that I grab where parent is the
node and the name is what I'm looking for... or enumerating all is order by
node_order and parent_id.  For document support, the document would store
the root node ID to prevent replicating the doc ID in all nodes.  The
treeview only queries on the branches you expand so it's not a large query
even with lots of options.  Other developers in Access tried to use it but
ended up sucking in the entire tree and reparsing it with normalized names
that were the full path instead of treating it as a hierarchy.

The other place I used a similar structure was saving a neural network
(really a wiring diagram of nodes and connections)

I don't know much about RECURSE operator that's available... like to select
'option, program, display, render_interface' which is a 4 deep query...

But I did find that reversing the direction of the pointer made it work
pretty well... just not with SQL stored procedures, instead relying on code
support external to the database...  (reversing being that normally you
have node->node->node ... where the storage is actually more like
node<-node<-node  )




> Few years ago, during small project about translating set of
> procedures from one to another SQL dialect, we initially
> tried to use XML database for the task, but finally become
> to solution to manipulate the parsed procedures in SQLite
> using a schema with following simplified core:
>
> doc(d, name)
>   node(dn, d, x, y, type, value)
>     attribute(dn, type, value)
>     reference(dn, type, ref_dn)
>
> [Where (x, y) are equal to (row, column) when one dump the
> tree to text file with indent = 1]
>
> This schema is a variation of the BaseX encoding schema:
>
> http://docs.basex.org/wiki/Node_Storage
>
> where:
>   - we used y (the depth of the node) instead of DIS
>     (distance to the parent)
>   - we omitted SIZ (size of the subtree) - it is easily
>     calculable (next(on same y).x - x).
>   - we used single integer .type as replacement of NS, KIND, Tag
>     name + expected scalar type
>   - we split the scalar attributes (XML attributes) in
>     separate table - attribute and moved the first (only
>     in our case) text() value to node.value.
>   - we added reference (links between nodes)
>
> Actually there were other modifications like e.g.
> node_range_offset(dn_from, dn_to, offset) table, which we
> applied for faking insertions and deletions of whole subtrees
> (logical node.x was function of that table and stored
> node.x).
>
> Sometime next year I am planning to revive that experiment,
> this time for storing/generating configuration data (for
> Linux configuration parsed with the augeas tool). Now with the
> presence of the powerful CTE feature in SQLite the things
> may be a way more easier.
>
> Meantime (given that you have shared with the list that your
> project is a business oriented one), I would like to propose
> to you, to initiate a sponsorship and a team including
> interested list members for digging the subject further.
>
> The team could consolidate the different approach ideas,
> existing open source components and discuss what will be
> most appropriate and light as implementation resources for
> use cases like yours + eventually report to the the SQLite
> core team which additional APIs would make the implementation
> more efficient.
>
> Kind regards,
> Alek
>
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Eric Rubin-Smith
In reply to this post by decalek
Given the number of the threads in the list, it seems a

> lot of people want hierarchical data in SQLite :-)
>

Throwing in another $.02 here, my company had a need to ingest JSON of
various formats into SQLite databases in low-power embedded devices.  We
wrote the utility program described here:
https://www.aterlo.com/wp-content/uploads/2015/04/json2sqlite.pdf

The program works well as long as your JSON's depth is bounded. I would not
claim that it supports schemaless data models, but we would love to hear
suggestions for design improvements.

We would be happy to clean this up for open source publication if there is
enough interest.

Eric
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Sergej Jurečko
In reply to this post by Hayden Livingston
An idea I’ve had a while ago was to implement functions for json documents
(using sqlite3_create_function_v2)

Json would be stored in a text column. You would need to define 2 functions:
- docget(document,key1,subval1,subval2,…)
- docset(document,key1,subval1,subval2,..,val)

Last parameter of docset is value you wish to set.

So for instance one would write
INSERT INTO mytable VALUES (1,’{key : 10, subdoc : {subkey : “a"}}’);
SELECT id,doc FROM mytable WHERE docget(doc,”key") > 10;
SELECT id,doc FROM mytable WHERE docget(doc,”key”,”subdoc”,”subkey”) = ‘a’;
UPDATE doc FROM mytable WHERE id=1 SET docset(doc,”key”,12);

One could even implement indexes on keys within documents using additional
tables.

What do you guys think? Is it stupid, could it be improved?


Sergej
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Roger Binns
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 07/15/2015 08:22 AM, Sergej Jurečko wrote:
> What do you guys think? Is it stupid, could it be improved?

I recommend looking at Mongodb  and Postgres first to see how they do
queries.  It would be better to be compatible with them where
practical, rather than being gratuitously different.

Roger

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iEYEARECAAYFAlWmkmwACgkQmOOfHg372QTHIACgi64Sm5nAmc4FyWrF6mimT5Ng
15sAoJJoY51QW/3QM/exRLJYHNtFKBse
=EgGt
-----END PGP SIGNATURE-----
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

J Decker
On Wed, Jul 15, 2015 at 10:03 AM, Roger Binns <[hidden email]> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 07/15/2015 08:22 AM, Sergej Jurečko wrote:
> > What do you guys think? Is it stupid, could it be improved?
>
> I recommend looking at Mongodb  and Postgres first to see how they do
> queries.  It would be better to be compatible with them where
> practical, rather than being gratuitously different.
>
>
(mentions postresql methods and MongoDB)
http://thebuild.com/presentations/pg-as-nosql-pgday-fosdem-2013.pdf
http://www.postgresql.org/docs/9.1/static/hstore.html

My question is, how do you have a intelligent (orderly, structured) query
of unstructured data?
I can imagine ways to store and retrieve such things, but how do you query
for 'give me the count of what I don't know is available' ?


http://docs.mongodb.org/manual/reference/sql-comparison/ - good comparison
of sorts...

but then when it gets to how it get implemented... they're relying on
external library isntead of 'SQL' or really any sort of query language.
(day1-2 and day 3-4 links...)
https://www.mongodb.com/blog/post/mongodb-vs-sql-day-1-2?jmp=docs&_ga=1.94603548.1409473473.1436992997
https://www.mongodb.com/blog/post/mongodb-vs-sql-day-3-5?jmp=docs&_ga=1.94603548.1409473473.1436992997

From an application standpoint, this doesn't resemble my interface to using
SQL and getting results back into code; Java looks horrible.  But then
again my library was built as an abstraction around ODBC and cleverly hides
details so applications can actually just do work....
http://api.mongodb.org/c/current/executing-command.html (the C interface to
mongodb to execute a command)

At least the hstore is a extension to the query language, and not a library.

-----------
I guess I've been looking at this strictly from a SQL standpoint, rather
than using sqlite3_* interfaces to do the work... since My library just has
3 commands... GetSQLConnection, DoSQLCommand, (okay and a set...
GetSQLRecord, GetSQLNext, EndSQL) then using these, I can defiantly see how
MonogoDB can be built as a layer on top of that using a hash storage
technique...

    table object( objid int, parent_objid int, type int, name_id int, value
varchar )
    table object_names(  name_id int, name varchar unique )

if they're grouped into some document also can add table doc( docid, name,
root_objid )

where type is
    0 - array, and all object with this objid as it's parent_id is in this
array, (ignore this_object's value; query of sub-data will ignore name...
well actually the name should be the index into the array so the order can
be preserved)
    1 - object,  (NULL value)
    2 - data (use value as int, float, bool or string)

I would also build a name dictionary and store a name_id instead of the
literal name since typically Int comparisons are faster... at least in a
well(long) named schema it would save a few bytes overall

In the day 1-2 MongoDB link he uses a 'user' master object with
'contact_number' detail object...

select_contact()
{
     select objid from object where name = 'user', and type = 1/*object*/
and parent=0
     for each object
          select * from object where name='contact_number' and type =
1/*object*/ and parent=objid
          /* do something to store in variant result data type */
}
viola JSON query in SQL via a library; that's certainly what the internals
of MongoDB result in

and again maybe RECURSIVE and WITH operators in sqlite's SQL might be
useful for simplifying the looping

// join user.contact.numbers .. users that have the same numbers
join_contacts() {
     select objid from object where name='user' and type=1 and parent=0
     for each objid
         select objid as cn,value from object where name='contact_number'
and type=1 and parent=objid
         for each cn /* contact number objid */
              results = select_by_number( value ) /* routine which is a
select user.contact_number=value */
              foreach result
                  if( result.objid == objid ) continue /* skip this one */
                  /* add to variant result */
}
some of the above could be selected with an inner join into one select, but
for simplicity broke it into more atomic steps...

/* och = object contact number, ou = object user */
   select ocn.value,ocn.objid as contact_objid from object as ou join
object as ocn on ocn.parent_objid=ou.objid where ou.value='john' and
ou.type=1 and ocn.type=2 /*and ocn.name='work'*/

if you know that you have something resembling structure to the data...

J
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Hayden Livingston
I'm beginning to understand some of the implementation challenges here.

Eric, it'd be great to get that open sourced. I can tell you at least
I will jump and prototype on it.

Sergej and Roger, I do like this potential idea as well. Of course,
now it's a custom sqlite, but at least the job gets done. Do you think
you'll want to do this anytime soon as an open source project?


On Wed, Jul 15, 2015 at 2:55 PM, J Decker <[hidden email]> wrote:

> On Wed, Jul 15, 2015 at 10:03 AM, Roger Binns <[hidden email]> wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On 07/15/2015 08:22 AM, Sergej Jurečko wrote:
>> > What do you guys think? Is it stupid, could it be improved?
>>
>> I recommend looking at Mongodb  and Postgres first to see how they do
>> queries.  It would be better to be compatible with them where
>> practical, rather than being gratuitously different.
>>
>>
> (mentions postresql methods and MongoDB)
> http://thebuild.com/presentations/pg-as-nosql-pgday-fosdem-2013.pdf
> http://www.postgresql.org/docs/9.1/static/hstore.html
>
> My question is, how do you have a intelligent (orderly, structured) query
> of unstructured data?
> I can imagine ways to store and retrieve such things, but how do you query
> for 'give me the count of what I don't know is available' ?
>
>
> http://docs.mongodb.org/manual/reference/sql-comparison/ - good comparison
> of sorts...
>
> but then when it gets to how it get implemented... they're relying on
> external library isntead of 'SQL' or really any sort of query language.
> (day1-2 and day 3-4 links...)
> https://www.mongodb.com/blog/post/mongodb-vs-sql-day-1-2?jmp=docs&_ga=1.94603548.1409473473.1436992997
> https://www.mongodb.com/blog/post/mongodb-vs-sql-day-3-5?jmp=docs&_ga=1.94603548.1409473473.1436992997
>
> From an application standpoint, this doesn't resemble my interface to using
> SQL and getting results back into code; Java looks horrible.  But then
> again my library was built as an abstraction around ODBC and cleverly hides
> details so applications can actually just do work....
> http://api.mongodb.org/c/current/executing-command.html (the C interface to
> mongodb to execute a command)
>
> At least the hstore is a extension to the query language, and not a library.
>
> -----------
> I guess I've been looking at this strictly from a SQL standpoint, rather
> than using sqlite3_* interfaces to do the work... since My library just has
> 3 commands... GetSQLConnection, DoSQLCommand, (okay and a set...
> GetSQLRecord, GetSQLNext, EndSQL) then using these, I can defiantly see how
> MonogoDB can be built as a layer on top of that using a hash storage
> technique...
>
>     table object( objid int, parent_objid int, type int, name_id int, value
> varchar )
>     table object_names(  name_id int, name varchar unique )
>
> if they're grouped into some document also can add table doc( docid, name,
> root_objid )
>
> where type is
>     0 - array, and all object with this objid as it's parent_id is in this
> array, (ignore this_object's value; query of sub-data will ignore name...
> well actually the name should be the index into the array so the order can
> be preserved)
>     1 - object,  (NULL value)
>     2 - data (use value as int, float, bool or string)
>
> I would also build a name dictionary and store a name_id instead of the
> literal name since typically Int comparisons are faster... at least in a
> well(long) named schema it would save a few bytes overall
>
> In the day 1-2 MongoDB link he uses a 'user' master object with
> 'contact_number' detail object...
>
> select_contact()
> {
>      select objid from object where name = 'user', and type = 1/*object*/
> and parent=0
>      for each object
>           select * from object where name='contact_number' and type =
> 1/*object*/ and parent=objid
>           /* do something to store in variant result data type */
> }
> viola JSON query in SQL via a library; that's certainly what the internals
> of MongoDB result in
>
> and again maybe RECURSIVE and WITH operators in sqlite's SQL might be
> useful for simplifying the looping
>
> // join user.contact.numbers .. users that have the same numbers
> join_contacts() {
>      select objid from object where name='user' and type=1 and parent=0
>      for each objid
>          select objid as cn,value from object where name='contact_number'
> and type=1 and parent=objid
>          for each cn /* contact number objid */
>               results = select_by_number( value ) /* routine which is a
> select user.contact_number=value */
>               foreach result
>                   if( result.objid == objid ) continue /* skip this one */
>                   /* add to variant result */
> }
> some of the above could be selected with an inner join into one select, but
> for simplicity broke it into more atomic steps...
>
> /* och = object contact number, ou = object user */
>    select ocn.value,ocn.objid as contact_objid from object as ou join
> object as ocn on ocn.parent_objid=ou.objid where ou.value='john' and
> ou.type=1 and ocn.type=2 /*and ocn.name='work'*/
>
> if you know that you have something resembling structure to the data...
>
> J
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Sergej Jurečko
In reply to this post by Hayden Livingston
I will probably get around to doing it in the next few months. Hopefully I
have the time, right now I'm very busy. I will post to this message board
when I'm done.


Sergej
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Schema-less JSON SQLite DB?

Eduardo Morras-2
In reply to this post by Sergej Jurečko
On Wed, 15 Jul 2015 17:22:36 +0200
Sergej Jurečko <[hidden email]> wrote:

> An idea I’ve had a while ago was to implement functions for json
> documents (using sqlite3_create_function_v2)
>
> Json would be stored in a text column. You would need to define 2
> functions:
> - docget(document,key1,subval1,subval2,…)
> - docset(document,key1,subval1,subval2,..,val)
>
> Last parameter of docset is value you wish to set.
>
> So for instance one would write
> INSERT INTO mytable VALUES (1,’{key : 10, subdoc : {subkey : “a"}}’);
> SELECT id,doc FROM mytable WHERE docget(doc,”key") > 10;
> SELECT id,doc FROM mytable WHERE docget(doc,”key”,”subdoc”,”subkey”)
> = ‘a’; UPDATE doc FROM mytable WHERE id=1 SET docset(doc,”key”,12);
>
> One could even implement indexes on keys within documents using
> additional tables.
>
> What do you guys think? Is it stupid, could it be improved?

Perhaps Dr. Hipps, R. unql implementation should be resurrected or at least, used as good/bad example. It has an extension to sql to query unstructured data and a layer on top sqlite to store json data. It's abandonware now afaik.

http://unql.sqlite.org/index.html/wiki?name=UnQL
http://www.dataversity.net/unql-a-standardized-query-language-for-nosql-databases/
http://www.couchbase.com/press-releases/unql-query-language


>
> Sergej


---   ---
Eduardo Morras <[hidden email]>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users