SQLite with branching

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

SQLite with branching

Simon Slavin-3
I have no connection with the following project.

<https://github.com/aergoio/litetree>

Described poorly on the web site so here's my own description:

This is an extension of SQLite which allows branched versions, each new branch creating one dataset which existed before the new branch and a new dataset, initially a copy of the old dataset at some historical point, which can be further modified.  Both the old and new branches can be further branched.

LiteTree is implemented storing the SQLite db pages on LMDB making it more than twice as fast as normal SQLite on Linux and MacOSX, and also runs on Windows.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: SQLite with branching

Robert M. Münch
@SQLite Guys: Do you have something like branching on your roadmap? I really like this feature and see a lot of use-cases beside the blockchain topic. And, of course if this works with your encryption extension that would be awesome.

* simple versioning of a database: Useful when you want to keep different app states (like complex calculations) and be able to go back and forth in time to see what data was used, what changed, what is the impact of that change.

* simple implementation of alternatives: create a branch and let the user do whatever the want. Later they can keep it or go back. I think using changesets can be a bit more challenging for such a use-case.

* I see such a feature as a natural companion to the session extension. Where versioning/branching works on the whole database.

Viele Grüsse. Robert M. Münch


On 29 Aug 2018, at 14:28, Simon Slavin wrote:

> I have no connection with the following project.
>
> <https://github.com/aergoio/litetree>
>
> Described poorly on the web site so here's my own description:
>
> This is an extension of SQLite which allows branched versions, each new branch creating one dataset which existed before the new branch and a new dataset, initially a copy of the old dataset at some historical point, which can be further modified.  Both the old and new branches can be further branched.
>
> LiteTree is implemented storing the SQLite db pages on LMDB making it more than twice as fast as normal SQLite on Linux and MacOSX, and also runs on Windows.
>
> Simon.
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

signature.asc (891 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: SQLite with branching

Simon Slavin-3
The post you quoted points to exactly that: a version of SQLite that handles branches.  Check it out.

That's one of the reasons that the source code for SQLite is public: so that people can add the features they want.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: SQLite with branching

Jens Alfke-2


> On Nov 4, 2019, at 4:57 AM, Simon Slavin <[hidden email]> wrote:
>
> That's one of the reasons that the source code for SQLite is public: so that people can add the features they want.

Totally agree. However, when you go off the mainline of SQLite you lose some things, like easy updating to new SQLite releases — you now have to deal with merging the new official SQLite into the forked SQLite, or waiting for the fork maintainer to do it.

In the case of LiteTree, I suspect the merge would be pretty difficult because of the extensive changes — it must be replacing the whole B-tree layer to be using LMDB as storage. (There was an earlier project called SQLightning that did the same thing. I was tempted by it, but it was based on an old version like 3.9 and the author made it clear he had zero interest in updating.)

I don't have a practical use for the branching features, though they're cool, but I'm salivating at the thought of a 2x speedup. With all the work that's put into eking out small performance increases in SQLite, I'd imagine the devs would be interested in something that made that big of a difference...

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: SQLite with branching

wmertens
On Mon, Nov 4, 2019 at 10:26 PM Jens Alfke <[hidden email]> wrote:

> I don't have a practical use for the branching features, though they're cool, but I'm salivating at the thought of a 2x speedup.
> With all the work that's put into eking out small performance increases in SQLite, I'd imagine the devs would be interested in
> something that made that big of a difference...

What I would like to know is how such a performance increase is
achieved, and why regular SQLite can't do the same?

Wout.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: SQLite with branching

Dominique Devienne
On Tue, Nov 5, 2019 at 10:01 AM Wout Mertens <[hidden email]> wrote:

> On Mon, Nov 4, 2019 at 10:26 PM Jens Alfke <[hidden email]> wrote:
>
> > I don't have a practical use for the branching features, though they're
> cool, but I'm salivating at the thought of a 2x speedup.
> > With all the work that's put into eking out small performance increases
> in SQLite, I'd imagine the devs would be interested in
> > something that made that big of a difference...
>
> What I would like to know is how such a performance increase is
> achieved, and why regular SQLite can't do the same?
>

AFAIK, that was one of the goals of SQLite4 [1], to change the backend to
LSM.
We know now SQLite4 is basically abandoned, but LSM was refactored as an
SQLite3 extension [2].
Here's an article that goes into more depth on the subject [3]. Hope this
helps. --DD

[1] https://sqlite.org/src4/doc/trunk/www/index.wiki
[2] https://www.sqlite.org/src/dir?ci=5710845b6314f924&name=ext/lsm1
[3] https://charlesleifer.com/blog/lsm-key-value-storage-in-sqlite3/
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: SQLite with branching

Jens Alfke-2


> On Nov 5, 2019, at 1:27 AM, Dominique Devienne <[hidden email]> wrote:
>
> AFAIK, that was one of the goals of SQLite4 [1], to change the backend to LSM.

LMDB (LiteTree's back-end) doesn't use LSM; it's a B-tree manager. The speedup appears to come from a combination of techniques like eliminating caching through memory-mapping, and eliminating locking through MVCC. There's a paper describing it in somewhat more detail[1].

I know there is memory-mapping support in SQLite, but it doesn't seem to result in as much speedup, and it has some nasty data-corruption bugs on macOS (possibly iOS too?)

LSM and LMDB would seem to have differing goals — LSM is best for applications with high write throughput, while LMDB is optimized more for read performance. I would guess that the latter is closer to the majority of SQLite use cases, since small/embedded systems tend to take in less data than big servers do.

—Jens

[1]: http://www.lmdb.tech/media/20120829-LinuxCon-MDB-txt.pdf
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: SQLite with branching

Bernardino Ramos
In reply to this post by Simon Slavin-3
Hi!

I am the creator of LiteTree (also LiteReplica, LiteSync and 3 new
products that will be released soon).

When I was planning to add branching I discovered many ways to implement
it. I selected the one that satisfied performance over disk usage. It
can also be implemented the other way around, with low disk space and
slower. Having both high performance and low disk usage is really hard.

The performance comes from using LMDB and from fine-tuning it to reach
this goal (in a safe way).

But it comes with a disadvantage: it uses a lot of disk space when
compared with a normal SQLite db file.

The reason: all past states of the database must be stored if we want to
be able to create new branches from any place, as well as to navigate
the database at any previous point in time.

It does not store the entire db state at each point-in-time, just the
modified pages compared to the previous point.

It also requires a considerable amount of virtual memory space, as (I
guess) in any memory mapped solution.

If you are interested in just the performance without the branching
feature, there are at least 3 options:

1. SQLigthning: I was thinking in updating it to the last version of
SQLite
2. Modified version of LiteTree, without branches
3. SQLite with mmap

I confess that I have not tried SQLite with mmap yet. So maybe it is as
fast as LiteTree, or even faster. IDK

Do not forget that all these 3 options use memory mapping. Consider this
on IoT devices and 32-bit processors.

Options 1 and 2 were in my list, but now I have more important products
being implemented. And option 3 may solve the requirement anyway.


Now let me uncover some differences here:

SQLigthning: Stores SQLite db's rows on LMDB

LiteTree: Stores SQLite db's entire pages on LMDB


Or, showing by SQLite interface level:

SQLigthning: B-tree level

LiteTree: Pager/WAL level


You may wonder how storing an entire db page on another db could be
fast... one trick is to use the SQLite's reserved space feature on each
page, matching the size of the header for overflow pages on LMDB. In
this way a SQLite db page is stored using exactly 4096 bytes on LMDB!
(not counting the required b-tree index)

The same trick could be applied for another WAL file format in a way
that each db page would be stored exactly at disk sector boundaries, off
course having the WAL header using an entire page. The reserved space on
this case would be the same size of a WAL page header. This would not
change the write speed but could make the read of random pages on WAL a
little faster.

This also comes with a disadvantage of using a little more disk space
than normal, and it is not compatible with existing dbs (a new file with
reserved space on each page should be created). So it could only exist
as an option (extension?) or separate product.

In some of my projects I modified the WAL module so the interface is
pluggable like VFS.

But yeah, I do not know whether the results would compensate the effort
on the main trunk.

In some cases it is better to implement a virtual table instead.

Anyway, all of these modifications and derived products are only
possible due to the spectacular work of Richard, as well as Dan and
Mistachkin.

Thank you so much!
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: SQLite with branching

Jens Alfke-2


> On Nov 7, 2019, at 9:02 AM, Bernardo Ramos <[hidden email]> wrote:
>
> If you are interested in just the performance without the branching feature, there are at least 3 options:
>
> 1. SQLigthning: I was thinking in updating it to the last version of SQLite

That would be awesome! I have looked at it a few times, but it's based on such an old version that it's useless to me in its current state.

> 2. Modified version of LiteTree, without branches

I'm curious how performance of LiteTree (w/o branching) compares to SQLightning, i.e. whether storing rows or pages is more efficient. Have you measured?

> 3. SQLite with mmap

IIRC, this is not nearly as fast as either of the LMDB-based approaches. I don't know why; presumably SQLite doesn't make as efficient use of memory-mapping.

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: SQLite with branching

Bernardino Ramos
In reply to this post by Simon Slavin-3

I included WAL mode and mmap on the LiteTree simple benchmark.

It turns out that WAL mode is as fast as LiteTree on Linux (with a hard
disk) for writes and a little slower on reads.

On MacBook Pro (with SSD) LiteTree is faster on both writing and
reading.

SQLite's mmap make it slightly faster than just with WAL. It is faster
than LiteTree on reads (no page data copy on both cases). But this
depends on the benchmark code. Sometimes it is slower than using just
WAL mode (apparently with small dbs).

Sometimes mmap is way faster than all others (on a virtual machine:
Windows hosting Linux).

But honestly, I do not know the reason of these differences.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: SQLite with branching

Robert M. Münch
In reply to this post by Jens Alfke-2
On 4 Nov 2019, at 22:25, Jens Alfke wrote:

>> On Nov 4, 2019, at 4:57 AM, Simon Slavin <[hidden email]> wrote:
>>
>> That's one of the reasons that the source code for SQLite is public: so that people can add the features they want.
>
> Totally agree. However, when you go off the mainline of SQLite you lose some things, like easy updating to new SQLite releases — you now have to deal with merging the new official SQLite into the forked SQLite, or waiting for the fork maintainer to do it.

For such fundamental and big changes, that’s exactly the problem.

For the mainline SQLite we know that the code-base is regularly maintained and updated and that the official extensions work. Hence, I can base a product on it.

For such a big new feature/method, which would be at the core of my product, I either have to maintain it myself (heavy) or hope that it’s not a dead-end… the risk profile is just not good.

Viele Grüsse.

--

Robert M. Münch

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

signature.asc (891 bytes) Download Attachment