Safe saving of in-memory db to disk file

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

Safe saving of in-memory db to disk file

ardi
Hi!

I'm going to use sqlite as means of the file format I/O for
applications. One of the critical points in file I/O is saving the
file in a safe way, so that data loss cannot happen (or at least the
risk of happening is minimized as much as possible). Traditionally,
some applications save their files with a temporary name, so that in
the case of system failure you don't lose the old file and the new
file at the same time, and then, when the file saving is finished, the
old file is deleted, and the temporary one is renamed to replace it.

I have read the backup API page (https://www.sqlite.org/backup.html)
that shows how to read a sqlite db from disk to memory, and how to
save it back to disk, but it doesn't talk about the topic of
performing the save in a safe way.

Do you have any recommendation for saving the inmemory db in a safe
way? (by "safe" I mean I don't want the to lose both the old db file
and the inmemory one --however losing the inmemory db would be
reasonable, as it's what obviously happens in a power outage if you
didn't save before).

Would you do it with the sqlite API, or with the OS system calls?

Another scenario of interest would be if the db is really huge and you
consider the possibility of not overwriting the whole old file, but
just committing the changes, in order to save unnecessary disk writes.
The FAQ explains about atomic sqlite writes into the db that also
prevent data loss... but... can you do that with an inmemory db? how?

Thanks in advance!!

ardi
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Safe saving of in-memory db to disk file

Simon Slavin-3
On 14 Jul 2019, at 6:05pm, ardi <[hidden email]> wrote:

> I have read the backup API page (https://www.sqlite.org/backup.html)
> that shows how to read a sqlite db from disk to memory, and how to
> save it back to disk, but it doesn't talk about the topic of
> performing the save in a safe way.

This is a feature which SQLite provides by itself.  If you keep a database on disk, SQLite ensures that a copy of your data – either before or after a change – is always available.  It does this by changing the contents of the file, not by renaming one file and making another.

This is one of the most important features of SQLite.  It means you do not have to write your own software to make it happen.

If your computer crashes while changes are being made, SQLite is always able to rescue one copy of the data from the file.  It may be the copy before the change or the copy after the change, but it will never lose both copies.

> Do you have any recommendation for saving the inmemory db in a safe
> way?

If you make an inmemory database you are yourself responsible for making sure your data is saved on disk.  If you choose to keep your data in memory you are disabling the feature of SQLite you are asking about.

> Another scenario of interest would be if the db is really huge and you
> consider the possibility of not overwriting the whole old file, but
> just committing the changes, in order to save unnecessary disk writes.

Again, this is taken care of automatically if you keep your data on disk.  SQlite does not rewrite the entire data file when changes are made.  It rewrites only the rows which have changed.

It looks as if you have all the features you want, if only you use SQLite to keep data in a database file instead of trying to use the inmemory feature.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Safe saving of in-memory db to disk file

Jens Alfke-2
In reply to this post by ardi

> On Jul 14, 2019, at 10:05 AM, ardi <[hidden email]> wrote:
>
> Do you have any recommendation for saving the inmemory db in a safe
> way?

If you’re going to keep your data in memory, there’s no good reason to use SQLite at all. Just define custom model objects that you can operate on efficiently, and write the data out in a standard format like JSON or XML. (Most frameworks have serialization features that make this easy.)

The point of using a database is that you _don’t_ have all the data in memory, and can still access it quickly. This is attractive if you want to scale to large data sets that won’t fit in RAM or which take too long to read/write from storage. If that’s not an issue for you, don’t use a database; it just adds more complexity to your data model layer.

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Safe saving of in-memory db to disk file

Dominique Devienne
On Mon, Jul 15, 2019 at 7:26 AM Jens Alfke <[hidden email]> wrote:

> > On Jul 14, 2019, at 10:05 AM, ardi <[hidden email]> wrote:
> > Do you have any recommendation for saving the inmemory db in a safe way?
>
> If you’re going to keep your data in memory, there’s no good reason to use
> SQLite at all. Just define custom model objects that you can operate on
> efficiently, and write the data out in a standard format like JSON or XML.
> (Most frameworks have serialization features that make this easy.)
> The point of using a database is that you _don’t_ have all the data in
> memory, and can still access it quickly. This is attractive if you want to
> scale to large data sets that won’t fit in RAM or which take too long to
> read/write from storage. If that’s not an issue for you, don’t use a
> database; it just adds more complexity to your data model layer.
>

I think many of us on this list would beg to differ with that advice Jens.
FWIW. --DD
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Safe saving of in-memory db to disk file

Barry Smith
In reply to this post by Jens Alfke-2
If you use the SQLite backup api to 'backup' (save) the in-memory database
to the disk database, an atomic transaction is used to write the data to
the destination database. This means that all of SQLites usual guarantees
apply: If the operation completes, the disk database will contain a
complete, consistent copy of the in-memory database; If the operation does
not complete the next program to open the disk database will see the hot
journal and roll it back, and see a complete, consistent version of the
on-disk database before you started the backup procedure.

Note that if the database and its journal are separated, or the journal is
deleted, then this guarantee is lost and the database is corrupt. In this
way the backup API isn't much better than your other idea of copy then save.

If performance is important, this is one of the few cases when you will be
much better off using a rollback journal than a WAL journal: If you are
using a rollback journal the final COMMIT merely consists of deleting the
journal, in WAL mode all changes must be written many times.

If performance is critical and you intend to overwrite, then you can
perform VACUUM INTO a new database to disk the use OS calls to replace the
original database. Pretty sure this would be quicker than the backup API,
but you'd have to have custom rollback logic. You'd have to check if your
an OS rename and overwrite is atomic. Not sure on this one, if someone on
the list knows I'm wrong on please correct me.

You mentioned wanting to just be able to track the changes to the in-memory
database, and commit them to disk when the user presses save. SQLite has an
extension to do the hard work of this for you:
https://www.sqlite.org/sessionintro.html . You still get transactions with
all of your atomic guarantees of an always consistent database.

--------------

Off topic: There are many good reasons to use an in-memory SQLite database
for what ardi wants to do. Some of them are:
 - The biggest: You get to query your data using SQL
 - You get constraint enforcement
 - You get indices
 - You don't need to write much code to differentiate between whether
working on the in-memory db copy or an on-disk database (or indeed another
db engine)
 - Client expectations of 'open' and 'save' behaviour prevent you from
writing directly to the on-disk database.

For performance reasons I've had to write custom code where I have changing
data that I need to keep 'indices' on. I found it very difficult to get it
right, and very fragile. If I didn't need the performance, I would much
rather have used an in-memory DB (sadly it just wasn't fast enough).

On Sun, 14 Jul 2019 at 22:26, Jens Alfke <[hidden email]> wrote:

>
> > On Jul 14, 2019, at 10:05 AM, ardi <[hidden email]> wrote:
> >
> > Do you have any recommendation for saving the inmemory db in a safe
> > way?
>
> If you’re going to keep your data in memory, there’s no good reason to use
> SQLite at all. Just define custom model objects that you can operate on
> efficiently, and write the data out in a standard format like JSON or XML.
> (Most frameworks have serialization features that make this easy.)
>
> The point of using a database is that you _don’t_ have all the data in
> memory, and can still access it quickly. This is attractive if you want to
> scale to large data sets that won’t fit in RAM or which take too long to
> read/write from storage. If that’s not an issue for you, don’t use a
> database; it just adds more complexity to your data model layer.
>
> —Jens
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Safe saving of in-memory db to disk file

Dominique Devienne
On Mon, Jul 15, 2019 at 8:24 AM Barry <[hidden email]> wrote:

> For performance reasons I've had to write custom code where I have changing
> data that I need to keep 'indices' on. I found it very difficult to get it
> right, and very fragile. If I didn't need the performance, I would much
> rather have used an in-memory DB (sadly it just wasn't fast enough).
>

That's when you reach for virtual tables (and their "virtual indices").

I.e. you keep your data in native data-structures (Boost.MultiIndex in my
case),
and just provide a SQLite view of it. Much faster than "pure-in-Memory" with
SQLite-managed pager-backed B-tree tables. Steep learning curve, especially
for the indexing part, but the results are well worth it IMHO.

Which can be freely mixed with "real" tables having "real" indexes (in the
in-memory DB).

FWIW. --DD
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Safe saving of in-memory db to disk file

Dan Kennedy-4
In reply to this post by ardi

On 15/7/62 00:05, ardi wrote:

> Hi!
>
> I'm going to use sqlite as means of the file format I/O for
> applications. One of the critical points in file I/O is saving the
> file in a safe way, so that data loss cannot happen (or at least the
> risk of happening is minimized as much as possible). Traditionally,
> some applications save their files with a temporary name, so that in
> the case of system failure you don't lose the old file and the new
> file at the same time, and then, when the file saving is finished, the
> old file is deleted, and the temporary one is renamed to replace it.
>
> I have read the backup API page (https://www.sqlite.org/backup.html)
> that shows how to read a sqlite db from disk to memory, and how to
> save it back to disk, but it doesn't talk about the topic of
> performing the save in a safe way.

It's safe by default.

When you use the online backup API, the destination is written using an
SQLite transaction. So if your app or the system crashes before the
backup is complete, the transaction is rolled back following recovery.

Dan.


>
> Do you have any recommendation for saving the inmemory db in a safe
> way? (by "safe" I mean I don't want the to lose both the old db file
> and the inmemory one --however losing the inmemory db would be
> reasonable, as it's what obviously happens in a power outage if you
> didn't save before).
>
> Would you do it with the sqlite API, or with the OS system calls?
>
> Another scenario of interest would be if the db is really huge and you
> consider the possibility of not overwriting the whole old file, but
> just committing the changes, in order to save unnecessary disk writes.
> The FAQ explains about atomic sqlite writes into the db that also
> prevent data loss... but... can you do that with an inmemory db? how?
>
> Thanks in advance!!
>
> ardi
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Safe saving of in-memory db to disk file

Richard Hipp-3
In reply to this post by Jens Alfke-2
On 7/15/19, Jens Alfke <[hidden email]> wrote:
>
> If you’re going to keep your data in memory, there’s no good reason to use
> SQLite at all.

Yeah there is.  SQLite has a high-level query language (SQL) that can
radically simplify application development.
--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Safe saving of in-memory db to disk file

Nelson, Erik - 2
In reply to this post by Dominique Devienne
Dominique Devienne wrote on Monday, July 15, 2019 2:41 AM

>That's when you reach for virtual tables (and their "virtual indices").

>I.e. you keep your data in native data-structures (Boost.MultiIndex in my case),
>and just provide a SQLite view of it. Much faster than "pure-in-Memory" with
> SQLite-managed pager-backed B-tree tables. Steep learning curve, especially
>for the indexing part, but the results are well worth it IMHO.

> Which can be freely mixed with "real" tables having "real" indexes (in the in-memory DB).

That sounds really intriguing- does it significantly speed up queries coming in through the SQLite engine?  Or the speed bump is only if accessing from the C++-native side?

Is there any literature out there or tips you can share that can flatten the learning curve?

Thanks

Erik

----------------------------------------------------------------------
This message, and any attachments, is for the intended recipient(s) only, may contain information that is privileged, confidential and/or proprietary and subject to important terms and conditions available at http://www.bankofamerica.com/emaildisclaimer.   If you are not the intended recipient, please delete this message.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Safe saving of in-memory db to disk file

ardi
In reply to this post by Richard Hipp-3
On Mon, Jul 15, 2019 at 12:57 PM Richard Hipp <[hidden email]> wrote:
>
> On 7/15/19, Jens Alfke <[hidden email]> wrote:
> >
> > If you’re going to keep your data in memory, there’s no good reason to use
> > SQLite at all.
>
> Yeah there is.  SQLite has a high-level query language (SQL) that can
> radically simplify application development.

In fact, the "idea" of adopting it for app I/O came just when reading
this page in the SQLite website: https://www.sqlite.org/whentouse.html
 ...that page made me realize this is the standard serialization I
want for all (or at least most) of my applications from today on.

Thanks a lot, a lot, a lot!! for all the useful comments posted in
this thread... I didn't know the backup API had the atomic safety. I
need to study all the information you gave me.

Anyway, if I decide to go the "easy way" and perform a complete file
overwrite whenever the application saves the document, would you first
generate a file database with a temporary name without removing the
older file, and then when saving is finished remove the old one and
rename the new? Or wouldn't it be necessary?

Thanks a lot!!

ardi
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Safe saving of in-memory db to disk file

Jens Alfke-2
In reply to this post by Richard Hipp-3


> On Jul 15, 2019, at 3:57 AM, Richard Hipp <[hidden email]> wrote:
>
> Yeah there is.  SQLite has a high-level query language (SQL) that can
> radically simplify application development.

I guess it depends on your programming style. Most apps whose architecture I’m aware of* either wrap something like an ORM (e.g. Core Data) around the database to expose it as native objects, or they just hand-write a CRUD layer to persist their objects. Which is much more complexity than just the classes themselves with a bit of export/import code.

I agree that SQL queries are powerful, but LINQ demonstrates that you can do the same kind of queries using functional-programming operations like `map`, `filter`, `fold`, etc. without having the impedance-mismatch of gluing two extremely different languages together.

I love SQLite! And I have been using & evangelizing it since 2004. But the fact that it’s a very elegant and powerful hammer doesn’t mean that it’s the best tool for pushing thumbtacks into cork boards. :)

(Even if you disagree with the above, I don’t see how one can think that it’s a good idea to read the entire database into memory, work with it there, and then write it all back out to disk. That’s a lot of unnecessary I/O and RAM usage. The only exception I can think of is if the storage medium can’t be counted on to remain online during operation.)

—Jens

* my background is in macOS and iOS, btw
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Safe saving of in-memory db to disk file

Adrian Ho
In reply to this post by Nelson, Erik - 2
On 15/7/19 11:25 PM, Nelson, Erik - 2 wrote:
> Dominique Devienne wrote on Monday, July 15, 2019 2:41 AM
>
>> That's when you reach for virtual tables (and their "virtual indices").
>> I.e. you keep your data in native data-structures (Boost.MultiIndex in my case),
>> and just provide a SQLite view of it. Much faster than "pure-in-Memory" with
>> SQLite-managed pager-backed B-tree tables. Steep learning curve, especially
>> for the indexing part, but the results are well worth it IMHO.
>> Which can be freely mixed with "real" tables having "real" indexes (in the in-memory DB).
> That sounds really intriguing- does it significantly speed up queries coming in through the SQLite engine?  Or the speed bump is only if accessing from the C++-native side?

The whole point of virtual tables is to make something outside an actual
SQLite DB accessible via normal SQL queries within SQLite; you'd still
use the normal access methods (pointer deferencing, <map> lookups, etc.)
from the native side.

Speedup depends entirely on the efficiency of your implementation, of
course.

> Is there any literature out there or tips you can share that can flatten the learning curve?

The definitive documentation on SQLite virtual tables is here:
https://sqlite.org/vtab.html

A list of implementation examples is here: https://sqlite.org/vtablist.html

Of which, a (relatively) simple example of exposing a C array is
described here: https://sqlite.org/carray.html

And whose source code is in your SQLite source distribution under
ext/misc/carray.c, or online here:
https://www.sqlite.org/src/artifact?ci=trunk&filename=ext/misc/carray.c


Best Regards,
Adrian

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Safe saving of in-memory db to disk file

Dominique Devienne
On Tue, Jul 16, 2019 at 4:44 AM Adrian Ho <[hidden email]> wrote:

> On 15/7/19 11:25 PM, Nelson, Erik - 2 wrote:
> > Dominique Devienne wrote on Monday, July 15, 2019 2:41 AM
> >> That's when you reach for virtual tables (and their "virtual indices").
> >> I.e. you keep your data in native data-structures (Boost.MultiIndex in
> my case),
> >> and just provide a SQLite view of it. Much faster than "pure-in-Memory"
> with
> >> SQLite-managed pager-backed B-tree tables. Steep learning curve,
> especially
> >> for the indexing part, but the results are well worth it IMHO.
> >> Which can be freely mixed with "real" tables having "real" indexes (in
> the in-memory DB).
>


> > That sounds really intriguing- does it significantly speed up queries
> coming in through the SQLite engine?

> Or the speed bump is only if accessing from the C++-native side?
>

It's been years since benchmarking this, but you can get 5x faster in some
use cases
(compared to in-memory DB, i.e. no disk IO at all, assuming no swapping of
course).

See, data in SQLite is "packed" into pages, in a compact format, that must
be decoded/encoded
on all reads/writes. Pages must be copied around. BTrees must be traversed,
which means decoding
pages again to traverse them. While your vtables reference C++ structs,
with fields of "native" values.
If your "vtables" are represented by hashes C++ collections, then a
by-primary-key index access is
super fast, and accessing the fields/columns of the selected row needs no
unpacking either.

Even full scans are typically faster too.

The whole point of virtual tables is to make something outside an actual
> SQLite DB accessible via normal SQL queries within SQLite; you'd still
> use the normal access methods (pointer deferencing, <map> lookups, etc.)
> from the native side.
>

Right. The vtables just provide another mechanism to access the same data
already part
of the application and its data model, but unlike native code, it's
flexible and "runtime" configurable.

You can still always access the native C++ API accessing the same
containers and data structures.
But as Richard mentions, the amount of C++ code you replace by crafting a
non-trivial query is sometimes
quite large, and it's often less buggy to write the SQL than the C++.
SQLite is fast enough that unless
it's performance critical code, performance is not an issue at all. Just
look at DRH's own Fossil, and the
super complex queries it runs in mere milliseconds or less.

Once you have SQLite embedded into your app, it tends to sip through, to
all settings/preferences
readily available in (typically "real") tables of the in-memory DB. Easily
queryable and accessible.
And even runtime editable if you allow it (see also "authorizer" in the
doc).

And you can also use it as your data format, of course, as often mentioned.
Even if you "serialize"
your whole memory state on each save, it's still efficient. And allows a
path forward to one day
enable incremental updates of the state as it changes.

IMHO, most desktop apps would greatly benefit from in-memory SQLite
combined with vtables
and custom functions. It's fantastic for debugging and troubleshooting.
It's flexible and dynamic,
allowing to change behavior or appearance of your app and UI at runtime if
you design for it.
SQLite is just a wonderful tool. It's not perfect of course, but it's so
well designed, so robust and
well tested, and so fast when used judiciously, not using it is not an
option for me at least :). --DD
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users