Developer questions about the Online Backup API

Developer questions about the Online Backup API

Simon Slavin-3
If the source database is changed while the Online Backup API is running, it returns to the beginning of the database and starts again.  I have a couple of questions which might be useful, especially if the database is changed only by the same connection as is performing the backup.

There's a chunk of headers at the beginning of the database which changes frequently.  Ignoring that chunk for a moment ... two questions:

1) Suppose the first page of the source database which is modified is after the point that the backup has reached.  Is it necessary to restart ?  Could this be detected somehow ?

2) Suppose the first page of the source database which is modified is before the point that the backup has reached.  Could the backup not return just to that point rather than to the very beginning ?

As for the chunk of headers at the very beginning of the database, this should not change size.  It could be updated at the end.

I'm perfectly happy to be told that these optimizations cannot be performed for reasons I don't understand.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: Developer questions about the Online Backup API

Clemens Ladisch
Simon Slavin wrote:
> If the source database is changed while the Online Backup API is
> running, it returns to the beginning of the database and starts again.

The backup API must create a consistent snapshot of the source database,
i.e., the result must be the exact state at some point in time when no
write transaction was active.

> 1) Suppose the first page of the source database which is modified is
>    after the point that the backup has reached.  Is it necessary to
>    restart ?
> 2) Suppose the first page of the source database which is modified is
>    before the point that the backup has reached.  Could the backup not
>    return just to that point rather than to the very beginning ?

In the general case, it is not possible to detect which pages have been
changed.

> ... if the database is changed only by the same connection as is
> performing the backup

This would require additional code to track changed pages, and a lock to
prevent other connections from making changes.


If all connections are on the same machine, it should be possible to use
WAL mode.  You can then do the entire backup in a single step without
blocking writers.
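
A minimal sketch of the single-step approach described above, assuming Python's standard sqlite3 module, whose Connection.backup() is built on the C Online Backup API (sqlite3_backup_init / sqlite3_backup_step / sqlite3_backup_finish). The helper name backup_in_one_step, the file names and the table t are hypothetical, chosen only for illustration. The sketch shows the copy itself and does not demonstrate concurrent writers.

```python
import os
import sqlite3
import tempfile


def backup_in_one_step(src_path, dst_path):
    """Copy the database at src_path into dst_path with one backup step."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        # pages=-1 copies every page in a single sqlite3_backup_step() call.
        src.backup(dst, pages=-1)
    finally:
        dst.close()
        src.close()


with tempfile.TemporaryDirectory() as workdir:
    src_path = os.path.join(workdir, "source.db")  # hypothetical file names
    dst_path = os.path.join(workdir, "backup.db")

    # A separate connection stands in for the application's writer.
    writer = sqlite3.connect(src_path)
    journal_mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    writer.execute("CREATE TABLE t(x INTEGER)")
    writer.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
    writer.commit()

    backup_in_one_step(src_path, dst_path)

    # Open the copy and count what arrived.
    check = sqlite3.connect(dst_path)
    copied_rows = check.execute("SELECT count(*) FROM t").fetchone()[0]
    check.close()
    writer.close()
```

In C the equivalent would be a single sqlite3_backup_step(p, -1) call between sqlite3_backup_init() and sqlite3_backup_finish().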


Regards,
Clemens

Re: Developer questions about the Online Backup API

Dan Kennedy-4
In reply to this post by Simon Slavin-3

On 13/3/62 22:51, Simon Slavin wrote:
> If the source database is changed while the Online Backup API is running, it returns to the beginning of the database and starts again.  I have a couple of questions which might be useful, especially if the database is changed only by the same connection as is performing the backup.

In that case - when the connection doing the backup is the same as the
one that modifies the database - the backup does not restart. Instead,
when the connection writes to the original database, any pages that have
already been copied into the backup are updated there as well.

The backup only has to restart when the connection doing the backup and
the connection doing the db modification are different connections.

From the docs for sqlite3_backup_step():

"If the source database is modified by an external process or via a
database connection other than the one being used by the backup
operation, then the backup will be automatically restarted by the next
call to sqlite3_backup_step(). If the source database is modified by the
using the same database connection as is used by the backup operation,
then the backup database is automatically updated at the same time."

https://www.sqlite.org/c3ref/backup_finish.html#sqlite3backupstep
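
The same-connection case can be tried out with a short sketch, again assuming Python's standard sqlite3 module, whose Connection.backup() is built on the C backup API. The file name, the table t and the callback name write_during_backup are hypothetical. The source is a file database with enough rows that the backup needs many 10-page steps; the progress callback writes to the source through the backing-up connection after the first step. The sketch only checks that the finished backup contains the row written mid-backup; it does not prove by itself that no restart occurred.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    src = sqlite3.connect(os.path.join(workdir, "source.db"))  # hypothetical name
    src.execute("CREATE TABLE t(x INTEGER, pad TEXT)")
    src.executemany(
        "INSERT INTO t VALUES (?, ?)", [(i, "x" * 500) for i in range(2000)]
    )
    src.commit()

    dst = sqlite3.connect(":memory:")
    progress_calls = []

    def write_during_backup(status, remaining, total):
        # Called after every step. On the first call, modify the source
        # through the same connection that is running the backup.
        progress_calls.append((remaining, total))
        if len(progress_calls) == 1:
            src.execute("INSERT INTO t VALUES (-1, 'added mid-backup')")
            src.commit()

    # Copy 10 pages per step so the backup is still unfinished at the write.
    src.backup(dst, pages=10, progress=write_during_backup)

    total_rows = dst.execute("SELECT count(*) FROM t").fetchone()[0]
    marker_rows = dst.execute("SELECT count(*) FROM t WHERE x = -1").fetchone()[0]
    dst.close()
    src.close()
```

The (remaining, total) pairs collected in progress_calls can be printed to watch how the page counts move from step to step.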

Dan.



Re: Developer questions about the Online Backup API

Olivier Mascia
> On 14 March 2019 at 09:27, Dan Kennedy <[hidden email]> wrote:
>
> On 13/3/62 22:51, Simon Slavin wrote:
>> If the source database is changed while the Online Backup API is running, it returns to the beginning of the database and starts again.  I have a couple of questions which might be useful, especially if the database is changed only by the same connection as is performing the backup.
>
> In that case - when the connection doing the backup is the same as the one that modifies the database - the backup does not restart. Instead, when the connection writes to the original database, any pages that have already been copied into the backup are updated there as well.
>
> The backup only has to restart when the connection doing the backup and the connection doing the db modification are different connections.
>
> From the docs for sqlite3_backup_step():
>
> "If the source database is modified by an external process or via a database connection other than the one being used by the backup operation, then the backup will be automatically restarted by the next call to sqlite3_backup_step(). If the source database is modified by the using the same database connection as is used by the backup operation, then the backup database is automatically updated at the same time."
>
> https://www.sqlite.org/c3ref/backup_finish.html#sqlite3backupstep

Dan,

It has already been confirmed I think, but just for 100% clarity: if using WAL and a default (deferred) transaction has been started, the backup can run, step by step in the context of that reader transaction, without blocking writers and other (existing or new) readers.  Is that right?  Or could the detection of writes by other connections kick in anyway and force the backup to needlessly restart?

--
Best Regards,
Olivier Mascia

