fsync on -wal still happening

classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

fsync on -wal still happening

Ian Freeman
I have configured journal_mode=WAL, synchronous=NORMAL, and disabled
autocheckpointing, and yet the -wal file is modified after every update
operation...

That's not the documented behavior as I understand it. This smells like
a bug to me, but before I go that direction, is there an explanation
for this behavior?

My simple test code loops an update query followed by a sleep call, and
inotifywait shows a MODIFY for each of those update queries.

I'm confused. Thanks for any light you can shed.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: fsync on -wal still happening

Richard Hipp-3
On 12/28/17, Ian Freeman <[hidden email]> wrote:
> I have configured journal_mode=WAL, synchronous=NORMAL, and disabled
> autocheckpointing, and yet the -wal file is modified after every update
> operation...

That is correct.  "sync" means forcing the data from the internal
operating-system buffers out to persistent storage on your disk drive,
where it will survive and OS reset or power loss.  The WAL is written
on every transaction, in the sense that the content is moved into the
operating system.  But it is only synced to persistent storage on a
checkpoint.

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: fsync on -wal still happening

Ian Freeman
I see, then what I'm seeing is just normal behavior of the writes being
flushed to disk. I read what I wanted to hear about synchronous=NORMAL
delaying writes to the -wal file. Instead I'm going to see if I can
move -wal to a ramdisk and see how that will affect db integrity with
power losses at in-opportune times.

Thank you.

On Fri, 2017-12-29 at 11:17 -0500, Richard Hipp wrote:

> On 12/28/17, Ian Freeman <[hidden email]> wrote:
> > I have configured journal_mode=WAL, synchronous=NORMAL, and
> > disabled
> > autocheckpointing, and yet the -wal file is modified after every
> > update
> > operation...
>
> That is correct.  "sync" means forcing the data from the internal
> operating-system buffers out to persistent storage on your disk
> drive,
> where it will survive and OS reset or power loss.  The WAL is written
> on every transaction, in the sense that the content is moved into the
> operating system.  But it is only synced to persistent storage on a
> checkpoint.
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: fsync on -wal still happening

Simon Slavin-3


On 29 Dec 2017, at 4:34pm, Ian Freeman <[hidden email]> wrote:

> I see, then what I'm seeing is just normal behavior of the writes being
> flushed to disk. I read what I wanted to hear about synchronous=NORMAL
> delaying writes to the -wal file. Instead I'm going to see if I can
> move -wal to a ramdisk and see how that will affect db integrity with
> power losses at in-opportune times.

Did you see

PRAGMA journal_mode = MEMORY

?

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: fsync on -wal still happening

Ian Freeman
Thanks, Simon. Indeed I did see that option. But I'm concerned about
maintaining integrity in the middle of an power-interrupted commit, so
I was hoping to leverage WAL's inverted behavior of not modifying the
database file directly until checkpoint time. The only thing left to
investigate is how WAL handles power loss during the middle of a
checkpoint. Normally I believe it's able to recover because the next db
open has the -wal file to look at. But not if I'm storing it in-memory.

So I'm thinking I will copy the -wal file to disk before the checkpoint
op for safety, and then load it back onto ramdisk after. I suppose I
may have to exit WAL mode in order to perform that copy, making
checkpointing a very expensive operation for my app.

On Fri, 2017-12-29 at 16:48 +0000, Simon Slavin wrote:

>
> On 29 Dec 2017, at 4:34pm, Ian Freeman <[hidden email]> wrote:
>
> > I see, then what I'm seeing is just normal behavior of the writes
> > being
> > flushed to disk. I read what I wanted to hear about
> > synchronous=NORMAL
> > delaying writes to the -wal file. Instead I'm going to see if I can
> > move -wal to a ramdisk and see how that will affect db integrity
> > with
> > power losses at in-opportune times.
>
> Did you see
>
> PRAGMA journal_mode = MEMORY
>
> ?
>
> Simon.
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: fsync on -wal still happening

Barry Smith
I believe the inotifywait does not actually wait for the fsync operation before notifying.

Process A can write to a file, the OS can cache it without flushing to disk, and a request by process B will be served directly from the cache. Therefore the operating system can notify you of the change before it's written to persistent storage.

So, in normal synchronous mode, SQLite doesn't cache its writes, it just waits before calling fsync and relies on the OS cache.

Perhaps benchmarking before engaging in such an endeavour as you plan would indicate whether the OS cache gives you sufficient performance.

> On 29 Dec 2017, at 12:07 pm, Ian Freeman <[hidden email]> wrote:
>
> Thanks, Simon. Indeed I did see that option. But I'm concerned about
> maintaining integrity in the middle of an power-interrupted commit, so
> I was hoping to leverage WAL's inverted behavior of not modifying the
> database file directly until checkpoint time. The only thing left to
> investigate is how WAL handles power loss during the middle of a
> checkpoint. Normally I believe it's able to recover because the next db
> open has the -wal file to look at. But not if I'm storing it in-memory.
>
> So I'm thinking I will copy the -wal file to disk before the checkpoint
> op for safety, and then load it back onto ramdisk after. I suppose I
> may have to exit WAL mode in order to perform that copy, making
> checkpointing a very expensive operation for my app.
>
>>> On Fri, 2017-12-29 at 16:48 +0000, Simon Slavin wrote:
>>>
>>> On 29 Dec 2017, at 4:34pm, Ian Freeman <[hidden email]> wrote:
>>>
>>> I see, then what I'm seeing is just normal behavior of the writes
>>> being
>>> flushed to disk. I read what I wanted to hear about
>>> synchronous=NORMAL
>>> delaying writes to the -wal file. Instead I'm going to see if I can
>>> move -wal to a ramdisk and see how that will affect db integrity
>>> with
>>> power losses at in-opportune times.
>>
>> Did you see
>>
>> PRAGMA journal_mode = MEMORY
>>
>> ?
>>
>> Simon.
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: fsync on -wal still happening

Simon Slavin-3
In reply to this post by Ian Freeman


On 29 Dec 2017, at 7:07pm, Ian Freeman <[hidden email]> wrote:

> Thanks, Simon. Indeed I did see that option. But I'm concerned about
> maintaining integrity in the middle of an power-interrupted commit,

This is not a problem for either WAL mode or the older journal modes.  If you avoid these commands

PRAGMA journal_mode = OFF
PRAGMA journal_mode = MEMORY
PRAGMA synchronous = OFF
PRAGMA synchronous = NORMAL

then SQLite will ensure that the database file always reflects your data as it was before the more recent COMMIT or after the most recent COMMIT.

SQLite has a lot of code in to ensure the best handling in the case of power-loss.  Each time a database file is reopened SQLite looks for tell-tales that it wasn’t properly closed, and uses a number of strategies to restore one of those two situations.  This includes comparing the state of the database file with the states of the associated files (-wal, -shm, etc.).

Because of this, do not delete/rename any of the associated files even when SQLite does not have them open.  SQLite must be able to find these files /in the same folder as the database file/.  If you ever back up one of those files, back them all up, including the database file, as a unit.  They are useless without copies of all the other files as they were at the same time.

Since the "synchronous" setting is default, the advice for installations where power-loss is not unexpected is to issue

PRAGMA synchronous = EXTRA

, to have the database in WAL mode if you want it (neither safer nor less safer with respect to power-loss), and leave everything else as it is.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: fsync on -wal still happening

Ian Freeman
On Fri, 2017-12-29 at 22:18 +0000, Simon Slavin wrote:

>
> On 29 Dec 2017, at 7:07pm, Ian Freeman <[hidden email]> wrote:
>
> > Thanks, Simon. Indeed I did see that option. But I'm concerned
> > about
> > maintaining integrity in the middle of an power-interrupted commit,
>
> This is not a problem for either WAL mode or the older journal
> modes.  If you avoid these commands
>
> PRAGMA journal_mode = OFF
> PRAGMA journal_mode = MEMORY
> PRAGMA synchronous = OFF
> PRAGMA synchronous = NORMAL
>
> then SQLite will ensure that the database file always reflects your
> data as it was before the more recent COMMIT or after the most recent
> COMMIT.
>

Agreed, the defaults would be very safe. But I realize I never
mentioned what my actual goal was: minimizing disk writes for a write-
heavy application. That's why I want to run -wal out of memory, or set
synchronous=NORMAL. Of course, I cannot do this at the expense of
integrity.

There are other ways to solve my problem, for example I could operate
solely in memory and perform periodic online backups. But then I'm
dumping entire databases to disk when I really only need to capture a
single row's update, etc. Those could be minimized with breaking up the
database into sub-sections, etc. But if I can just keep the -wal
updates off of the disk until checkpoint, it would be the best
solution.

Thanks, all. I'm still open to your appreciated input as I mull this
over.


On Fri, 2017-12-29 at 15:09 -0700, Barry Smith wrote:
> I believe the inotifywait does not actually wait for the fsync
> operation before notifying.

Barry Smith is making me question my inotifywait results. I need to
look into that further.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: fsync on -wal still happening

Simon Slavin-3
On 30 Dec 2017, at 5:37am, Ian Freeman <[hidden email]> wrote:

> Agreed, the defaults would be very safe. But I realize I never
> mentioned what my actual goal was: minimizing disk writes for a write-
> heavy application. That's why I want to run -wal out of memory, or set
> synchronous=NORMAL. Of course, I cannot do this at the expense of
> integrity.

Accumulate your change commands in memory as text.  Eventually execute them all inside BEGIN … COMMIT.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: fsync on -wal still happening

Ian Freeman
On Sat, 2017-12-30 at 05:40 +0000, Simon Slavin wrote:
> Accumulate your change commands in memory as text.  Eventually
> execute them all inside BEGIN … COMMIT.

I like it; why didn't I think of it? Don't you love those moments?

The application would need severe retooling for reads unless I also
have duplicate memory versions of the table state, but it is a
solution. With this scheme you could also load the entire db into
memory without the expense of having to dump the entire db to disk at
"checkpoints".
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: fsync on -wal still happening

Keith Medcalf
In reply to this post by Ian Freeman

One presumes that you mayhaps read the documentation?

"In WAL mode when synchronous is NORMAL (1), the WAL file is synchronized before each checkpoint and the database file is synchronized after each completed checkpoint and the WAL file header is synchronized when a WAL file begins to be reused after a checkpoint, but no sync operations occur during most transactions. With synchronous=FULL in WAL mode, an additional sync operation of the WAL file happens after each transaction commit. The extra WAL sync following each transaction help ensure that transactions are durable across a power loss. Transactions are consistent with or without the extra syncs provided by synchronous=FULL. If durability is not a concern, then synchronous=NORMAL is normally all one needs in WAL mode."

So, the question is whether or not you want D in your ACID.  If all you need is ACI, then synchronous=NORMAL guarantees that.  synchronous=FULL adds the D to your ACI giving you ACID.

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.

>-----Original Message-----
>From: sqlite-users [mailto:sqlite-users-
>[hidden email]] On Behalf Of Ian Freeman
>Sent: Friday, 29 December, 2017 22:38
>To: SQLite mailing list
>Subject: Re: [sqlite] fsync on -wal still happening
>
>On Fri, 2017-12-29 at 22:18 +0000, Simon Slavin wrote:
>>
>> On 29 Dec 2017, at 7:07pm, Ian Freeman <[hidden email]> wrote:
>>
>> > Thanks, Simon. Indeed I did see that option. But I'm concerned
>> > about
>> > maintaining integrity in the middle of an power-interrupted
>commit,
>>
>> This is not a problem for either WAL mode or the older journal
>> modes.  If you avoid these commands
>>
>> PRAGMA journal_mode = OFF
>> PRAGMA journal_mode = MEMORY
>> PRAGMA synchronous = OFF
>> PRAGMA synchronous = NORMAL
>>
>> then SQLite will ensure that the database file always reflects your
>> data as it was before the more recent COMMIT or after the most
>recent
>> COMMIT.
>>
>
>Agreed, the defaults would be very safe. But I realize I never
>mentioned what my actual goal was: minimizing disk writes for a
>write-
>heavy application. That's why I want to run -wal out of memory, or
>set
>synchronous=NORMAL. Of course, I cannot do this at the expense of
>integrity.
>
>There are other ways to solve my problem, for example I could operate
>solely in memory and perform periodic online backups. But then I'm
>dumping entire databases to disk when I really only need to capture a
>single row's update, etc. Those could be minimized with breaking up
>the
>database into sub-sections, etc. But if I can just keep the -wal
>updates off of the disk until checkpoint, it would be the best
>solution.
>
>Thanks, all. I'm still open to your appreciated input as I mull this
>over.
>
>
>On Fri, 2017-12-29 at 15:09 -0700, Barry Smith wrote:
>> I believe the inotifywait does not actually wait for the fsync
>> operation before notifying.
>
>Barry Smith is making me question my inotifywait results. I need to
>look into that further.
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: fsync on -wal still happening

Ian Freeman
On 30.12.2017 01:14, Keith Medcalf wrote:

> One presumes that you mayhaps read the documentation?
>
> "In WAL mode when synchronous is NORMAL (1), the WAL file is
> synchronized before each checkpoint and the database file is
> synchronized after each completed checkpoint and the WAL file header
> is synchronized when a WAL file begins to be reused after a
> checkpoint, but no sync operations occur during most transactions.
> With synchronous=FULL in WAL mode, an additional sync operation of the
> WAL file happens after each transaction commit. The extra WAL sync
> following each transaction help ensure that transactions are durable
> across a power loss. Transactions are consistent with or without the
> extra syncs provided by synchronous=FULL. If durability is not a
> concern, then synchronous=NORMAL is normally all one needs in WAL
> mode."
>
> So, the question is whether or not you want D in your ACID.  If all
> you need is ACI, then synchronous=NORMAL guarantees that.
> synchronous=FULL adds the D to your ACI giving you ACID.

Naturally, and of course. The point of my initial post was that I was
still seeing sync operations with synch=NORMAL when I shouldn't have
according to the below docs (now debatable whether they were actually
written to disk or just the os cache). I purposely configured synch to
NORMAL to avoid the additional sync to the -wal file, but was still
seeing it in operation.

Supposing I am able to determine if the MODIFY operation is or is not an
actual disk write (a quick look at the inotify API suggests this is not
possible, I next have to look at the filesystem APIs), I may or may not
have to operate my own in-memory WAL as previously discussed to avoid
this unwanted behavior. Regardless, it may be that I'm attempting to
squeeze too much savings out of a delayed sync operation to begin with,
and need to look towards another method to begin with.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: fsync on -wal still happening

Simon Slavin-3


On 30 Dec 2017, at 4:38pm, Ian Freeman <[hidden email]> wrote:

> Naturally, and of course. The point of my initial post was that I was still seeing sync operations with synch=NORMAL when I shouldn't have according to the below docs (now debatable whether they were actually written to disk or just the os cache). I purposely configured synch to NORMAL to avoid the additional sync to the -wal file, but was still seeing it in operation.

I can explain that.

Sync happens automatically when the OS wants it to happen.  This may be when a cache busts and has to be written to disk, though some OSen have a regular timer and do a sync every minute (for example) and others trigger a sync when swapping a process out.  You cannot prevent these syncs at the App level: the OS or File System triggers them.

What the SQLite setting does is tell SQLite to ask for /additional/ syncs when SQLite thinks they’re needed.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: fsync on -wal still happening

Keith Medcalf
In reply to this post by Ian Freeman

>Naturally, and of course. The point of my initial post was that I was
>still seeing sync operations with synch=NORMAL when I shouldn't have
>according to the below docs (now debatable whether they were actually
>written to disk or just the os cache). I purposely configured synch
>to
>NORMAL to avoid the additional sync to the -wal file, but was still
>seeing it in operation.

Interesting.  In fact I believe I saw the same problem/issue with batch loading of a "bunch" of files into a database.  Re-running the update is not a problem but database consistency was.  Interestingly, I ended up using savepoint's for each batch and doing one commit to the WAL file.  The I/O rate was greatly reduced even though the update (import) touched a whole raft of database pages.  This used synchronous=normal.  If something crashed the database was consistent and the "load" operation could simply be done again.  (I/O rate was reduced from huge multi MB/s with the begin-commit to the KB/s with the savepoints and there did not appear to be any fsyncs other than when the transaction was committed (which also did a checkpoint)).

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users