Database locked problem on Windows 7

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Database locked problem on Windows 7

Yngve N. Pettersen
Hello all,

I am working on a project involving the Python SQLite3 API for SQLite (a
compile cache system), and I have started running into "database locked"
problems on the Windows 7 Pro machines some of the instances will run on;
the Windows 10 instances works without any problems.

The database is configured with WAL journaling, and when deployed will
have up to 42 active connection at a time. There are three tables, one
containing a blob with information about a source file and a time stamp
(the timestamp is updated each time the entry is used, the blob may be
updated), one with a blob of binary data, and the third have the time
stamp for the corresponding entry in the second table (updated each time
the blob entry is used). The Python SQLite3 connection is started with a
timeout of 100 seconds.

The lock problem does not appear when seeding the database, but that
process involves compiling the source first, so database updates should
be less frequent.

The problem occurs when the data is (mostly) only being pulled out of the
database for reuse, followed by an update of the time stamp the rows used  
(to be
used when removing old entries).

Using a locally built DLL with some printfs I found that it looks like the
problem occurs when winLock/winLockFile is called from sqlite3WalClose to
obtain an exclusive lock; it seems to fail in the Exclusive lock part of
the function. The reported windows error code is 33.

For reference, the folder where the database is stored, is excluded from
AV scans, the disk drives are all local SSDs. The size of the DB is
currently about 4GB. The SQLite DLL version from Python 3.6 (can't use
3.7) is 3.21 (no change with 3.24).

I have found a few "workarounds", most of them not remotely satisfactory:

- Increase busy_timeout to at least 40 seconds, wal_autocheckpoint to
3000+ seems to work, although I have had failures in this scenario, too.
The downside is that this takes (at least) as long as seeding the
database, and 5-6 times as long as the system it is supposed to replace,
and 10(!) times as long as the corresponding tests on Windows 10 (and 7
times as long as on my own Win10 machine with half the cores).

- Use defaults for busy_timeout, wal_autocheckpoint, and use a Windows
mutex to lock access to the database during write operation. This *is*
much faster than the previous example, but that is the best that can be
said for it. It is still 10-20% slower than the old system, takes twice as
long as the Win 10 test (and 20-40% longer than tests on my own machine,
with half the cores). Trying to use multiple mutexes only took longer.

- I tried resuming the operation after a lock, that was also slower than
both the old system and the Win 10 system, and it also ran into trouble
when a repeated insert operation broke the uniqeness constraint (the
transaction had completed and been committed before the database locked
error occurred).

- Reducing number of parallel jobs to 20 on the 32 (logical) core machine
worked (24 failed almost at the end), but the reuse case still took close
to 4 times as long as the old system (and 2/3 of the time used for a
normal seeding using). OTOH, using 20 jobs on a 40 core Win 10 machine
took only twice as long as using all cores (effectively what the old
system used).

- I also tried to rebuild the database without WAL. Seeding the database
took 3 times longer than it did with WAL, and the second stage (reuse)
seems to take as long (stopped it when it had taken longer than a normal
seed, when it was just 25% finished).

(And yes, upgrading to Win 10 may be a possibility, although at least two
of the machines was not able to use Win 10 last time I tried.)

I don't know if this is a problem caused by a problem in SQLite, or if it
as limitation in Windows 7.

That halving the number of jobs takes 4 times as long on Win 7 as the old
job (which indicates that a all cores would still require double the time
of the old system), while the Win 10 version was running just doubling (as
expected) the its original time, which was half of the old system's time,
seems to indicate that there is a performance issue with SQLite on Windows  
7
(and I don't see similar issues with other applications on these machines).
Additionally, since the *mutex* option actually got the run time within
shouting range of the old system for the data reuse case, my guess is that
it is _not_ a file performance problem with Windows 7. OTOH there is still  
a
possibility that there is a Win 7 file locking performance issue, but I  
doubt
it.

Does anyone have any suggestions for how to fix/work around this problem
without losing too much (or any) efficiency?

Thanks in advance.
--
Sincerely,
Yngve N. Pettersen
Vivaldi Technologies AS
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Database locked problem on Windows 7

Yngve N. Pettersen
Hello all,

About four weeks ago, I sent the email quoted below to the list, but  
unfortunately, as far as I can tell, I have so far received no response.

TL;DR: Concurrent attempts to exclusively lock the database tend to fail  
on Windows 7 Pro, there is no similar problem on Windows 10.

The only real updates from what I wrote earlier is that 1) a small  
reorganization of the insertion queries and a single global mutex lock got  
the run time down to about the same as the old system, which is still  
about double the time of Window 10. 2) Using more than one mutex and  
relying on the sqlite exclusive lock to arbitrate between them  
significantly increased the runtime (triple or more).


On Thu, 05 Jul 2018 20:33:44 +0200, Yngve N. Pettersen <[hidden email]>  
wrote:

> Hello all,
>
> I am working on a project involving the Python SQLite3 API for SQLite (a
> compile cache system), and I have started running into "database locked"
> problems on the Windows 7 Pro machines some of the instances will run on;
> the Windows 10 instances works without any problems.
>
> The database is configured with WAL journaling, and when deployed will
> have up to 42 active connection at a time. There are three tables, one
> containing a blob with information about a source file and a time stamp
> (the timestamp is updated each time the entry is used, the blob may be
> updated), one with a blob of binary data, and the third have the time
> stamp for the corresponding entry in the second table (updated each time
> the blob entry is used). The Python SQLite3 connection is started with a
> timeout of 100 seconds.
>
> The lock problem does not appear when seeding the database, but that
> process involves compiling the source first, so database updates should
> be less frequent.
>
> The problem occurs when the data is (mostly) only being pulled out of the
> database for reuse, followed by an update of the time stamp the rows  
> used (to be
> used when removing old entries).
>
> Using a locally built DLL with some printfs I found that it looks like  
> the
> problem occurs when winLock/winLockFile is called from sqlite3WalClose to
> obtain an exclusive lock; it seems to fail in the Exclusive lock part of
> the function. The reported windows error code is 33.
>
> For reference, the folder where the database is stored, is excluded from
> AV scans, the disk drives are all local SSDs. The size of the DB is
> currently about 4GB. The SQLite DLL version from Python 3.6 (can't use
> 3.7) is 3.21 (no change with 3.24).
>
> I have found a few "workarounds", most of them not remotely satisfactory:
>
> - Increase busy_timeout to at least 40 seconds, wal_autocheckpoint to
> 3000+ seems to work, although I have had failures in this scenario, too.
> The downside is that this takes (at least) as long as seeding the
> database, and 5-6 times as long as the system it is supposed to replace,
> and 10(!) times as long as the corresponding tests on Windows 10 (and 7
> times as long as on my own Win10 machine with half the cores).
>
> - Use defaults for busy_timeout, wal_autocheckpoint, and use a Windows
> mutex to lock access to the database during write operation. This *is*
> much faster than the previous example, but that is the best that can be
> said for it. It is still 10-20% slower than the old system, takes twice  
> as
> long as the Win 10 test (and 20-40% longer than tests on my own machine,
> with half the cores). Trying to use multiple mutexes only took longer.
>
> - I tried resuming the operation after a lock, that was also slower than
> both the old system and the Win 10 system, and it also ran into trouble
> when a repeated insert operation broke the uniqeness constraint (the
> transaction had completed and been committed before the database locked
> error occurred).
>
> - Reducing number of parallel jobs to 20 on the 32 (logical) core machine
> worked (24 failed almost at the end), but the reuse case still took close
> to 4 times as long as the old system (and 2/3 of the time used for a
> normal seeding using). OTOH, using 20 jobs on a 40 core Win 10 machine
> took only twice as long as using all cores (effectively what the old
> system used).
>
> - I also tried to rebuild the database without WAL. Seeding the database
> took 3 times longer than it did with WAL, and the second stage (reuse)
> seems to take as long (stopped it when it had taken longer than a normal
> seed, when it was just 25% finished).
>
> (And yes, upgrading to Win 10 may be a possibility, although at least two
> of the machines was not able to use Win 10 last time I tried.)
>
> I don't know if this is a problem caused by a problem in SQLite, or if it
> as limitation in Windows 7.
>
> That halving the number of jobs takes 4 times as long on Win 7 as the old
> job (which indicates that a all cores would still require double the time
> of the old system), while the Win 10 version was running just doubling  
> (as
> expected) the its original time, which was half of the old system's time,
> seems to indicate that there is a performance issue with SQLite on  
> Windows 7
> (and I don't see similar issues with other applications on these  
> machines).
> Additionally, since the *mutex* option actually got the run time within
> shouting range of the old system for the data reuse case, my guess is  
> that
> it is _not_ a file performance problem with Windows 7. OTOH there is  
> still a
> possibility that there is a Win 7 file locking performance issue, but I  
> doubt
> it.
>
> Does anyone have any suggestions for how to fix/work around this problem
> without losing too much (or any) efficiency?
>
> Thanks in advance.


--
Sincerely,
Yngve N. Pettersen
Vivaldi Technologies AS
Phone: +47 40 40 14 82
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Database locked problem on Windows 7

Richard Hipp-3
On 7/31/18, Yngve N. Pettersen <[hidden email]> wrote:
> I sent the email quoted below to the list, but
> unfortunately, as far as I can tell, I have so far received no response.
>

I think that means that nobody has an answer.  I don't have any idea
why your system would work well on Win10 but not on Win7.  SQLite
should work the same on both.

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Database locked problem on Windows 7

Yngve N. Pettersen
On Tue, 31 Jul 2018 18:00:31 +0200, Richard Hipp <[hidden email]> wrote:

> On 7/31/18, Yngve N. Pettersen <[hidden email]> wrote:
>> I sent the email quoted below to the list, but
>> unfortunately, as far as I can tell, I have so far received no response.
>>
>
> I think that means that nobody has an answer.  I don't have any idea
> why your system would work well on Win10 but not on Win7.  SQLite
> should work the same on both.

I may actually just have found the root cause of the problem.

The distributed executable is built using pyinstaller for Python 3.6, and  
was built on Windows 10.  I started to wonder if this was relevant to the  
problem, and if building the executable on Windows 7 Pro would help fix  
the problem.

One of the main "irritants" of this system is that it is warning about  
some Windows API DLLs that are missing, and is required by Python 3.6. I  
have previously ignored these warnings.

A bit of digging indicates that these warnings may indeed be relevant to  
the problem. The DLLs are handled invisibly by Windows 10, but exists on  
pre-Win10 systems. See  
<https://github.com/pyinstaller/pyinstaller/issues/1566> and  
<https://github.com/pyinstaller/pyinstaller/commit/7876ec0f8727e97c60e2e71c42bb1688dd41623d>  
. One either have to build on Win7 to get these DLLs included, or take  
some advanced actions when building the executable.

I copied the whole source over to a Win 7 machine, and built it there, and  
had no problems running the full test without encountering any lock  
issues, the test was also slightly faster than before, even without the  
hacks I had used earlier.

IOW: AFAICT If a project using Python Sqlite3 needs to run on Windows 7  
and the executable is generated with Pyinstaller, then the easiest way to  
avoid issues is to generate the executable on a Windows 7 system.

This might conceivably apply to other kinds of projects, too, depending on  
the build environment.

--
Sincerely,
Yngve N. Pettersen
Vivaldi Technologies AS
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users