Sudden error 26 / 11

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Sudden error 26 / 11

Jim Borden
I’m having an issue with a library I am writing.  This has not happened before in the 1 ½ years I have been developing the library, but for some reason now it’s rearing its ugly head from time to time.  The library will be happily running along and then suddenly a SELECT statement will return error code 26 upon step.  After a while of this, it will change to error 11 instead.  The database has become corrupted and the integrity check reports a different kind of problem with a different page each time.  This seems to happen more often on Windows tablets.  I have only observed the behavior in real time once (and it was on a VM) but logs sent from a client have shown this behavior quite a few times.  Mostly they are using SQLCipher based on SQLite 3.8.10.2, but I observed the behavior with regular SQLite 3.12.2

The library manages logical databases, which contain a few backends (one being SQLite).  Each logical database contains two connections to an SQLite database.  A read only connection that is used for reads outside the library, and a read / write connection used for writing and internal reads during transactions.  The read only connection has full mutex enabled (though I’m not sure if this is necessary) so that I can share it between threads without caring.  The RW connection is pumped through a queue on a single thread and all other threads must wait their turn until their job is finished.  I confirmed that no writes are taking place outside of this thread.  This connection is in “multi thread” mode I think (compiled with thread safety on but opened with NO_MUTEX).

The library caches and manages logical databases so whenever a consuming app requests one they will either get a new one if one hasn’t been made or the same one if it has.  This ensures that only one logical database is ever interacting with the sqlite connection handles of an sqlite database.  The logs from the client and my own observations have backed this up.  

From what I have read, error 11 is extremely hard to cause through library usage alone.  The key offences seem to be:

1) Using two versions of SQLite at once in an application
2) Bad OS file locking
3) A rogue process writing garbage into the file

The library is written in C#, and it’s true that two versions of SQLite can be used with it (regular SQLite or SQLCipher), however I have limited this so once the user has chosen the implementation they want to use they are not allowed to change it after that.  The logs and my observations have shown that only one version of the storage engine is constructed (each storage engine binds with a native library via P/Invoke).  

My ultimate question is:  How can I figure out where the corruption is happening when it does?  I’ve learned quite a bit about the actual file format of SQLite and spent a number of hours pouring through the SQLite files I got from the client with a hex editor looking for signs of bad things getting written but in the end that was a bit difficult for me.  The thing is, by the time the symptoms show it is already past when the corruption occurred from what I gather.  There are no network file systems involved in this, it’s purely local.  Are there any things I should be on the look out for?  I’d appreciate any advice.  


--Jim

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Sudden error 26 / 11

Simon Slavin-3

On 23 Jun 2016, at 11:13pm, Jim Borden <[hidden email]> wrote:

> From what I have read, error 11 is extremely hard to cause through library usage alone.  The key offences seem to be:
>
> 1) Using two versions of SQLite at once in an application
> 2) Bad OS file locking
> 3) A rogue process writing garbage into the file

Two possibilities missing from the above list:

4) A rogue process writing garbage into memory SQLite thinks it owns
5) Hardware fault

Problem (1) is not quite right.  Apart from a couple of very obscure versions here's no problem with different versions of SQLite.  The problem is for one application to be using two SQLite libraries at once, even if they are the same version.  It's difficult to do this accidentally.


As for investigating the problem, I can only suggest something drastic:

Run "PRAGMA quick_check" after every command which makes any change to the database and trigger an immediate crash if it fails.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Sudden error 26 / 11

Richard Hipp-3
In reply to this post by Jim Borden
On Thu, Jun 23, 2016 at 6:13 PM, Jim Borden <[hidden email]> wrote:
> The library will be happily running along and then suddenly a SELECT statement will return error code 26 upon step.

Error code 26 is SQLITE_NOTADB.  That only happens when SQLite is
reading the 100-byte header at the beginning of the database file and
finds that header is correct.  This is given a discinct error code
because usually that finding means that you are trying to open
something that is not really an SQLite database file.

Debugging hints:

(1) Set error logging on your application:  https://www.sqlite.org/errlog.html

(2) Compile the "showdb" utility program (using "make showdb") and
then run "showdb DATABASE dbheader" on a corrupt database file.  Send
the results to this list.  (There are many other useful run-time
options for "showdb" that you might also find useful.)


>
>
> This connection is in “multi thread” mode I think (compiled with thread safety on but opened with NO_MUTEX).

Does the problem go away if you se the threading mode to "serialized"?

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Sudden error 26 / 11

Jim Borden
Hello all,

Sorry for the delay.  It’s really hard to get this error to happen but I have some more relevant information.

I observed a read only connection (there are two connections open to the DB, one for reading only and one for writing / internal reading) suddenly start returning error code 26, and then error code 11 for a query so I pulled the database file.  It’s encrypted with SQLCipher, but I opened it with the sqlcipher command line and ran PRAGMA quick_check.  Normally when a corruption occurs this will show me what happened, but in this case it returned “ok.”  However, the query results never ended.  I killed the process when it passed 2 GB of output from a 6 MB database from a SELECT query.  This seems like corruption that the PRAGMA missed.  SELECT

One thing to note that I found was at one point during the testing today I found that sqlite3_prepare_v2 crashed while attempting to access invalid memory.  This led me to research prepared statements and find that operations on them are not actually thread safe.  So I will be changing the write connection mode to serialized and testing further but I wanted to report on the odd behavior I found above.  

I’m having trouble getting any output from the sqlite error log since this is a C# application.  I have written a P Invoke binding to the method but I’m unsure if it is succeeding or not.  How often should I expect output from it?  Is there a way I can force it to output something to test it?

As for the second suggestion, I am able to open the database with the sqlcipher command line so I assume that the database header is intact.  There were two instances of reported corruption today (perhaps three if I interpret the invalid memory access as the same problem showing a different result).  One showed “bTreeInitPage() returned error code 11” for some page in the 5000 range (this is usually the error I get, with various pages being listed as the bad ones between runs) and the other showed “ok” but had the never ending SELECT output on certain queries.

Please let me know if anything comes to mind.  

Jim Borden
Software Engineer

[hidden email]

On 6/24/16, 4:25 AM, "[hidden email] on behalf of Richard Hipp" <[hidden email] on behalf of [hidden email]> wrote:

>On Thu, Jun 23, 2016 at 6:13 PM, Jim Borden <[hidden email]> wrote:
>> The library will be happily running along and then suddenly a SELECT statement will return error code 26 upon step.
>
>Error code 26 is SQLITE_NOTADB.  That only happens when SQLite is
>reading the 100-byte header at the beginning of the database file and
>finds that header is correct.  This is given a discinct error code
>because usually that finding means that you are trying to open
>something that is not really an SQLite database file.
>
>Debugging hints:
>
>(1) Set error logging on your application:  https://www.sqlite.org/errlog.html
>
>(2) Compile the "showdb" utility program (using "make showdb") and
>then run "showdb DATABASE dbheader" on a corrupt database file.  Send
>the results to this list.  (There are many other useful run-time
>options for "showdb" that you might also find useful.)
>
>
>>
>>
>> This connection is in “multi thread” mode I think (compiled with thread safety on but opened with NO_MUTEX).
>
>Does the problem go away if you se the threading mode to "serialized"?
>
>--
>D. Richard Hipp
>[hidden email]
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Sudden error 26 / 11

Simon Slavin-3

On 30 Jun 2016, at 3:11pm, Jim Borden <[hidden email]> wrote:

> There were two instances of reported corruption today

Are you using any pragmas ?  Are you using threads ?  Are you using more than one process in one program ?

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Sudden error 26 / 11

Jim Borden
The pragmas I am using are just journal mode WAL and the add on pragma from sqlcipher to set the key.  I am using threads fairly heavily (or rather C# is via Tasks and asynchronous ops), but here is the overall model:

Read connection (read only) shared between threads freely.  I can pump this through the single thread as well, but it will still suffer from an error in prepared statement handling (see below on write connection)
Write connection (also internal reads) pumped through a single thread for writes.  However, I realized that prepared statements were only being created on the special thread but being executed on others.

I read something that indicated that SQLite is never safe to be used from two threads at once, but it was labeled as outdated and seems to be contradicted by the newer information.

I am only using one process in the program.

Jim Borden
Software Engineer

[hidden email]

On 6/30/16, 6:26 PM, "[hidden email] on behalf of Simon Slavin" <[hidden email] on behalf of [hidden email]> wrote:

>
>On 30 Jun 2016, at 3:11pm, Jim Borden <[hidden email]> wrote:
>
>> There were two instances of reported corruption today
>
>Are you using any pragmas ?  Are you using threads ?  Are you using more than one process in one program ?
>
>Simon.
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Sudden error 26 / 11

Jim Borden
Hey all,

As a follow up, the suggestion to change the write connection from multithreaded to serialized mode seems to have done the trick.  We set up some automated testing and we now have approximately 300 runs in a row with no corruption (as a control, with the multithreaded mode the database corrupted approximately 19 times in 147 runs).

Jim Borden
Software Engineer

[hidden email]

On 6/30/16, 9:38 PM, "[hidden email] on behalf of Jim Borden" <[hidden email] on behalf of [hidden email]> wrote:

>The pragmas I am using are just journal mode WAL and the add on pragma from sqlcipher to set the key.  I am using threads fairly heavily (or rather C# is via Tasks and asynchronous ops), but here is the overall model:
>
>Read connection (read only) shared between threads freely.  I can pump this through the single thread as well, but it will still suffer from an error in prepared statement handling (see below on write connection)
>Write connection (also internal reads) pumped through a single thread for writes.  However, I realized that prepared statements were only being created on the special thread but being executed on others.
>
>I read something that indicated that SQLite is never safe to be used from two threads at once, but it was labeled as outdated and seems to be contradicted by the newer information.
>
>I am only using one process in the program.
>
>Jim Borden
>Software Engineer
>
>[hidden email]
>
>On 6/30/16, 6:26 PM, "[hidden email] on behalf of Simon Slavin" <[hidden email] on behalf of [hidden email]> wrote:
>
>>
>>On 30 Jun 2016, at 3:11pm, Jim Borden <[hidden email]> wrote:
>>
>>> There were two instances of reported corruption today
>>
>>Are you using any pragmas ?  Are you using threads ?  Are you using more than one process in one program ?
>>
>>Simon.
>>_______________________________________________
>>sqlite-users mailing list
>>[hidden email]
>>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Sudden error 26 / 11

LincolnBurrows
This post has NOT been accepted by the mailing list yet.
In reply to this post by Jim Borden
There is another post that i have found suitable for resolving sqlite error code 11