Corrupt index == disk full?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Corrupt index == disk full?

Ward Willats
Hello.

We have a DB with a corrupted index (see below). The database has a max_pages limit that works out to a 10 MB database size (we're embedded).

We can insert into the table the index is on OK.

When we do a SELECT out of the table that uses the index, we get a "disk or database full" error (and not a "db corrupted or not a db" error). Neither the DB nor the disk are actually full, however, max_page_count has been corrupted (should be 9756 with page size of 1024).

Does this seem...well....normal? I mean, I guess with corruption, all bets are off, but I'm curious as to what the mechanism might be that made a full error and not a corrupted error pop out.

Thanks

-- Ward

sqlite> pragma integrity_check;
*** in database main ***
On tree page 6053 cell 29: Failed to read ptrmap key=67699289
On tree page 6053 cell 29: invalid page number 67699289
On tree page 6053 cell 30: Failed to read ptrmap key=151148817
On tree page 6053 cell 30: invalid page number 151148817
Corruption detected in cell 29 on page 6053
Corruption detected in cell 30 on page 6053
Corruption detected in cell 31 on page 6053
Fragmentation of 33 bytes reported as 0 on page 6053
row 1283 missing from index iRareData
row 1284 missing from index iRareData
row 1285 missing from index iRareData
row 1286 missing from index iRareData
row 1287 missing from index iRareData


sqlite> pragma max_page_count;
1073741823
sqlite> pragma page_count;
8683
sqlite> pragma freelist_count;
1393
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Corrupt index == disk full?

Simon Slavin-3
On 10 May 2017, at 6:31pm, Ward WIllats <[hidden email]> wrote:

> I guess with corruption, all bets are off

I see your results from "pragma integrity_check".  As you write, "all bets are off".  

If your database is corrupt, any operation on it might silently lead to more corruption.  It may just happen that your INSERT command doesn’t do anything involving a contradiction in the file, whereas the SELECT command notices two things which contradict each other so is able to definitively report corruption.

You should reconstruct the database, then try again.  The simplest way to do this is to use the SQLite command-line tool to ".dump" the database to a SQL text file, quit the tool, restart the tool with a new database file, then ".read" the text file.

Once you’ve done that, check to see that the results are still smaller than available space (both disk space and max_pages, as you note), run "pragma integrity_check" again.  If it reports everything is okay you can proceed.

Of course, that doesn’t explain how the database got corrupted in the first place.  I assume you’ve read

<https://www.sqlite.org/howtocorrupt.html>

> Does this seem...well....normal? I mean, I guess with corruption, all bets are off, but I'm curious as to what the mechanism might be that made a full error and not a corrupted error pop out.

If SQLite tries to exceed max_pages is it meant to report an error and /not/ corrupt the database.  If you can reproduce a case where running out of space in a non-corrupt database causes corruption please tell us, since that’s a serious bug which will be fixed.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Corrupt index == disk full?

Ward Willats-2

> On May 10, 2017, at 10:49 AM, Simon Slavin <[hidden email]> wrote:
>
> On 10 May 2017, at 6:31pm, Ward WIllats <[hidden email]> wrote:
>
>> I guess with corruption, all bets are off
>
> I see your results from "pragma integrity_check".  As you write, "all bets are off".  

Thanks Simon. I'm not saying there is any kind of bug in SQLite, and, we've never been able to fully run down this corruption (not for lack of trying!) -- right now we just blow the DB away and start over when it is detected. (* hangs head in shame *)

But, we didn't do this in this case because a "disk full" error is not the CORRUPT or NOTADB error we expected to see to trigger the panic. So I was surprised, but...

Undefined behavior is indeed undefined, so anything can happen and that's OK. We'll deal.

Thanks again.

-- Ward


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Corrupt index == disk full?

Ward Willats
In reply to this post by Simon Slavin-3

> On May 10, 2017, at 10:49 AM, Simon Slavin <[hidden email]> wrote:
>
> On 10 May 2017, at 6:31pm, Ward WIllats <[hidden email]> wrote:
>
>> I guess with corruption, all bets are off
>
> I see your results from "pragma integrity_check".  As you write, "all bets are off".  

Thanks Simon. I'm not saying there is any kind of bug in SQLite, and, we've never been able to fully run down this corruption (not for lack of trying!) -- right now we just blow the DB away and start over when it is detected. (* hangs head in shame *)

But, we didn't do this in this case because a "disk full" error is not the CORRUPT or NOTADB error we expected to see to trigger the panic. So I was surprised, but...

Undefined behavior is indeed undefined, so anything can happen and that's OK. We'll deal.

Thanks again.

-- Ward


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Corrupt index == disk full?

Simon Slavin-3

On 10 May 2017, at 9:26pm, Ward WIllats <[hidden email]> wrote:

> But, we didn't do this in this case because a "disk full" error is not the CORRUPT or NOTADB error we expected to see to trigger the panic. So I was surprised, but...

I suspect that your initial post is correct, and the "disk full" error is not genuine, but instead caused by corruption in the file.  Perhaps by the erratic figure for max_page_count that you noted, or some other number associated with page management.

I’m concerned that this sort of corruption is happening repeatedly.  Whether or not there’s anything in howtocorrupt.html that rings any bells, you can rely on us to help figure out what’s wrong.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Corrupt index == disk full?

Ward Willats

> On May 10, 2017, at 2:20 PM, Simon Slavin <[hidden email]> wrote:

> I’m concerned that this sort of corruption is happening repeatedly.  Whether or not there’s anything in howtocorrupt.html that rings any bells, you can rely on us to help figure out what’s wrong.
>

Thanks. That's kind. It could well be memory corruption. We've had a hard time getting Valgrind up on this platform to chase. When we turn all the debug check stuff on in sqlite it doesn't seem to happen. (In fact, it doesn't happen often in any case.)

This use case on this particular DB is dead simple too -- one writer and one reader in different posix threads in the same process, fully serialized.

The unusual parts of our setup are:

1. The DB in questions is on a volatile filesystem and ATTACHED to another, smaller one residing on a flash (JFFS) filesystem, and
2. We use the shady SQLITE_SHM_DIRECTORY compile switch to move the .shm file off JFFS and into RAM so we can WAL, and
3. We're linking against an older version of the MUSL c library which has had bugs in the past on our platform and may have more. (We are moving to a new platform soon so will get to upgrade and isolate this variable.)

#2 is the one I always worry about, but generally it seems to work well.

-- Ward

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Corrupt index == disk full?

Richard Hipp-3
On 5/10/17, Ward WIllats <[hidden email]> wrote:

>
> The unusual parts of our setup are:
>
> 1. The DB in questions is on a volatile filesystem and ATTACHED to another,
> smaller one residing on a flash (JFFS) filesystem, and
> 2. We use the shady SQLITE_SHM_DIRECTORY compile switch to move the .shm
> file off JFFS and into RAM so we can WAL, and
> 3. We're linking against an older version of the MUSL c library which has
> had bugs in the past on our platform and may have more. (We are moving to a
> new platform soon so will get to upgrade and isolate this variable.)
>
> #2 is the one I always worry about, but generally it seems to work well.
>

I think I'd be more worried about the use of JFFS than I would #2.
JFFS did not have a very good reputation the last time I checked,
though it has been a while so perhaps that has changed.  Any reason
not to use EXT4?

Since the reader and writer threads are serialized, can you get by
using the same database connection for them both?  Do you need them to
be isolated from one another?

If you are able to use the same database connection, then you can set
"PRAGMA locking_mode=EXCLUSIVE" and SQLite will use heap-memory for
the SHM the SQLITE_SHM_DIRECTORY compile-time option will become
irrelevant.

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users