Coping with database growth/fragmentation

Coping with database growth/fragmentation

Taras Glek-3
  Hello,
Recently I spent some time investigating sqlite IO patterns in Mozilla.
Two issues came up: keeping sqlite files from getting fragmented and
fixing fragmented sqlite files.

First on fixing fragmentation:
Currently we write pretty heavily to our databases. This causes the
databases to grow and queries to slow down. VACUUM shrinks the databases
back to a reasonable size (among other things), and queries speed up
again, as expected. The problem is that the databases get fragmented as
they grow, and VACUUM only reduces filesystem fragmentation in
proportion to the reduction in file size.
It seems like the most obvious solution is to do some sort of hot copy
and then switch to a new database connection on the copy. It would be
nice if sqlite could do this by default.
Is the process described in http://www.sqlite.org/backup.html the best
way to accomplish this?
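For reference, the page at that URL describes the online backup API, which is exactly this kind of page-by-page hot copy. Python's sqlite3 module exposes it as Connection.backup(), so a minimal sketch of the copy-and-switch idea looks like this (the paths and test table are hypothetical; in practice src would be the live database):

```python
import os
import sqlite3
import tempfile

# Hypothetical paths; substitute your live database for src_path.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "old.sqlite")
dst_path = os.path.join(workdir, "compact.sqlite")

src = sqlite3.connect(src_path)
src.execute("CREATE TABLE t(x)")
src.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])
src.commit()

# Online backup: pages are copied incrementally while readers can
# still use src; the freshly written copy comes out laid out
# sequentially, so it is defragmented at the database-page level.
dst = sqlite3.connect(dst_path)
src.backup(dst)

print(dst.execute("SELECT count(*) FROM t").fetchone()[0])  # → 100
```

After the backup completes, the application would close src and reopen its connection on dst_path (or rename the file into place).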


Avoiding Fragmentation:
Writing to the db file causes fragmentation when the underlying
filesystem fails to anticipate sqlite's growth pattern. It seems like
the easiest fix here is to add a pragma, fs_allocation_size, that would
preallocate a contiguous chunk of disk space. SQLite would behave
exactly as it does now, except that the file size would always be kept
at a multiple of fs_allocation_size.
For example, pragma fs_allocation_size=50M would grow the db file to
50 megabytes. Once the db grows beyond 50 MB, the underlying file would
get resized to 100 MB.
Clearly one could implement something like this via VFS handlers, but it
would be nice to do it at a slightly higher level. Where in the code
would this be appropriate? It would be great to get some guidance so I
could submit a patch for this.
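As a sanity check on the arithmetic, the proposed rounding rule can be sketched as a tiny helper. Note that fs_allocation_size and padded_size are hypothetical names from the proposal above, not an existing SQLite API:

```python
import math

def padded_size(db_bytes, fs_allocation_size=50 * 1024 * 1024):
    """Round a database size up to the next multiple of the
    (hypothetical) fs_allocation_size, as the proposal describes:
    a 1 MB db occupies a 50 MB file, and crossing 50 MB of real
    data grows the file to 100 MB."""
    if db_bytes == 0:
        return 0
    return math.ceil(db_bytes / fs_allocation_size) * fs_allocation_size

MB = 1024 * 1024
print(padded_size(1 * MB) // MB)   # → 50
print(padded_size(51 * MB) // MB)  # → 100
```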

I am pretty new to sqlite, so I could be on the wrong track here. I
would like to hear an expert opinion on these two approaches.

There is also approach #3 of running a defrag tool to fix this, but I'd
rather fix the cause of fragmentation.

Thanks,
Taras

ps. For details see my blog post on fragmentation:
http://blog.mozilla.com/tglek/2010/07/22/file-fragmentation/
_______________________________________________
sqlite-users mailing list
[hidden email]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: Coping with database growth/fragmentation

Martin Engelschalk
  Hello Taras, List,

I have been fighting the same problems described here for a long time
and have no really elegant solution, so the pragma the OP proposes below
would be ideal for me too.
The proposed pragma could also accept a number of pages to be allocated
at once instead of a number of bytes.

In my case, the database grows continuously, and the file is often
extremely fragmented by the time the growth phase is finished (this
concerns the file on the disk, not internal fragmentation).

Currently, I monitor the size of the database using pragma
freelist_count. When I see the number of free pages approach zero, I
create a dummy table with a blob field and fill it with a very large
empty blob. Then I drop the table. The empty pages remain behind, and
page_count does not rise any more for a while.
This was proposed to me on this list a while ago.

However, polling the database this way and creating and dropping the
table carry a performance penalty, and finding the strategic places in
my application to do this has been difficult.
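The workaround above fits in a few lines. A minimal sketch using Python's sqlite3 module (the table name _pad and the 1 MB reserve size are arbitrary choices, and it assumes auto_vacuum is off, which is the default):

```python
import sqlite3

def reserve_pages(conn, blob_bytes=1 << 20):
    """Martin's workaround: when the freelist runs dry, grow the
    file once with a dummy blob table, then drop the table. The
    freed pages stay in the file and are reused by later inserts,
    so page_count holds steady for a while."""
    free = conn.execute("PRAGMA freelist_count").fetchone()[0]
    if free == 0:
        conn.execute("CREATE TABLE _pad(b BLOB)")
        # zeroblob(N) allocates N bytes of zeros without building
        # the blob in memory first.
        conn.execute("INSERT INTO _pad VALUES (zeroblob(?))", (blob_bytes,))
        conn.execute("DROP TABLE _pad")
        conn.commit()
    return conn.execute("PRAGMA freelist_count").fetchone()[0]
```

Calling this at quiet moments (e.g. after a batch of writes) approximates the preallocation pragma from the application side, at the cost of the polling overhead Martin mentions.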

Martin


Am 23.07.2010 03:11, schrieb Taras Glek:

<snip>
>   Seems like the
> easiest fix here is to add a pragma fs_allocation_size. It would
> preallocate a continuous chunk of diskspace. Sqlite would behave exactly
> as it does now, except it would avoid truncating the file beyond a
> multiple of the fs_allocation_size.
> For example, pragma fs_allocation_size=50M would grow the db file to
> 50megabytes. Once the db grows to beyond 50mb the underlying file would
> get resized to 100mb.
<snip>

Re: Coping with database growth/fragmentation

Simon Slavin-3
In reply to this post by Taras Glek-3

On 23 Jul 2010, at 2:11am, Taras Glek wrote:

> Recently I spent some time investigating sqlite IO patterns in Mozilla.
> Two issues came up: keeping sqlite files from getting fragmented and
> fixing fragmented sqlite files.

If I understand correctly, there are two levels of fragmentation involved: disk fragmentation (where the sectors of the disk file are spread about) and database fragmentation (where the pages of database information are spread about the disk file).

> First on fixing fragmentation:
> Currently we write pretty heavily to our databases. This causes the
> databases to grow, queries to slow down.

Can I check that you are not just seeing fragmentation, but are actually seeing performance vary with fragmentation?  Because a noticeable slowdown from fragmentation is something traditionally associated only with Windows, and your blog entry says you're using ext4 on Linux.  Other platforms and file systems /have/ fragmentation, of course, but it doesn't normally slow them down as much as it slows down Windows.

Some platforms handle this in unexpected ways.  For instance, OS X will automatically defragment files smaller than 20MB each time they're opened.  It won't defragment the database pages because, of course, it doesn't understand the SQLite file format.

The easiest way to make a defragmented copy of a SQLite file is to use the command-line tool to .dump the database to a text file, then .read that text file into a fresh database.  Under Unix you can do it in one command:

sqlite3 old_database.sqlite .dump | sqlite3 new_database.sqlite

The resulting SQLite database file will not only be defragmented but will have some other optimal characteristics.  I would be interested to know whether you really do see performance improvements by doing this and then replacing old_database.sqlite with new_database.sqlite.
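The same dump-and-reload round trip can be driven from Python via Connection.iterdump(), which yields the .dump output statement by statement (the paths and test table here are hypothetical stand-ins):

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
old = sqlite3.connect(os.path.join(workdir, "old_database.sqlite"))
old.execute("CREATE TABLE t(x)")
old.execute("INSERT INTO t VALUES (1)")
old.commit()

# Equivalent of:
#   sqlite3 old_database.sqlite .dump | sqlite3 new_database.sqlite
# iterdump() emits the full SQL dump, which executescript() replays
# into the fresh file, writing its pages out in order.
new = sqlite3.connect(os.path.join(workdir, "new_database.sqlite"))
new.executescript("\n".join(old.iterdump()))

print(new.execute("SELECT x FROM t").fetchone()[0])  # → 1
```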

Simon.

Re: Coping with database growth/fragmentation

Taras Glek-3
In reply to this post by Martin Engelschalk
On 07/23/2010 04:38 AM, Martin Engelschalk wrote:

>    Hello Taras, List,
>
> I have been fighting the same problems described here for a long time,
> and have no real elegant solution. So, the proposed solution of the OP
> below would be ideal for me too.
> The proposed pragma could also define a number of pages to be allocated
> at once instead of a number of bytes.
>
> In my case, the database grows continously and the file is often
> extremely fragmented when the growth phase is finished (this concerns
> the file on the disk, not internal fragmentation)
>
> Currently, i monitor the size of the database using pragma
> freelist_count. When I see the value of free pages approach zero, i
> create a dummy table with a blob field and fill it with a very large
> empty blob. Then i drop the table. The empty pages remain behind and
> page_count does not rise any more for a time.
> This has been proposed to me on this list a while ago.
>
> However, testing the database in this way and creating and dropping the
> table carries a performance penalty, and finding the strategic places in
> my application to do this has been difficult.

Yeah, sounds like the same problem. Interesting workaround.

Here is my "fix". https://bugzilla.mozilla.org/show_bug.cgi?id=581606

This dramatically reduces fragmentation for append-only workloads.

Taras

Re: Coping with database growth/fragmentation

Max Vlasov
In reply to this post by Taras Glek-3
On Fri, Jul 23, 2010 at 5:11 AM, Taras Glek <[hidden email]> wrote:

>  Hello,
> Recently I spent some time investigating sqlite IO patterns in Mozilla.
> Two issues came up: keeping sqlite files from getting fragmented and
> fixing fragmented sqlite files.
>
>
Funny, this is why I like reading other people's questions on this list:
it sometimes helps solve old problems :). As many have noticed, the
Windows system cache is hard to control. For example, it is almost
impossible to clear the read cache for testing purposes; once you have
read a file it is in the cache, so any subsequent timing numbers are
meaningless. CreateFile has a flag, FILE_FLAG_NO_BUFFERING, that
disables caching for a file when you want to work with it that way. I
thought about changing the sources and preparing a special version of
sqlite that can open a database without caching. But a better solution,
at least on XP, turned up: if I "touch" a file with
CreateFile(..., FILE_FLAG_NO_BUFFERING) and close it, the cache is not
used the next time the file is opened. So for testing purposes I just
added a checkbox to an admin tool that "touches" the file before passing
it to sqlite, and it really seems to work.

So, Taras, thanks for your post :)

With this approach I also tried to test places.sqlite, in particular the
moz_places table. The query was

SELECT * FROM moz_places WHERE url LIKE "%double%"

I suppose the Mozilla team probably uses different queries, but since
there is no recognizable FTS table, there must be some kind of full scan
involved.

So, my tests on two hard drives showed that Windows file fragmentation
had a small effect on the performance of the query, while VACUUM had a
significant one. Before VACUUM, my long-lived places.sqlite, 13 MB in
size with 16893 records in moz_places, returned results after 8-10
seconds, depending on where the file lived; after VACUUM, the results
came back in 150-300 ms.

I think this is due to the nature of the Windows read-ahead cache: when
sqlite tables are stored in consecutive blocks of the file, Windows
loads the same pages sqlite is about to request. So before any
filesystem defragmentation, internal sqlite defragmentation (VACUUM)
has to be applied.
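The VACUUM effect is easy to reproduce in miniature. A sketch (with an arbitrary synthetic table standing in for moz_places) that churns a database and checks that VACUUM rewrites the file smaller, with pages in logical order:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "churn.sqlite")
db = sqlite3.connect(path)
db.execute("CREATE TABLE t(x)")
db.executemany("INSERT INTO t VALUES (?)",
               [("x" * 100,) for _ in range(5000)])
db.commit()

db.execute("DELETE FROM t")  # pages go to the freelist; file keeps its size
db.commit()
before = os.path.getsize(path)

db.execute("VACUUM")         # rebuilds the file with pages packed in order
after = os.path.getsize(path)
print(after < before)  # → True
```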

Max Vlasov,
maxerist.net