LSM database file growth?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

LSM database file growth?

Charles Leifer
Is the LSM database append-only, as in the file size will always grow/never
shrink (even if there are deletions/overwrites)?

Thanks!
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LSM database file growth?

Dan Kennedy-4
On 10/31/2017 10:50 PM, Charles Leifer wrote:
> Is the LSM database append-only, as in the file size will always grow/never
> shrink (even if there are deletions/overwrites)?

An LSM database is basically a series of tree structures on disk. When
you write to an LSM database you add the new key to an in-memory tree.
Once that tree is full it is flushed out to disk. To query the database
for a key, you query each of these trees from newest to oldest until you
find a match or run out of trees. To prevent the number of trees on disk
from growing indefinitely, two or more old trees are periodically merged
together to create a single, larger tree structure.

A DELETE operation is handled like an INSERT, except a special
"delete-marker" key is added to the in-memory tree structure.

When two old trees are merged together, if a delete-marker is merged
with a real insert-key, both are discarded. So that the new tree
contains neither the delete-marker or the insert key.

So if you delete a bunch of data space is reclaimed eventually, but it
can take a while to happen. Optimizing the database involves merging all
trees on disk together, so this has the effect of reclaiming all unused
space:

http://sqlite.org/src4/doc/trunk/www/lsmusr.wiki#database_file_optimization

Dan.





_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: LSM database file growth?

Charles Leifer
Thanks so much, this explains things neatly. I was aware of the tree
compaction and the use of delete tombstones, but wasn't sure how it all
played out in terms of space reclamation. In other words, if the total size
of my keyspace is fixed, then the database won't grow without bounds, even
if keys within that space are frequently updated, deleted, or re-inserted.

On Tue, Oct 31, 2017 at 11:16 AM, Dan Kennedy <[hidden email]> wrote:

> On 10/31/2017 10:50 PM, Charles Leifer wrote:
>
>> Is the LSM database append-only, as in the file size will always
>> grow/never
>> shrink (even if there are deletions/overwrites)?
>>
>
> An LSM database is basically a series of tree structures on disk. When you
> write to an LSM database you add the new key to an in-memory tree. Once
> that tree is full it is flushed out to disk. To query the database for a
> key, you query each of these trees from newest to oldest until you find a
> match or run out of trees. To prevent the number of trees on disk from
> growing indefinitely, two or more old trees are periodically merged
> together to create a single, larger tree structure.
>
> A DELETE operation is handled like an INSERT, except a special
> "delete-marker" key is added to the in-memory tree structure.
>
> When two old trees are merged together, if a delete-marker is merged with
> a real insert-key, both are discarded. So that the new tree contains
> neither the delete-marker or the insert key.
>
> So if you delete a bunch of data space is reclaimed eventually, but it can
> take a while to happen. Optimizing the database involves merging all trees
> on disk together, so this has the effect of reclaiming all unused space:
>
> http://sqlite.org/src4/doc/trunk/www/lsmusr.wiki#database_
> file_optimization
>
> Dan.
>
>
>
>
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users