mmap, madvise, mlock and performance

4 messages

mmap, madvise, mlock and performance

Shevek
Hi,

We are running a 100Gb sqlite database, which we mmap entirely into RAM.
We are having trouble with parts of the disk file being evicted from
RAM during periods of low activity, causing slow responses, particularly
before 9am. Has anybody played with mlock and/or madvise within the
sqlite mmap subsystem to improve this behaviour?

The system has a few hundred gig of RAM, no swap, the database is
read-only, and we would prefer a page-out to a process crash, so mlock
might not be ideal, but madvise might not be strong enough?
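For what it's worth, the "madvise might not be strong enough" concern is right: MADV_WILLNEED only asks the kernel to pre-fault the pages, and they remain evictable under pressure. A minimal sketch of issuing the hint from Python (assuming Linux and Python 3.8+, with a tiny temp file standing in for the 100Gb database):

```python
import mmap
import os
import tempfile

# Tiny stand-in for the database file; the real one is ~100Gb.
fd, path = tempfile.mkstemp()
os.write(fd, b"page" * (mmap.PAGESIZE // 4))
os.fsync(fd)

# Read-only shared mapping, as for a read-only database.
mm = mmap.mmap(fd, 0, prot=mmap.PROT_READ, flags=mmap.MAP_SHARED)

# MADV_WILLNEED: ask the kernel to read these pages ahead of time.
# It is advisory only -- the pages can still be evicted later.
mm.madvise(mmap.MADV_WILLNEED)

print(mm[:4])  # b'page'
```

mlock, by contrast, is a hard guarantee, which is exactly why it trades the page-out for a potential allocation failure.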

Thank you.

S.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: mmap, madvise, mlock and performance

Simon Slavin-3
On 3 Aug 2018, at 8:36pm, Shevek <[hidden email]> wrote:

> We are running a 100Gb sqlite database, which we mmap entirely into RAM. We are having trouble with parts of the disk file being evicted from RAM during periods of low activity causing slow responses, particularly before 9am. Has anybody played with mlock and/or madvise within the sqlite mmap subsystem to improve this behaviour?

Is this a genuine Linux machine running on physical hardware, or is it a virtual machine?

Are you intentionally doing anything that would contend for this memory?  In other words, when a memory-mapped portion gets swapped out, does what replaced it make sense, or is it pointless and weird?

Simon.

Re: mmap, madvise, mlock and performance

Warren Young
In reply to this post by Shevek
On Aug 3, 2018, at 1:36 PM, Shevek <[hidden email]> wrote:
>
> the database is read-only

In that case, I’d just create a :memory: DB on application startup, attach to the disk copy, use the INSERT INTO … SELECT pattern [1] to clone the data content within a single transaction, create the indexes, and detach from the on-disk copy of the DB.

[1]: https://stackoverflow.com/a/4291203
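A minimal sketch of that clone sequence, using Python's stdlib sqlite3 bindings and a hypothetical two-row table standing in for the real schema:

```python
import os
import sqlite3
import tempfile

# Hypothetical stand-in for the on-disk database (the real one is ~100Gb).
path = os.path.join(tempfile.mkdtemp(), "data.db")
disk = sqlite3.connect(path)
disk.execute("CREATE TABLE t (k INTEGER PRIMARY KEY, v TEXT)")
disk.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])
disk.commit()
disk.close()

# Attach the disk copy to a :memory: DB, copy the rows within a single
# transaction, create the indexes, then detach.
mem = sqlite3.connect(":memory:")
mem.execute("ATTACH DATABASE ? AS src", (path,))
mem.execute("CREATE TABLE t (k INTEGER PRIMARY KEY, v TEXT)")
mem.execute("INSERT INTO t SELECT * FROM src.t")
mem.execute("CREATE INDEX t_v ON t (v)")
mem.commit()
mem.execute("DETACH DATABASE src")

rows = mem.execute("SELECT k, v FROM t ORDER BY k").fetchall()
print(rows)  # [(1, 'a'), (2, 'b')]
```

Creating the indexes after the bulk INSERT, rather than cloning them, keeps the copy phase to a single sequential scan of the source table.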

Re: mmap, madvise, mlock and performance

Shevek
In reply to this post by Simon Slavin-3
On 08/03/2018 12:55 PM, Simon Slavin wrote:
> On 3 Aug 2018, at 8:36pm, Shevek <[hidden email]> wrote:
>
>> We are running a 100Gb sqlite database, which we mmap entirely into RAM. We are having trouble with parts of the disk file being evicted from RAM during periods of low activity causing slow responses, particularly before 9am. Has anybody played with mlock and/or madvise within the sqlite mmap subsystem to improve this behaviour?
>
> Is this a genuine Linux machine running on physical hardware, or is it a virtual machine ?

Yes, it's a genuine physical machine; we have Xeon and Epyc CPUs
available. Sometimes we have to run in VMs (up to 50Gb), but the bigger
stuff is all physical. We typically have 256Gb+ of RAM, so mmapping a
100Gb database shouldn't put us under any particular memory pressure.

> Are you intentionally doing anything that would contend for this memory ?  In other words, when a memory-mapped portion gets swapped out, does it make sense what replaced it, or is it pointless and weird ?

Sometimes Linux just seems to get unfriendly with a set of pages and
maps them out. I've been watching it all weekend: right now the
system I'm watching has 165Gb free, and I watched Linux dump 40Gb
out of RAM. :-( There are other jobs running on the system and doing
I/O, but nothing that should put any real memory pressure on it,
aside from disk I/O, backups, etc.

We're about to try mlockall(MCL_FUTURE) along with MAP_SHARED. It might
also be worth trying fadvise(), but I think the kernel only honours it
for a few megabytes. We did consider a page-toucher thread, but that
risks causing as much thrashing as it prevents; it might still be
interesting for monitoring page-fault performance.

Later note: mlockall() failed because of the JVM heap; we're going to
have to do something much more specific, like holding a secondary map
and mlocking that.
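A rough sketch of that secondary-map idea (assuming Linux; the libc calls go through ctypes since Python doesn't wrap mlock, and a page-sized temp file stands in for the database):

```python
import ctypes
import ctypes.util
import mmap
import os
import tempfile

libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)

# Stand-in for a secondary MAP_SHARED mapping of the database file.
fd, path = tempfile.mkstemp()
os.write(fd, b"\0" * mmap.PAGESIZE)
mm = mmap.mmap(fd, mmap.PAGESIZE)  # MAP_SHARED by default

# Lock only this mapping, not the whole address space; unlike
# mlockall(MCL_FUTURE) this leaves the JVM heap pageable.
addr = ctypes.addressof(ctypes.c_char.from_buffer(mm))
if libc.mlock(ctypes.c_void_p(addr), mmap.PAGESIZE) != 0:
    # RLIMIT_MEMLOCK may forbid it; the pages then stay evictable.
    print("mlock failed:", os.strerror(ctypes.get_errno()))
else:
    print("locked", mmap.PAGESIZE, "bytes")
    libc.munlock(ctypes.c_void_p(addr), mmap.PAGESIZE)
```

Because mlock works on address ranges rather than whole processes, failure here degrades to ordinary pageable behaviour instead of crashing the process, which matches the page-out-over-crash preference above.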

Warren:

Is the copy-everything-into-memory strategy not prohibitively expensive
at the 100+Gb scale? Is it worth sinking the time into implementing
it? Our rows are very small, only a few bytes each, so the per-row
overhead may be significant. Also, it would be nice to have a shared
mmap rather than entirely private RAM, so we can run experiments over
the shared (read-only) store.

S.