SQLITE_MAX_MMAP_SIZE 2GB default

Carl Edquist
Hi there,

My apologies if this has been answered in the past, but is there a reason
for the ~2GB default for SQLITE_MAX_MMAP_SIZE?

  #ifndef SQLITE_MAX_MMAP_SIZE
  ...
  #   define SQLITE_MAX_MMAP_SIZE 0x7fff0000  /* 2147418112 */

It makes sense for 32bit platforms, but for 64bit with larger databases
(where 2GB ends up being a small fraction of the db file), the default max
effectively takes away the benefit of the feature.

I can compile my own binary of course, but it would be convenient to have
it supported out of the box.

Just wondering if there is a reason not to increase the max for 64bit
builds (to some large value like 1TB)?  Especially given that the
default mmap size is much lower in any case until it is increased (with
PRAGMA mmap_size).


Thanks..!

Carl

Re: SQLITE_MAX_MMAP_SIZE 2GB default

Jens Alfke


> On Apr 18, 2019, at 7:33 PM, Carl Edquist <[hidden email]> wrote:
>
> Just wondering if there is a reason not to increase the max for 64bit builds (to some large value like 1TB)?  Especially given that the default mmap size is much lower in any case until it is increased (with PRAGMA mmap_size).

2GB still seems reasonable as a default, since SQLite is not primarily targeted at servers with oodles of RAM.
(Remember, "64-bit" includes nearly all current mobile phones/tablets, which still have small amounts of RAM, 2GB or less.)
You can easily override the default by #defining a higher value yourself, right?
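
For instance, roughly like this when building the amalgamation for a
64bit target (the 1TB value is arbitrary):

  cc -DSQLITE_MAX_MMAP_SIZE=0x10000000000 sqlite3.c shell.c -o sqlite3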

—Jens

Re: SQLITE_MAX_MMAP_SIZE 2GB default

Carl Edquist
Hi Jens, thanks for your reply.

> 2GB still still seems reasonable as a default, since SQLite is not
> primarily targeted at servers with oodles of RAM. (Remember, "64-bit"
> includes nearly all current mobile phones/tablets, which still have
> small amounts of RAM, 2GB or less.)

My comment about 32bit vs 64bit wasn't really about the ram available on
the system, just about integer sizes and the address space available for
an mmap.

For instance - if you have a 30GB db file on a 64bit system with <= 2GB
ram, you can still mmap the whole file, and benefit from that mmap.  If
the portion of the db that gets used for a query fits within the available
pagecache ram, it's a clear win.  (It's not like the whole file
automatically gets read from disk into the pagecache for the mmap.)
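
(To illustrate with plain POSIX mmap: mapping a file only reserves
address space, and nothing is read from disk until a page is actually
touched.  A minimal sketch, error handling mostly omitted and the
filename just a placeholder:)

  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/mman.h>
  #include <sys/stat.h>
  #include <unistd.h>

  int main(void) {
      int fd = open("big.db", O_RDONLY);  /* placeholder filename */
      struct stat st;
      if (fd < 0 || fstat(fd, &st) != 0) return 1;

      /* Reserves address space for the whole file; no disk reads yet. */
      char *p = mmap(0, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
      if (p == MAP_FAILED) return 1;

      /* Touching one byte faults in just that page from disk. */
      printf("%d\n", p[12345]);

      munmap(p, st.st_size);
      close(fd);
      return 0;
  }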

But even if the whole file is used for the query (that is, more than fits
into pagecache/ram), it still has the benefit of avoiding the system calls
for the file seeks/reads.  (Either way the kernel needs to swap disk pages
into/out of the pagecache.)

What's more, (if I understand correctly) the default
SQLITE_DEFAULT_MMAP_SIZE is zero on all platforms, so mmap_size starts at
0 unless raised with the PRAGMA at runtime.  So applications already have
to explicitly opt in to mmaps of a particular size.  My question is: at
least on 64bit platforms, why is the default to *prevent* manually
raising mmap_size past 2GB at runtime?

I guess I'm curious what the rationale is for this hard limit.
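
To illustrate, a minimal sketch of what I mean (error checking omitted;
"big.db" is just a placeholder).  With a stock build, the value you read
back is capped at just under 2GB no matter what you ask for:

  #include <stdio.h>
  #include <sqlite3.h>

  int main(void) {
      sqlite3 *db;
      sqlite3_stmt *st;
      if (sqlite3_open("big.db", &db) != SQLITE_OK) return 1;

      /* Ask for a 1TB mmap at runtime ... */
      sqlite3_exec(db, "PRAGMA mmap_size=1099511627776;", 0, 0, 0);

      /* ... and read back the value actually in effect, which is
         silently capped at the compile-time SQLITE_MAX_MMAP_SIZE. */
      sqlite3_prepare_v2(db, "PRAGMA mmap_size;", -1, &st, 0);
      if (sqlite3_step(st) == SQLITE_ROW)
          printf("effective mmap_size: %lld\n",
                 (long long)sqlite3_column_int64(st, 0));
      sqlite3_finalize(st);
      sqlite3_close(db);
      return 0;
  }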


> You can easily override the default by #defining a higher value
> yourself, right?

Yes, of course, I compile my own now, defining a higher
SQLITE_MAX_MMAP_SIZE.  But when I want to run an application (e.g., the
sqlite3 command line util or the python library) across many systems
that I don't control, it would be convenient if I could use the stock
versions installed rather than having to compile my own everywhere.

Again the question I'm curious about is, why should users have to compile
their own versions just to be able to raise PRAGMA mmap_size at runtime?


Thanks..!

Carl

Re: SQLITE_MAX_MMAP_SIZE 2GB default

Jens Alfke


> On Apr 19, 2019, at 12:46 PM, Carl Edquist <[hidden email]> wrote:
>
> For instance - if you have a 30GB db file on a 64bit system with <= 2GB ram, you can still mmap the whole file, and benefit from that mmap.  If the portion of the db that gets used for a query fits within the available pagecache ram, it's a clear win.  (It's not like the whole file automatically gets read from disk into the pagecache for the mmap.)

Oops, you’re right. I somehow lost sight of the obvious when replying…

> But even if the whole file is used for the query (that is, more than fits into pagecache/ram), it still has the benefit of avoiding the system calls for the file seeks/reads.  (Either way the kernel needs to swap disk pages into/out of the pagecache.)

Sort of. Most current OSs have a universal buffer cache, wherein filesystem caches and VM pages use the same RAM buffers. A page-fault and a file read will incur similar amounts of work. The big benefit is that the memory-mapped pages can be evicted from RAM when needed for other stuff, whereas a malloc-ed page cache is considered dirty and has to be swapped out before the RAM page can be reused. (That said, I am not a kernel or filesystem guru so I am probably oversimplifying.)

But yeah, I agree with you that it seems odd to have a compiled-in restriction on the maximum memory-map size.

—Jens

Re: SQLITE_MAX_MMAP_SIZE 2GB default

Carl Edquist
Thanks Jens!

On Mon, 22 Apr 2019, Jens Alfke wrote:

> But yeah, I agree with you that it seems odd to have a compiled-in
> restriction on the maximum memory-map size.

I looked a bit into the history, and it appears the just-under-2GB limit
was specifically put there (by drh) so that the value would fit into a
signed 32-bit integer:

  https://www.sqlite.org/src/info/460752b857532016

Maybe this was in order to fit into a ssize_t on 32-bit platforms?  (Just
a guess, as mmap takes its length as a size_t.)

If that's the reason, would drh & the sqlite team consider changing the
default SQLITE_MAX_MMAP_SIZE definition from 0x7fff0000 to SSIZE_MAX
(defined in limits.h)?
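
That is, something along these lines (just a sketch of the idea, not an
actual patch):

  #include <limits.h>  /* SSIZE_MAX */

  #ifndef SQLITE_MAX_MMAP_SIZE
  #   define SQLITE_MAX_MMAP_SIZE SSIZE_MAX  /* instead of 0x7fff0000 */
  #endif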


...


Somewhat of an aside:

> Most current OSs have a universal buffer cache, wherein filesystem
> caches and VM pages use the same RAM buffers. A page-fault and a file
> read will incur similar amounts of work. The big benefit is that the
> memory-mapped pages can be evicted from RAM when needed for other stuff,
> whereas a malloc-ed page cache is considered dirty and has to be swapped
> out before the RAM page can be reused.

Right, I think we're agreeing here.  I just mean that if a db file is much
larger than the available ram, and all of it is used for a query, then
whether bytes come in via mmap+memcpy() or regular file read(), the kernel
will page-in missing pages into its page cache, and (at its discretion) it
will eventually have to evict those pages to make room for others.  (I
didn't mean to make reference to disk swap or sqlite's user-space
pagecache.)  The point is that the kernel page loading/evicting part is
the same, but the mmap also saves the overhead for the seek/read system
calls.
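
Per page, the two access paths look roughly like this (just a sketch;
the 4KB page size and pgno are only for illustration):

  #include <string.h>
  #include <unistd.h>

  /* One 4KB page via seek+read: two system calls every time. */
  void get_page_read(int fd, long pgno, char *buf) {
      lseek(fd, (off_t)pgno * 4096, SEEK_SET);
      read(fd, buf, 4096);
  }

  /* The same page via an existing mmap: no system calls; the kernel
     faults the page in from disk only if it isn't already resident. */
  void get_page_mmap(const char *map, long pgno, char *buf) {
      memcpy(buf, map + (size_t)pgno * 4096, 4096);
  }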

And I say this from my personal experience with sqlite -- when doing a
long-running join on a db that does not fit into ram, with the timer on, I
observed that in the regular non-mmap mode, the system time for the query
would be about equal to the user time.  But when I mmap'ed the whole file,
the system time for the query was < 1% of the user time, and the user time
was also less than it had been in non-mmap mode.  I can only guess the
huge difference in the system time was mainly the kernel handling all the
seek/read system calls, vs just memcpy when the whole db file is mmap'ed.

The bottom line in any case is I saw a substantial speedup for long
queries when mmap'ing a large db, even when it did not fit into ram.


Thanks..!

Carl