SQLite in memory


SQLite in memory

Gabriele Lanaro
Hi, I'm trying to assess if the performance of my application is dependent
on disk access from sqlite.

To rule this out I wanted to make sure that the SQLite DB is completely
accessed from memory and there are no disk accesses.

Is it possible to obtain this effect by using pragmas such as cache_size?

Another solution is to copy the existing db to a :memory: db but I'd like
to achieve the same effect without doing so (because it will require
substantial modification of the application). For the sake of argument,
let's imagine that using a :memory: db is not an option.

Also using a ramdisk is not an option because I don't have root access to
the machine.

Thanks,

Gabriele
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: SQLite in memory

petern
From the SQLite shell (CLI), have you tried the dot commands ".backup" (to a file) and ".restore" (into a new :memory: DB)?  That assumes a few things, like access to the filesystem and a sufficient user memory quota to hold the disk version of the DB.  Does that work?

The shell dot commands and their syntax are explained here:
https://sqlite.org/cli.html
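The same online-backup mechanism behind .backup/.restore is also exposed programmatically; for instance, Python's stdlib sqlite3 module (3.7+) can copy a disk DB into a :memory: DB in one call. A minimal sketch (the path and schema are purely illustrative):

```python
import sqlite3
import tempfile, os

# Build a small on-disk DB as a stand-in for the real application DB.
path = os.path.join(tempfile.mkdtemp(), "app.db")
disk = sqlite3.connect(path)
disk.executescript("CREATE TABLE t(x); INSERT INTO t VALUES (1), (2);")
disk.commit()

# Copy the whole DB into a fresh :memory: connection using the online
# backup API -- the programmatic equivalent of the shell's .backup/.restore.
mem = sqlite3.connect(":memory:")
disk.backup(mem)
disk.close()

# All further queries on `mem` are served entirely from RAM.
print(mem.execute("SELECT count(*) FROM t").fetchone()[0])  # → 2
```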




Re: SQLite in memory

Paul
In reply to this post by Gabriele Lanaro
If by any chance you have access to Linux or the like, you can just mount a ramfs and move the database file there.
It is an ordinary file system that lives entirely in RAM. This guarantees 100% that SQLite makes no disk accesses.



Re: SQLite in memory

Keith Medcalf
In reply to this post by Gabriele Lanaro
On Wednesday, 17 May, 2017 23:18, Gabriele Lanaro <[hidden email]> wrote:

> Hi, I'm trying to assess if the performance of my application is dependent
> on disk access from sqlite.

Of course it is, depending on what your application is doing.
 
> To rule this out I wanted to make sure that the SQLite DB is completely
> accessed from memory and there are no disk accesses.

You still haven't stated why you think disk access may be the bottleneck and what sort of operations that you are doing that you think are problematic.
 
> Is it possible to obtain this effect by using pragmas such as cache_size?

Yes and no.  It depends what kind of operations you are doing.  For example, assuming you have enough RAM, then once pre-charged even a modern crappy filesystem cache will short-circuit reads so that they do not hit the disk.  Nothing can be done for writes, though (except the old-timer adage that the fastest way to do I/O is not to do it, which applies in spades).
 
> Another solution is to copy the existing db to a :memory: db but I'd like
> to achieve the same effect without doing so (because it will require
> substantial modification of the application). For the sake of argument,
> let's image that using :memory: db is not an option.

Ok.
 
> Also using a ramdisk is not an option because I don't have root access to
> the machine.

That would imply that the machine has a concept of "root", which means that it is not Windows, MVS, or another operating system that calls that level of access something other than "root".

Perhaps you can provide some actually useful information, such as:  the OS you are using, what type of operations you think are impacted (made slow) by I/O, whether or not the I/O channel "gets full" (ie, is the bottleneck), whether the CPU is fully consumed.  Etc.

And yes, Disk I/O makes a huge difference.  I have things that run on a "real computer (my laptop)" with an SSD hard drive that does I/O at 2.5 GB/s (yes, those are big-B Bytes and big G american billions) and has a significant sized "block" cache (as opposed to the modern preponderance of mostly useless filesystem caches) which can run a workload in 1/250th of the time taken compared to running the exact same workload on a typical "server class" machine configured with a high-speed hardware raid attached spinning rust.  The "server class" hardware has significantly better CPU, but the bottleneck (in this particular case) is apparently I/O.

Have you identified the "parts you think are slow" and instrumented them to see what is happening?  Although the SQLite query optimizer is very good, it does sometimes make incorrect or unnecessary optimizations that can severely impact performance.  Not to mention that there are a lot of other factors in play such as OS, filesystem, CPU, RAM, Virtualization, etc.







Re: SQLite in memory

Eduardo Morras-2
In reply to this post by Gabriele Lanaro

What OS are you using?

You can use these tips to make the app less dependent on disk I/O:

a) change where temporary tables (materialized views, subqueries, temp
tables) and transient indices are stored, so they always live in RAM
(PRAGMA temp_store=2),

b) increase the cache size; the more, the better (the OS disk cache is
shared with other processes and is slower). If the cache is equal to or
bigger than your db, the whole db will fit in RAM,

c) set the config option SQLITE_CONFIG_STMTJRNL_SPILL to -1 if you use
statement journal files (check http://www.sqlite.org/tempfiles.html#stmtjrnl),

d) use WAL journal mode to avoid rollback-journal files, and tune WAL
checkpointing,

e) use the mmap_size pragma to minimize I/O (check
http://www.sqlite.org/mmap.html; it has disadvantages too)
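Tips (a), (b), (d), and (e) can all be applied per-connection via pragmas; here is a hedged sketch in Python (the path is illustrative, and note that (c), SQLITE_CONFIG_STMTJRNL_SPILL, is a C-API sqlite3_config() setting, not a pragma, so it is not shown):

```python
import sqlite3, tempfile, os

db_path = os.path.join(tempfile.mkdtemp(), "tips_demo.db")
conn = sqlite3.connect(db_path)

# (a) keep temp tables and transient indices in RAM (2 = MEMORY)
conn.execute("PRAGMA temp_store = 2")
# (b) a negative cache_size is in KiB: -262144 is roughly a 256 MiB page cache
conn.execute("PRAGMA cache_size = -262144")
# (d) WAL journal mode; checkpoint every ~1000 pages rather than eagerly
conn.execute("PRAGMA journal_mode = WAL")
conn.execute("PRAGMA wal_autocheckpoint = 1000")
# (e) memory-map up to 256 MiB of the DB file (may be capped or disabled
#     by the build; query PRAGMA mmap_size afterwards to see what stuck)
conn.execute("PRAGMA mmap_size = 268435456")

print(conn.execute("PRAGMA temp_store").fetchone()[0],
      conn.execute("PRAGMA journal_mode").fetchone()[0])  # → 2 wal
```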



HTH

---   ---
Eduardo Morras <[hidden email]>

Re: SQLite in memory

Gabriele Lanaro
In reply to this post by Gabriele Lanaro
Thanks everyone for all the tips! This is all very useful.

We are using SQLite’s FTS5 feature to search a large number of text files.
There are 50M records in total but they are split across 1000 smaller
databases of 50K records each. Each DB is 250MB in size.

I am trying to test query performance, and for that I am issuing queries for
the same term over and over on a random subset of the databases.

Each query will execute on its own thread (I am using Python’s CherryPy
server) and the connection to each DB will be kept alive (multiple requests
will certainly hit the same connection). I am using PRAGMA
cache_size=256MB, so it should cache the entire DB in memory.

If I only used a single database, I would expect the query to be served
entirely from the cached copy of the DB. Since we have 100 of those, I
could reasonably expect some reading from disk.

But for a smaller number of DBs, say 10, I would expect performance
similar to in-memory.

The schema roughly consists of two tables: one is a full-text index,
fts5_table, while the other is called datatable.

The query in question is a full text query on the full text index joined
with another table.

SELECT fts5_table.column, datatable.column2
FROM fts5_table, datatable
WHERE fts5_table MATCH 'term'
  AND datatable.id = fts5_table.rowid
ORDER BY rank ASC
LIMIT 10;

The total number of 'term' matches is about 30000 documents (given the
ORDER BY clause, this means that all of those documents will need to be
ranked).
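A self-contained toy reproduction of that query shape (with made-up column names `body` and `meta` standing in for the real ones, since `column` is an SQL keyword; this assumes the sqlite3 build includes FTS5, as most modern ones do):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- fts5_table / datatable mirror the schema described above
    CREATE VIRTUAL TABLE fts5_table USING fts5(body);
    CREATE TABLE datatable(id INTEGER PRIMARY KEY, meta TEXT);
    INSERT INTO fts5_table(rowid, body) VALUES (1, 'a term here'), (2, 'no match');
    INSERT INTO datatable VALUES (1, 'meta1'), (2, 'meta2');
""")

# Full-text MATCH joined against the metadata table, ranked by relevance.
rows = conn.execute("""
    SELECT fts5_table.body, datatable.meta
    FROM fts5_table JOIN datatable ON datatable.id = fts5_table.rowid
    WHERE fts5_table MATCH 'term'
    ORDER BY fts5_table.rank LIMIT 10
""").fetchall()
print(rows)  # → [('a term here', 'meta1')]
```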

As I increase the number of threads, performance peaks at 4 threads, which
is less than the number of available cores (after that, performance
degrades). If I make my application use more threads (i.e. more requests
get processed in parallel), the performance degrades linearly.

All I am getting is ~40 requests per second (when we have 100 users making
queries in parallel), but I believe SQLite is capable of much more. I am
running Red Hat Linux on an Intel machine with 8 cores, 16 threads,
and 64GB of system memory. Disks are SSD.

Thanks,

Gabriele



Re: SQLite in memory

wmertens
Note that, as I understand it, if you use only a single connection for the
CherryPy server, all the threads on the server will run their queries
sequentially. Try using a database connection per thread?
