SQLite slow when lots of tables

classic Classic list List threaded Threaded
26 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: SQLite slow when lots of tables

wmertens
I always have to explain to people that there's no magic sauce that "real
databases" add versus SQLite.

SQLite uses the same techniques all databases use, and thanks to the
absence of a network later, you avoid a lot of latency, so it can actually
be faster.

(I do believe that SQLite optimizes for smaller disk footprint, so that may
be a tradeoff where other databases might gain speed)

Wout.

On Tue, Jan 29, 2019, 2:07 PM Rob Willett <[hidden email]
wrote:

> Millions of rows is not even large, never mind huge or very huge :)
>
> We have tables with hundreds of millions of rows, we got to billions of
> rows in a single table before we changed the logic. From memory we had
> 67GB for a single database and I reckon 60GB was one table. Not many
> issues at all with inserting or searching. One of our data mining
> queries searched the entire table and it still only took 90 secs, though
> all of the query used indexes.
>
> We only changed what we store as the management of a circa 60GB database
> was too much and we worked out we only needed 1% of it. We were using a
> virtual private server and we had issues with disk IO when we copied the
> database around using Unix cp. This wasn't a SQLite problem at all.
> However I have no doubt that SQLite was more than capable of handling
> even more data.
>
> Rob
>
> On 29 Jan 2019, at 11:00, [hidden email] wrote:
>
> > Dear all,
> >
> > what happens if I put all data in a single table and this table become
> > very huge (for example millions of rows)?
> >
> > Will I have same performace problems?
> >
> > Thanks.
> >
> >
> > Regards.
> >
> >>
> >>     Il 28 gennaio 2019 alle 17.28 Simon Slavin <[hidden email]>
> >> ha scritto:
> >>
> >>     On 28 Jan 2019, at 4:17pm, [hidden email] wrote:
> >>
> >>         > >
> >>>         when the number of the tables become huge (about 15000/20000
> >>> tables) the first DataBase reading query, after Database open, is
> >>> very slow (about 4sec.) while next reading operations are faster.
> >>>
> >>>         How can I speed up?
> >>>
> >>>     >
> >>     Put all the data in the same table.
> >>
> >>     At the moment, you pick a new table name each time you write
> >> another set of data to the database. Instead of that, create just one
> >> big table, and add an extra column to the columns which already exist
> >> called "dataset". In that you put the string you previously used as
> >> the table name.
> >>
> >>     SQL is not designed to have a variable number of tables in a
> >> database. All the optimization is done assuming that you will have a
> >> low number of tables, and rarely create or drop tables.
> >>
> >>     Simon.
> >>
> >>     _______________________________________________
> >>     sqlite-users mailing list
> >>     [hidden email]
> >>
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >>
> > _______________________________________________
> > sqlite-users mailing list
> > [hidden email]
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: SQLite slow when lots of tables

Keith Medcalf
On Tuesday, 29 January, 2019 16:28, Wout Mertens wrote:

>To: SQLite mailing list
>Subject: Re: [sqlite] SQLite slow when lots of tables
>
> I always have to explain to people that there's no magic sauce that
> "real databases" add versus SQLite.

> SQLite uses the same techniques all databases use, and thanks to the
> absence of a network later, you avoid a lot of latency, so it can
> actually be faster.

> (I do believe that SQLite optimizes for smaller disk footprint, so
> that may be a tradeoff where other databases might gain speed)

Well, there is a bit more to it than that.  The SQL interface and the capabilities may be similar but paying $$$$$ for a client/server database does get you something that you do not get with SQLite3:

 - Someone else wrote the networking layer, so if you want one, you do not have to do it yourself
 - C/S databases are designed to handle REAMS (ie, thousands) of remote clients doing simultaneous access, so the concurrency is much greater
 - C/S databases are parallelized and multithreaded, meaning that the single server process can use all the cores on your server simultaneously
 - C/S databases may be parallelized so that a single query uses all available CPU cores simultaneously
 - C/S databases may be NUMA and/or SYSPLEX enabled so that a single database can be scattered across multiple disparate machines yet queried as if there was only one database on one machine
 - C/S databases have more complicated data fetching methods (ie, they can use such things as hash tables, intersection sorts, and all sorts of other methods to fetch your data rather than just the nested loop retrieval supported by SQLite)
 - Some C/S databases may have extremely complex planners designed to take maximum advantage of all the above things simultaneously (at a cost, of course)

Of course, other than the big bucks you will spend on the C/S database, you will also have to spend big bucks to give it a big computer to run on.  That means lots of fast I/O and gobs of RAM and CPU.  Of course, in the grand scheme of things a 48 or 96 core x64 server with a terrabyte of RAM and a couple hundred (or thousand) terrabytes of SSD is not really that expensive.

So really, it depends on your needs and what you are willing to pay,  In many cases for a single user or even a small number of concurrent users on a single computer will likely achieve better performance (and cost efficiency) by using SQLite3.

On the other hand if the application is an line-of-business database for a Fortune 5 multinational corporation, you will probably choose the C/S database (particularly if you are running some <redacted> software like SAP ...).




_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: SQLite slow when lots of tables

Igor Korot
Hi,

On Tue, Jan 29, 2019 at 7:06 PM Keith Medcalf <[hidden email]> wrote:

>
> On Tuesday, 29 January, 2019 16:28, Wout Mertens wrote:
>
> >To: SQLite mailing list
> >Subject: Re: [sqlite] SQLite slow when lots of tables
> >
> > I always have to explain to people that there's no magic sauce that
> > "real databases" add versus SQLite.
>
> > SQLite uses the same techniques all databases use, and thanks to the
> > absence of a network later, you avoid a lot of latency, so it can
> > actually be faster.
>
> > (I do believe that SQLite optimizes for smaller disk footprint, so
> > that may be a tradeoff where other databases might gain speed)
>
> Well, there is a bit more to it than that.  The SQL interface and the capabilities may be similar but paying $$$$$ for a client/server database does get you something that you do not get with SQLite3:
>
>  - Someone else wrote the networking layer, so if you want one, you do not have to do it yourself
>  - C/S databases are designed to handle REAMS (ie, thousands) of remote clients doing simultaneous access, so the concurrency is much greater
>  - C/S databases are parallelized and multithreaded, meaning that the single server process can use all the cores on your server simultaneously
>  - C/S databases may be parallelized so that a single query uses all available CPU cores simultaneously
>  - C/S databases may be NUMA and/or SYSPLEX enabled so that a single database can be scattered across multiple disparate machines yet queried as if there was only one database on one machine
>  - C/S databases have more complicated data fetching methods (ie, they can use such things as hash tables, intersection sorts, and all sorts of other methods to fetch your data rather than just the nested loop retrieval supported by SQLite)
>  - Some C/S databases may have extremely complex planners designed to take maximum advantage of all the above things simultaneously (at a cost, of course)
>
> Of course, other than the big bucks you will spend on the C/S database, you will also have to spend big bucks to give it a big computer to run on.  That means lots of fast I/O and gobs of RAM and CPU.  Of course, in the grand scheme of things a 48 or 96 core x64 server with a terrabyte of RAM and a couple hundred (or thousand) terrabytes of SSD is not really that expensive.
>
> So really, it depends on your needs and what you are willing to pay,  In many cases for a single user or even a small number of concurrent users on a single computer will likely achieve better performance (and cost efficiency) by using SQLite3.
>
> On the other hand if the application is an line-of-business database for a Fortune 5 multinational corporation, you will probably choose the C/S database (particularly if you are running some <redacted> software like SAP ...).

All this is not necessary true.
You can install mySQL/MariaDB for free and use it for your needs.
I believe that if you pay to Oracle/MariaDB Foundation, it will be
just for support. (I may be wrong though).

You can also install SAP ASE (former Sybase) and use it in production
with up to 4 connection for free.

I also think you can install Oracle on you box for free, but I'm not
sure about licenses for the big Oracle.

So, there are options even for the "Big DBMS" that are free.

Thank you.

>
>
>
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: SQLite slow when lots of tables

Peter da Silva-2
On Tue, Jan 29, 2019, 7:46 PM Igor Korot <[hidden email] wrote:

> You can install mySQL/MariaDB for free and use it for your needs.
> I believe that if you pay to Oracle/MariaDB Foundation, it will be
> just for support. (I may be wrong though).
>

Or better, PostgreSQL. We have a system that uses SQLite as an optional
cache for PostgreSQL and takes advantage of the similarity of their SQL
variants to get the best of both worlds. I modified the Pgtcl library to
even get the variable binding for PostgreSQL to match SQLite syntax and
semantics.

>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: SQLite slow when lots of tables

Igor Korot
Peter,

On Tue, Jan 29, 2019 at 7:53 PM Peter da Silva <[hidden email]> wrote:

>
> On Tue, Jan 29, 2019, 7:46 PM Igor Korot <[hidden email] wrote:
>
> > You can install mySQL/MariaDB for free and use it for your needs.
> > I believe that if you pay to Oracle/MariaDB Foundation, it will be
> > just for support. (I may be wrong though).
> >
>
> Or better, PostgreSQL. We have a system that uses SQLite as an optional
> cache for PostgreSQL and takes advantage of the similarity of their SQL
> variants to get the best of both worlds. I modified the Pgtcl library to
> even get the variable binding for PostgreSQL to match SQLite syntax and
> semantics.

That is correct.
And with PG you will get extra security as a bonus, as I think it is
still considered
a very secure DBMS.

Thank you.

>
> >
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: SQLite slow when lots of tables

wmertens
In reply to this post by Keith Medcalf
 Ah yes very true, it's easy to forget ones biases - I make single threaded
web services with mostly-read access. This is a great use case for sqlite
(provided you solve the data distribution problem).

Wout.

On Wed, Jan 30, 2019, 2:06 AM Keith Medcalf <[hidden email] wrote:

> On Tuesday, 29 January, 2019 16:28, Wout Mertens wrote:
>
> >To: SQLite mailing list
> >Subject: Re: [sqlite] SQLite slow when lots of tables
> >
> > I always have to explain to people that there's no magic sauce that
> > "real databases" add versus SQLite.
>
> > SQLite uses the same techniques all databases use, and thanks to the
> > absence of a network later, you avoid a lot of latency, so it can
> > actually be faster.
>
> > (I do believe that SQLite optimizes for smaller disk footprint, so
> > that may be a tradeoff where other databases might gain speed)
>
> Well, there is a bit more to it than that.  The SQL interface and the
> capabilities may be similar but paying $$$$$ for a client/server database
> does get you something that you do not get with SQLite3:
>
>  - Someone else wrote the networking layer, so if you want one, you do not
> have to do it yourself
>  - C/S databases are designed to handle REAMS (ie, thousands) of remote
> clients doing simultaneous access, so the concurrency is much greater
>  - C/S databases are parallelized and multithreaded, meaning that the
> single server process can use all the cores on your server simultaneously
>  - C/S databases may be parallelized so that a single query uses all
> available CPU cores simultaneously
>  - C/S databases may be NUMA and/or SYSPLEX enabled so that a single
> database can be scattered across multiple disparate machines yet queried as
> if there was only one database on one machine
>  - C/S databases have more complicated data fetching methods (ie, they can
> use such things as hash tables, intersection sorts, and all sorts of other
> methods to fetch your data rather than just the nested loop retrieval
> supported by SQLite)
>  - Some C/S databases may have extremely complex planners designed to take
> maximum advantage of all the above things simultaneously (at a cost, of
> course)
>
> Of course, other than the big bucks you will spend on the C/S database,
> you will also have to spend big bucks to give it a big computer to run on.
> That means lots of fast I/O and gobs of RAM and CPU.  Of course, in the
> grand scheme of things a 48 or 96 core x64 server with a terrabyte of RAM
> and a couple hundred (or thousand) terrabytes of SSD is not really that
> expensive.
>
> So really, it depends on your needs and what you are willing to pay,  In
> many cases for a single user or even a small number of concurrent users on
> a single computer will likely achieve better performance (and cost
> efficiency) by using SQLite3.
>
> On the other hand if the application is an line-of-business database for a
> Fortune 5 multinational corporation, you will probably choose the C/S
> database (particularly if you are running some <redacted> software like SAP
> ...).
>
>
>
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
12