Slowness with in-memory database


Slowness with in-memory database

CARTER-HITCHIN, David, GBM
Hi,

Really hope someone out there can help me with this.

I've written a C++ cache class which extracts ad-hoc data from a Sybase
database and inserts it into an in-memory SQLite database.  This is working
fine but is very slow during the insert data phase.  To give some
quantitative idea of what I mean by slow:

a) If I run the extract with the Sybase isql command line utility and dump
that to a file, it takes 2-3 minutes (~800,000 rows, 100MB of data).
b) If I run my cache class with the insert into SQLite commented out, it
also takes 2-3 minutes.
c) Uncommenting the insert statement, the extract/insert takes ~45 minutes!

Something is misconfigured as it takes ~20 times as long to insert into an
in-memory db, as it does to select from Sybase and store to a file.  I had a
look through the docs and had a google, and came up with some PRAGMA
options, so I tried:

PRAGMA synchronous = OFF
PRAGMA temp_store  = MEMORY

Sadly, these made no difference.  Does anyone have any idea what is causing
this?  Code is running on Solaris 8 on a Sun workstation with 1GB of memory
and two 450MHz CPUs; SQLite version 3.2.7.  I'm opening the database and
calling sqlite3_exec in a pretty standard way:

// Open the database:

    sqlite3 *cache;
    char *zErrMsg = 0;
    int rc = sqlite3_open(":memory:", &cache);
    if ( rc )
    {
      fprintf(stderr, "Can't open database: %s\n", sqlite3_errmsg(cache));
      sqlite3_close(cache);
      exit(1);
    }

    // set the database params - statements passed to sqlite3_exec
    // must be separated by semicolons
    string sql = "PRAGMA synchronous = OFF;\n";
    sql += "PRAGMA temp_store = MEMORY;\n";

    rc = sqlite3_exec(cache, sql.c_str(), NULL, 0, &zErrMsg);
    if ( rc != SQLITE_OK )
    {
      fprintf(stderr, "SQL error: %s\nSQL text: %s", zErrMsg, sql.c_str());
      sqlite3_free(zErrMsg);
    }

// Create some tables - called once per table:

    rc = sqlite3_exec(cache, create_sql.c_str(), NULL, 0, &zErrMsg);
    if ( rc != SQLITE_OK )
    {
      fprintf(stderr, "SQL error: %s\nSQL text: %s\n", zErrMsg, create_sql.c_str());
      sqlite3_free(zErrMsg);
    }

// Then the insert - called 800,000 times for the extract that takes 45 minutes:

    rc = sqlite3_exec(cache, populate_sql.c_str(), NULL, 0, &zErrMsg);
    if ( rc != SQLITE_OK )
    {
      fprintf(stderr, "SQL error: %s\nSQL text: %s\n\n", zErrMsg, populate_sql.c_str());
      sqlite3_free(zErrMsg);
    }

Any ideas, however slight, will be much appreciated!

Many thanks,
 

David Carter-Hitchin.


***********************************************************************************
The Royal Bank of Scotland plc. Registered in Scotland No 90312.
Registered Office: 36 St Andrew Square, Edinburgh EH2 2YB.
Authorised and regulated by the Financial Services Authority

This e-mail message is confidential and for use by the addressee only. If the
message is received by anyone other than the addressee, please return the
message to the sender by replying to it and then delete the message from your
computer. Internet e-mails are not necessarily secure. The Royal Bank of
Scotland plc does not accept responsibility for changes made to this message
after it was sent.

Whilst all reasonable care has been taken to avoid the transmission of
viruses, it is the responsibility of the recipient to ensure that the onward
transmission, opening or use of this message and any attachments will not
adversely affect its systems or data. No responsibility is accepted by The
Royal Bank of Scotland plc in this regard and the recipient should carry out
such virus and other checks as it considers appropriate.

Visit our websites at:
http://www.rbs.co.uk/CBFM
http://www.rbsmarkets.com
***********************************************************************************


RE: Slowness with in-memory database

Griggs, Donald
Regarding: ... Something is misconfigured as it takes ~20 times as long to
insert into an in-memory db, as it does to select from Sybase and store to a
file.  ...


I'm not sure you're using transactions.  If not, BEGIN a transaction before
starting the loop of INSERTs and END it when done.   This may increase your
speed tremendously.

http://www.sqlite.org/lang_transaction.html

(Depending on your needs, you may possibly decide you want to complete a
transaction for every, say, 10,000 inserts instead of waiting for all
800,000.)

If using transactions still leaves you hungry for speed, you might decide to
reinstate the
   PRAGMA synchronous = OFF
that you mentioned, perhaps turning synchronous back on after the tables are
built.

If you need indexes, adding them after the tables are built can improve your
overall time.



Donald Griggs

Opinions are not necessarily those of Misys Healthcare Systems nor its board
of directors.


-----Original Message-----
From: CARTER-HITCHIN, David, FM [mailto:[hidden email]]
Sent: Wednesday, December 21, 2005 9:32 AM
To: '[hidden email]'
Subject: [sqlite] Slowness with in-memory database


Re: Slowness with in-memory database

D. Richard Hipp
In reply to this post by CARTER-HITCHIN, David, GBM
"CARTER-HITCHIN, David, FM" <[hidden email]> wrote:

> Something is misconfigured as it takes ~20 times as long to insert into an
> in-memory db, as it does to select from Sybase and store to a file.
>
> Any ideas, however slight, will be much appreciated!
>

Idea 1:  You are thrashing.  What OS are you running, and how much
         memory is installed in the machine?

Idea 2:  Might you be doing an insert trigger that is taking a long time?

Idea 3:  You have about 100 different indices on the table that all have
         to be updated with each insert.

FWIW, SQLite normally does about 50000 inserts per second, not the 300
you are seeing.  So something is clearly wrong.
--
D. Richard Hipp <[hidden email]>


Re: Slowness with in-memory database

Jay Sprenkle
Are the inserts inside a transaction?
If you're doing a lot of identical inserts you can prepare your
statement so it's parsed once and then just execute it repeatedly.

Re: Slowness with in-memory database

D. Richard Hipp
In reply to this post by CARTER-HITCHIN, David, GBM
"Griggs, Donald" <[hidden email]> wrote:
> I'm not sure you're using transactions.  If not, BEGIN a transaction before
> starting the loop of INSERTs and END it when done.  This may increase your
> speed tremendously.
>

For an in-memory database, using transactions should not affect the
speed at all.  Here's why:

SQLite easily handles 50000 inserts per second, but only about 60
transactions per second.  Normally, SQLite puts each insert into
a separate transaction, which then limits you to about 60 inserts
per second.  By using BEGIN...COMMIT you can group multiple inserts
into the same transaction thus increasing your insert rate.

Transactions are slow due to limitations of computer disk hardware.
When SQLite writes to the disk, it has to stop and wait at a couple of
places for all of the data to actually be written to the disk
surface.  This is necessary in case a power failure or OS crash occurs -
so that the data can be recovered.  It is this stopping and waiting that
takes so long.  If we didn't need to wait for the disk platter to
spin underneath the head, SQLite could easily handle tens of thousands
of transactions per second.  But because each transaction requires at
least two revolutions of the disk platter, SQLite is limited to about
60 transactions per second.  (Do the math:  7200 RPM disk drive means
120 revolutions per second.  2 complete revolutions per transaction ->
60 transactions per second.)

But there are no such restrictions on an in-memory database.  With an
in-memory database we never have to wait for the platter to spin under
the write head.  So transactions are very, very fast.  Hence grouping
multiple inserts into a single transaction does not really buy you
anything when using an in-memory database.  Only databases that are
stored on spinning media are helped by that trick.

--
D. Richard Hipp <[hidden email]>


RE: Slowness with in-memory database

CARTER-HITCHIN, David, GBM
In reply to this post by CARTER-HITCHIN, David, GBM
Hi Richard et al.,

1. It's not thrashing - I've watched the process grow with prstat -cp `pgrep
gti` and it doesn't top 100MB - as I said I've got 1GB of RAM in my
workstation, with 2.5GB of free swapspace.

2/3.  There are no triggers or indexes on the table - they are very simple
and vanilla tables, with simple datatypes - just int's and varchar's.

Donald Griggs and Jay Sprenkle have suggested transactions, which I had not
been using.  I've wrapped all 800,000 iterations into one transaction, and
that has reduced the time down to 6 minutes - thanks Donald and Jay!!
Problem fixed - it's now taking as much time to ship the data into memory as
it does to ship it out of Sybase - I'm happy enough about that.  I guess it
could be optimised more, but this is ok.

I've just read your next post about transactions vis-à-vis in-memory db's -
well, I see where you're coming from, but the evidence is clearly in favour
of transactions here.  My other thought is that opening the db with a
filename of ":memory:" is perhaps not 100% correct - that it's somehow
sneakily going off to disk somewhere - can anyone say?

Thanks,

David Carter-Hitchin.
--
Royal Bank of Scotland
Interest Rate Derivatives IT
135 Bishopsgate
LONDON EC2M 3TP

Tel: +44 (0) 207 085 1088


> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]]
> Sent: 21 December 2005 14:54
> To: [hidden email]
> Subject: Re: [sqlite] Slowness with in-memory database




Re: Slowness with in-memory database

Jay Sprenkle
> I've wrapped all 800,000 iterations into one transaction, and that has
> reduced the time down to 6 minutes.
>
> I've just read your next post about transactions vis-à-vis in-memory db's -
> well, I see where you're coming from, but the evidence is clearly in favour
> of transactions here.

I guess there's more overhead in doing individual transactions, even in
memory, than DRH thought. It was very interesting to learn what was
happening under the covers. Thanks, Dr. H.