Is mutliple-execute-one-commit slower than multiple single-execute-single-commit?

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Is mutliple-execute-one-commit slower than multiple single-execute-single-commit?

Peng Yu
Hi,

I have two python programs using sqlite3. They function the same,
except the following.

In the first, execute() is called in batches and then commit() is
called following them. In the second, commit() is called after each
execute(). It seems that the second case is faster (I can not separate
my code in a self-contained test case to show here).

This is counterintuitive. I thought the first should be faster.

Is it expected that the 2nd case should be slightly faster?

--
Regards,
Peng
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] Is mutliple-execute-one-commit slower than multiple single-execute-single-commit?

Hick Gunter
If you could provide more information then maybe someone can suggest a reason or even a solution for the effect you are seeing. Some of the following may be helpful.

What schema are you using?
Which journal mode is your database running in?
What kind of statements are executed?
How are you controlling transactions?
How are you measuring speed?
Can you replicate the problem by running the statements in the sqlite shell?

-----Urspr√ľngliche Nachricht-----
Von: sqlite-users [mailto:[hidden email]] Im Auftrag von Peng Yu
Gesendet: Mittwoch, 29. Januar 2020 07:42
An: SQLite mailing list <[hidden email]>
Betreff: [EXTERNAL] [sqlite] Is mutliple-execute-one-commit slower than multiple single-execute-single-commit?

Hi,

I have two python programs using sqlite3. They function the same, except the following.

In the first, execute() is called in batches and then commit() is called following them. In the second, commit() is called after each execute(). It seems that the second case is faster (I can not separate my code in a self-contained test case to show here).

This is counterintuitive. I thought the first should be faster.

Is it expected that the 2nd case should be slightly faster?

--
Regards,
Peng
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick | Software Engineer | Scientific Games International GmbH | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Is mutliple-execute-one-commit slower than multiple single-execute-single-commit?

Keith Medcalf
In reply to this post by Peng Yu

On Tuesday, 28 January, 2020 23:42, Peng Yu <[hidden email]> wrote:

>I have two python programs using sqlite3. They function the same,
>except the following.

I presume this means you are using the standard (as in included with the standard Python distribution) sqlite3 module?  There are other ways to use SQLite3 (the database) from Python, the sqlite3 (pysqlite2) wrapper being only one of them.  What are the statements being executed?  Are you using "magical" mode for the wrapper (what is the isolation_level set to on the sqlite3.connect call).

>In the first, execute() is called in batches and then commit() is
>called following them. In the second, commit() is called after each
>execute(). It seems that the second case is faster (I can not separate
>my code in a self-contained test case to show here).

>This is counterintuitive. I thought the first should be faster.

>Is it expected that the 2nd case should be slightly faster?

The first case (batching multiple inserts in a single transaction) should be faster.  On the below test disk I/O rate exceeded 100 MB/s for wal mode.

-----//----- insertspeed.py -----//-----
import sqlite3
import sys
import time

db = sqlite3.connect('test.db')

db.executescript('create table if not exists x(x)')

records = 1000000

print(sys.version)
print('sqlite3 wrapper version', sqlite3.version, 'using library version', sqlite3.sqlite_version)
print()
for mode in ['wal', 'delete']:
    print('Database mode', mode)
    print('-------- -------- --------')
    print(' Records    Batch  Seconds')
    print('-------- -------- --------')
    for batchsize in [1000000, 100000, 10000, 1000, 100, 10, 1]:
        db.executescript('pragma journal_mode=delete')
        db.executescript('delete from x')
        db.executescript('vacuum')
        db.execute('pragma journal_mode=%s' % mode)
        t = time.time()
        for i in range(records):
            db.execute('insert into x values (?)', (i,))
            if i % batchsize == 0:
                db.commit()
        db.commit()
        print('%8d %8d %8.3f' % (records, batchsize, time.time() - t))
    print('-------- -------- --------')

-----//--------------------------//-----

3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)]
sqlite3 wrapper version 2.6.0 using library version 3.31.1

Database mode wal
-------- -------- --------
 Records    Batch  Seconds
-------- -------- --------
 1000000  1000000    2.255
 1000000   100000    2.348
 1000000    10000    2.344
 1000000     1000    2.453
 1000000      100    3.355
 1000000       10    8.856
 1000000        1   64.167
-------- -------- --------
Database mode delete
-------- -------- --------
 Records    Batch  Seconds
-------- -------- --------
 1000000  1000000    2.215
 1000000   100000    2.460
 1000000    10000    3.320
 1000000     1000   12.099
 1000000      100   96.896

I couldn't wait for the smaller batches to complete in journal_mode delete.  I would expect them to take a looong time ...

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Is mutliple-execute-one-commit slower than multiple single-execute-single-commit?

Richard Damon
In reply to this post by Peng Yu
On 1/29/20 1:42 AM, Peng Yu wrote:

> Hi,
>
> I have two python programs using sqlite3. They function the same,
> except the following.
>
> In the first, execute() is called in batches and then commit() is
> called following them. In the second, commit() is called after each
> execute(). It seems that the second case is faster (I can not separate
> my code in a self-contained test case to show here).
>
> This is counterintuitive. I thought the first should be faster.
>
> Is it expected that the 2nd case should be slightly faster?
>
One thought is that if the execute creates a lot of data for the
database, then multiple executes could exceed the memory cache, causing
it to spill to the database, and then the commit needs to read that back
and put it into the right place, while a commit after each execute keeps
everything in memory until the commit writes it to the database.

--
Richard Damon

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users