THREADSAFE (1, SERIALIZED) vs (2, MULTITHREAD)

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

THREADSAFE (1, SERIALIZED) vs (2, MULTITHREAD)

test user
Hello,

Whats the difference between these two options for the SQLITE_THREADSAFE
compile time flag?

From the docs:

(1, SERIALIZED)

   - Safe for use in a multithreaded environment
   - Enables all mutexes including the recursive mutexes on database
   connection and prepared statement objects.
   - The SQLite library will itself *serialize access to database
   connections* and prepared statements


(2, MULTITHREAD)

   - Can be used in a multithreaded program so long as no two threads
   attempt to use the same database connection (or any prepared statements
   derived from that database connection) at the same time.



When SQLITE_THREADSAFE=(1, SERIALIZED), how are the reads/writes queued? Is
it per connection, file, or process?

What happens when a request is in the queue, does it just wait until it can
be actioned, or return SQLITE_BUSY?

On a 8 core machine, how would I get optimal read throughput from a single
database file?

If I have many database connections to the same file, can they all read
concurrently on all cores?


Thanks,
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: THREADSAFE (1, SERIALIZED) vs (2, MULTITHREAD)

Keith Medcalf

On Monday, 5 August, 2019 17:23, test user <[hidden email]> wrote:

>Whats the difference between these two options for the
>SQLITE_THREADSAFE compile time flag?

>From the docs:

>(1, SERIALIZED)
>(2, MULTITHREAD)

The SQLite3 library code is not multiply re-entrant, but is only singly-entrant on each connection.  (It is, however, multiply entrant provided that those entrances each are on a different connection)  This is because the SQLite3 connection pointer is a pointer to a structure that contains information pertaining to that connection, and a statement (or blob handle) is a sub-construct of the parent connection from which it was made.  Therefore, you may only make ONE CALL at a time into the SQLite3 library per connection (or sub-construct thereof), because those calls will mutate data associated with the connection.  It does not matter from whence that call originated (as in thread) merely that there can only be one active at a time per connection.  Period.

The difference between SINGLETHREAD, SERIALIZED and MULTITHREAD is in how this is guaranteed.  

When the THREADSAFE parameter is set to SEARALIZED this means that the SQLite3 library itself will protect the connection data with a mutex to make sure that you do not violate this constraint by "serializing" simultaneous calls into the library on a single connection.  This means that if you make two simultaneous calls on the same connection (from different threads, for example), one of them will get the mutex and proceed immediately, and the other one will have to wait until that mutex is released, thus ensuring that the single-entrance requirement is met.

When the THREADSAFE parameter is set to MULTITHREAD this means that the SQLite3 library WILL NOT protect the connection data with a mutex to protect you from violating this constraint, and that it is entirely and completely your responsibility to ensure that you do not violate the single-entrance (per connection) requirement.  If you do violate the single entrance requirement, you may corrupt the database, the library, the computer, the world, and cause the end of the universe.

If and only if you are absolutely sure that you are complying with the single-entrance requirement THEN you can change the THREADSAFE mode from SERIALIZED to MULTITHREAD, which will save you a few nanoseconds per call into the library because the mutexes will no longer be checked.

When the THREADSAFE is set to either SERIALIZED or MULTITHREADED the code to handle these mutexes is compiled into the library.  You can switch between these two modes at runtime.

If you are only using one thread then you can set the THREADSAFE parameter to SINGLETHREAD which will cause the mutex code to be omitted entirely, saving you another nanosecond per call since you will not even need to "jump around" the mutex checking code.

When you compile the library with THREADSAFE set to SINGLETHREAD then the mutex protection code IS NOT compiled into the library and therefore you cannot turn it on or off at runtime, since it does not exist.

You are free to use as many threads as you like to call into the SQLite3 library no matter what THREADSAFE mode is set at compile or runtime if you are absolutely sure that you are complying with the single-entrance per connection requirement.

The THREADSAFE parameter merely sets the level of suspenders that you wish to have to protect against shoddy programming or non-deterministic languages (ie, those that do things such as asynchronous garbage collection or destructors).

SINGLETHREAD means leave out the protection code altogether
SERIALIZED means to include the code and use a mutex to protect against concurrent entry on a single connection
MULTITHREAD means to include the code but "jump around it" so that it is not used

>When SQLITE_THREADSAFE=(1, SERIALIZED), how are the reads/writes
>queued? Is it per connection, file, or process?

SQLITE_THREADSAFE has absolutely no impact on anything OTHER THAN protecting the connection data against multiple concurrent modification.

I don't know exactly what you mean by "how are read/writes queued".  If you mean I/O, this is an Operating System issue and has nothing to do with SQLite3.  When the code requests to read data it issues a read to the Operating System.  When it wants to write, it issues a write to the Operating System.  How the Operating System carries out those operations is not within the purview of a user program.

>What happens when a request is in the queue, does it just wait until
>it can be actioned, or return SQLITE_BUSY?

What do you mean by "request is in the quere"?  There is no queue.

>On a 8 core machine, how would I get optimal read throughput from a
>single database file?

That depends on the speed of the I/O device and the speed of a core, and what processing you are doing between reads, what you are reading, the capabilities of the operating system, and the multiprogramming ratio achieved by your code.

>If I have many database connections to the same file, can they all
>read concurrently on all cores?

Again this is an Operating System issue and is not a user program issue.  However, generally speaking there is only one channel (physical set of wires) connecting the storage device containing the "database file" to the central processor of the computer.  This means that the I/O channel can only perform one operation at a time.  The Operating System may make it appear that multiple things are happening at once, but in reality they are not.  This is really an Operating System issue.  The SQLite3 library issues a read request to the operating system when it wants to read data, and a write request when it wants to write.  How your user program utilizes "the cores" is not under the control of SQLite3.  How the OS deals with read and write requests is an Operating System issue and is not within the control of a user program (nor of a library such as SQLite3 which merely requests the Operating System to perform the actual I/O).

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.




_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: THREADSAFE (1, SERIALIZED) vs (2, MULTITHREAD)

Adrian Ho
On 6/8/19 8:55 AM, Keith Medcalf wrote:
> The THREADSAFE parameter merely sets the level of suspenders that you
> wish to have to protect against shoddy programming or
> non-deterministic languages (ie, those that do things such as
> asynchronous garbage collection or destructors).

Or one of the fundamental constants of the development universe:
*changing requirements* that violate long-forgotten assumptions like
"there will be only one".

Been there, stepped on the Lego, got the bloodied T-shirt.

--
Best Regards,
Adrian

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: THREADSAFE (1, SERIALIZED) vs (2, MULTITHREAD)

test user
In reply to this post by Keith Medcalf
Thanks,

When SQLITE_THREADSAFE=(1, SERIALIZED), how are the reads/writes
queued? Is it per connection, file, or process?


Quote: I don't know exactly what you mean by "how are read/writes queued".

Quote: if you make two simultaneous calls on the same connection (from
different threads, for example), one of them will get the mutex and proceed
immediately, and the other one will have to wait until that mutex is
released



What I mean by "queue" is that if one request is waiting until a mutex is
released, it is in an implicit queue for that resource.

I wanted to know how this queue works:

- Does the second request just take longer to return from the FFI call
whilst waiting for the mutex?

- Or does the FFI return SQLITE_BUSY and expect the caller to call again in
the future?


In other words: How should I handle this in my library that uses the FFI?



So in summary, there is no difference in the multi threaded performance
that can be gained between SERIALIZED and MULTITHREADED (aside from the
mutex overhead)? The only difference is SERIALIZED enforces correct usage
at a small overhead cost?


So for example, if I had:

- 8 cores
- 8 threads
- 8 db connections, 1 per thread
- 1 database file
- x amount of read requests per second

If I were to load balance x requests over each of the 8 threads, all the
reads would complete concurrently when in SERIALIZED mode, with WAL enabled?

Assume other bottlenecks in the system are not an issue (like disk speed).

Im just trying to confirm that SERIALIZED will not queue up requests for (1
file, multiple connections to that file, read only requests).

On Tue, Aug 6, 2019 at 1:55 AM Keith Medcalf <[hidden email]> wrote:

>
> On Monday, 5 August, 2019 17:23, test user <[hidden email]>
> wrote:
>
> >Whats the difference between these two options for the
> >SQLITE_THREADSAFE compile time flag?
>
> >From the docs:
>
> >(1, SERIALIZED)
> >(2, MULTITHREAD)
>
> The SQLite3 library code is not multiply re-entrant, but is only
> singly-entrant on each connection.  (It is, however, multiply entrant
> provided that those entrances each are on a different connection)  This is
> because the SQLite3 connection pointer is a pointer to a structure that
> contains information pertaining to that connection, and a statement (or
> blob handle) is a sub-construct of the parent connection from which it was
> made.  Therefore, you may only make ONE CALL at a time into the SQLite3
> library per connection (or sub-construct thereof), because those calls will
> mutate data associated with the connection.  It does not matter from whence
> that call originated (as in thread) merely that there can only be one
> active at a time per connection.  Period.
>
> The difference between SINGLETHREAD, SERIALIZED and MULTITHREAD is in how
> this is guaranteed.
>
> When the THREADSAFE parameter is set to SEARALIZED this means that the
> SQLite3 library itself will protect the connection data with a mutex to
> make sure that you do not violate this constraint by "serializing"
> simultaneous calls into the library on a single connection.  This means
> that if you make two simultaneous calls on the same connection (from
> different threads, for example), one of them will get the mutex and proceed
> immediately, and the other one will have to wait until that mutex is
> released, thus ensuring that the single-entrance requirement is met.
>
> When the THREADSAFE parameter is set to MULTITHREAD this means that the
> SQLite3 library WILL NOT protect the connection data with a mutex to
> protect you from violating this constraint, and that it is entirely and
> completely your responsibility to ensure that you do not violate the
> single-entrance (per connection) requirement.  If you do violate the single
> entrance requirement, you may corrupt the database, the library, the
> computer, the world, and cause the end of the universe.
>
> If and only if you are absolutely sure that you are complying with the
> single-entrance requirement THEN you can change the THREADSAFE mode from
> SERIALIZED to MULTITHREAD, which will save you a few nanoseconds per call
> into the library because the mutexes will no longer be checked.
>
> When the THREADSAFE is set to either SERIALIZED or MULTITHREADED the code
> to handle these mutexes is compiled into the library.  You can switch
> between these two modes at runtime.
>
> If you are only using one thread then you can set the THREADSAFE parameter
> to SINGLETHREAD which will cause the mutex code to be omitted entirely,
> saving you another nanosecond per call since you will not even need to
> "jump around" the mutex checking code.
>
> When you compile the library with THREADSAFE set to SINGLETHREAD then the
> mutex protection code IS NOT compiled into the library and therefore you
> cannot turn it on or off at runtime, since it does not exist.
>
> You are free to use as many threads as you like to call into the SQLite3
> library no matter what THREADSAFE mode is set at compile or runtime if you
> are absolutely sure that you are complying with the single-entrance per
> connection requirement.
>
> The THREADSAFE parameter merely sets the level of suspenders that you wish
> to have to protect against shoddy programming or non-deterministic
> languages (ie, those that do things such as asynchronous garbage collection
> or destructors).
>
> SINGLETHREAD means leave out the protection code altogether
> SERIALIZED means to include the code and use a mutex to protect against
> concurrent entry on a single connection
> MULTITHREAD means to include the code but "jump around it" so that it is
> not used
>
> >When SQLITE_THREADSAFE=(1, SERIALIZED), how are the reads/writes
> >queued? Is it per connection, file, or process?
>
> SQLITE_THREADSAFE has absolutely no impact on anything OTHER THAN
> protecting the connection data against multiple concurrent modification.
>
> I don't know exactly what you mean by "how are read/writes queued".  If
> you mean I/O, this is an Operating System issue and has nothing to do with
> SQLite3.  When the code requests to read data it issues a read to the
> Operating System.  When it wants to write, it issues a write to the
> Operating System.  How the Operating System carries out those operations is
> not within the purview of a user program.
>
> >What happens when a request is in the queue, does it just wait until
> >it can be actioned, or return SQLITE_BUSY?
>
> What do you mean by "request is in the quere"?  There is no queue.
>
> >On a 8 core machine, how would I get optimal read throughput from a
> >single database file?
>
> That depends on the speed of the I/O device and the speed of a core, and
> what processing you are doing between reads, what you are reading, the
> capabilities of the operating system, and the multiprogramming ratio
> achieved by your code.
>
> >If I have many database connections to the same file, can they all
> >read concurrently on all cores?
>
> Again this is an Operating System issue and is not a user program issue.
> However, generally speaking there is only one channel (physical set of
> wires) connecting the storage device containing the "database file" to the
> central processor of the computer.  This means that the I/O channel can
> only perform one operation at a time.  The Operating System may make it
> appear that multiple things are happening at once, but in reality they are
> not.  This is really an Operating System issue.  The SQLite3 library issues
> a read request to the operating system when it wants to read data, and a
> write request when it wants to write.  How your user program utilizes "the
> cores" is not under the control of SQLite3.  How the OS deals with read and
> write requests is an Operating System issue and is not within the control
> of a user program (nor of a library such as SQLite3 which merely requests
> the Operating System to perform the actual I/O).
>
> --
> The fact that there's a Highway to Hell but only a Stairway to Heaven says
> a lot about anticipated traffic volume.
>
>
>
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: THREADSAFE (1, SERIALIZED) vs (2, MULTITHREAD)

Olivier Mascia
> Le 6 août 2019 à 12:34, test user <[hidden email]> a écrit :
>
> So for example, if I had:
>
> - 8 cores
> - 8 threads
> - 8 db connections, 1 per thread
> - 1 database file
> - x amount of read requests per second
>
> If I were to load balance x requests over each of the 8 threads, all the
> reads would complete concurrently when in SERIALIZED mode, with WAL enabled?

Yes.

Also, if your software can guarantee, by its own design, that a db connection (and its dependencies, aka statements) will never ever be shared by multiple threads, (or other techniques like fibers in Windows for instance), you don't need the per connection mutex that SERIALIZED brings to you, and you can use MULTITHREADED.  But you can't run in SINGLETHREAD mode from a multi-threaded process.

> Im just trying to confirm that SERIALIZED will not queue up requests for (1
> file, multiple connections to that file, read only requests).

The SERIALIZATION in question here is a per connection (- data structures) matter. It isn't related to the concurrency of accesses to the db through distinct connections. But you do need WAL to achieve multiple readers concurrency, each in a BEGIN [DEFERRED] [TRANSACTION] which you will take care not to upgrade inadvertantly to a writer (by executing any statement which would imply writing to the database).  Each of the readers will see a stable view of the database content, as of their first read statement after BEGIN. And this will last until they COMMIT (or ROLLBACK as they are anyway supposed to be readers). If not using explicit transactions, each statement will run independently in an auto BEGIN [DEFERRED] / auto COMMIT transaction.

—  
Best Regards, Meilleures salutations, Met vriendelijke groeten, Mit besten Grüßen,
Olivier Mascia


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: THREADSAFE (1, SERIALIZED) vs (2, MULTITHREAD)

Simon Slavin-3
In reply to this post by test user
On 6 Aug 2019, at 11:34am, test user <[hidden email]> wrote:

> - Does the second request just take longer to return from the FFI call whilst waiting for the mutex?

Yes.  You do not need to build backoff-and-retry code into your own software.  SQLite does it for you.  For every database connection you open, set a timeout using this:

<https://sqlite.org/c3ref/busy_timeout.html>

A setting of perhaps 30 seconds would be appropriate.  The SQLite API will use the amount of time you set to decide how long to keep retrying access before it gives up, assumes that the database is permanently locked, and returns an error code of SQLITE_BUSY.

> So for example, if I had:
>
> - 8 cores
> - 8 threads
> - 8 db connections, 1 per thread
> - 1 database file
> - x amount of read requests per second
>
> If I were to load balance x requests over each of the 8 threads, all the reads would complete concurrently when in SERIALIZED mode, with WAL enabled?

Please don't try to parallelize a part of your program which exists mostly to read or write to a SQLite database.  SQLite is not a client/server system.  Access to the database file on disk is a bottleneck.  If you have 8 threads which spend most of their time executing SQLite calls, all they will do is constantly block one-another.  You will get a far simpler program, far simpler to debug, if you do all your access through one thread using one connection.

What you /might/ want to do is have one thread which reads data from the database and keeps an in-memory buffer full of, say, the next thousand rows of data.  Then other simultaneous processing threads can read their data directly from memory.

However, once again this may not be appropriate.  You may find that your software executes at acceptable speed without any parallelization at all.  I suggest you try this before devoting your time to complicated programming.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: THREADSAFE (1, SERIALIZED) vs (2, MULTITHREAD)

David Raymond
In reply to this post by Olivier Mascia
>"But you do need WAL to achieve multiple readers concurrency..."

Nope, you can have concurrent readers with rollback journal mode. You just can't have anyone writing while they're all reading.

(Or I may just be misunderstanding what you meant)

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: THREADSAFE (1, SERIALIZED) vs (2, MULTITHREAD)

Olivier Mascia
> Le 6 août 2019 à 15:34, David Raymond <[hidden email]> a écrit :
>
>> "But you do need WAL to achieve multiple readers concurrency..."
>
> Nope, you can have concurrent readers with rollback journal mode. You just can't have anyone writing while they're all reading.
>
> (Or I may just be misunderstanding what you meant)

No David, you are perfectly right.  It just is that I was too concise in expressing my thoughts.  In my book, short of very specific specialized needs, the prospect of some process involving multiple concurrent readers, which tend to imply that there are very often one or many readers at any one time, without much place for a writer to intervene, if only on occasion, without disrupting the flow of readers, is not very welcome.  That's why SQLite makes sense for me (my needs), albeit in WAL mode only. I'm so bathed in WAL all day long, that I tend to forget the details when not using WAL.

You did well to correct my assertion which without context, is misleading, another word for wrong. Thanks. :)

—  
Best Regards, Meilleures salutations, Met vriendelijke groeten, Mit besten Grüßen,
Olivier Mascia

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: THREADSAFE (1, SERIALIZED) vs (2, MULTITHREAD)

test user
In reply to this post by Simon Slavin-3
Quote: Please don't try to parallelize a part of your program which exists
mostly to read or write to a SQLite database.  SQLite is not a
client/server system.  Access to the database file on disk is a
bottleneck.  If you have 8 threads which spend most of their time executing
SQLite calls, all they will do is constantly block one-another.


I think this is true of writes.

But I dont think it is true of reads.


Given an immutable SQLite file (read only), this can be safely shared among
different threads and processes.

I think distributing reads among cores is in the class of problem called
"embarrassingly parallel":

https://en.wikipedia.org/wiki/Embarrassingly_parallel





Quote: Access to the database file on disk is a bottleneck.

https://www.amazon.com/Samsung-970-EVO-1TB-MZ-V7E1T0BW/dp/B07BN217QG/

This NVMe SSD disk has a read speed of 3GB/s.

If I distribute readers across 8 CPU cores so they all compute concurrently
(without blocking each other), and they all use a tiny portion of that disk
read bandwidth, surely they would be 8x faster than using a single core?


On Tue, Aug 6, 2019 at 1:42 PM Simon Slavin <[hidden email]> wrote:

> On 6 Aug 2019, at 11:34am, test user <[hidden email]> wrote:
>
> > - Does the second request just take longer to return from the FFI call
> whilst waiting for the mutex?
>
> Yes.  You do not need to build backoff-and-retry code into your own
> software.  SQLite does it for you.  For every database connection you open,
> set a timeout using this:
>
> <https://sqlite.org/c3ref/busy_timeout.html>
>
> A setting of perhaps 30 seconds would be appropriate.  The SQLite API will
> use the amount of time you set to decide how long to keep retrying access
> before it gives up, assumes that the database is permanently locked, and
> returns an error code of SQLITE_BUSY.
>
> > So for example, if I had:
> >
> > - 8 cores
> > - 8 threads
> > - 8 db connections, 1 per thread
> > - 1 database file
> > - x amount of read requests per second
> >
> > If I were to load balance x requests over each of the 8 threads, all the
> reads would complete concurrently when in SERIALIZED mode, with WAL enabled?
>
> Please don't try to parallelize a part of your program which exists mostly
> to read or write to a SQLite database.  SQLite is not a client/server
> system.  Access to the database file on disk is a bottleneck.  If you have
> 8 threads which spend most of their time executing SQLite calls, all they
> will do is constantly block one-another.  You will get a far simpler
> program, far simpler to debug, if you do all your access through one thread
> using one connection.
>
> What you /might/ want to do is have one thread which reads data from the
> database and keeps an in-memory buffer full of, say, the next thousand rows
> of data.  Then other simultaneous processing threads can read their data
> directly from memory.
>
> However, once again this may not be appropriate.  You may find that your
> software executes at acceptable speed without any parallelization at all.
> I suggest you try this before devoting your time to complicated programming.
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: THREADSAFE (1, SERIALIZED) vs (2, MULTITHREAD)

Keith Medcalf
In reply to this post by test user

On Tuesday, 6 August, 2019 04:35, test user <[hidden email]> wrote:

>When SQLITE_THREADSAFE=(1, SERIALIZED), how are the reads/writes
>queued? Is it per connection, file, or process?

>Quote: I don't know exactly what you mean by "how are read/writes
>queued".

>Quote: if you make two simultaneous calls on the same connection
>(from different threads, for example), one of them will get the mutex and
>proceed immediately, and the other one will have to wait until that mutex is
>released

>What I mean by "queue" is that if one request is waiting until a
>mutex is released, it is in an implicit queue for that resource.

>I wanted to know how this queue works:

>- Does the second request just take longer to return from the FFI
>call whilst waiting for the mutex?

>- Or does the FFI return SQLITE_BUSY and expect the caller to call
>again in the future?

Being that this is an Operating System provided MUTEX I would expect that the subsequent entry merely waits in the OS queue until it is able to obtain the mutex.  There is a #define that you can make at compile time (SQLITE_ENABLE_MULTITHREADED_CHECKS) that will chuck an message to the log if you use MULTITHREADED and then violate the rules anyway.

Without looking though the code in detail the above is my expectation, however, I suppose you could get an SQLITE_BUSY / SQLITE_MISUSE return if you attempt something that you should not be doing, like trying to do something simultaneously from multiple threads on the same connection.

Only one of the devs familliar with the mutex system everywhere it is used would know the answer to this without either (a) experimenting or (b) searching widely through the code looking for how each particular instance of a mutex is handled.

However, since the default is SERIALIZED I would expect that the SERIALIZATION occurs without notification to the programmer that it has been done.

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.




_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: THREADSAFE (1, SERIALIZED) vs (2, MULTITHREAD)

Keith Medcalf
In reply to this post by test user




--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.


On Tuesday, 6 August, 2019 04:35, test user <[hidden email]> wrote:

>So in summary, there is no difference in the multi threaded
>performance that can be gained between SERIALIZED and MULTITHREADED
>(aside from the mutex overhead)? The only difference is SERIALIZED
>enforces correct usage at a small overhead cost?

That is my understanding, yes.  However, the cost of obtaining/checking/releasing a MUTEX may vary by OS implementation so the definition of "small" depends on the underlying implementation.

>So for example, if I had:

>- 8 cores
>- 8 threads
>- 8 db connections, 1 per thread
>- 1 database file
>- x amount of read requests per second

>If I were to load balance x requests over each of the 8 threads, all
>the reads would complete concurrently when in SERIALIZED mode, with WAL
>enabled?

Yes.  Within the abilities of the OS to concurrently perform any required I/O (including I/O from the cache).  That is that as far as the library is concerned they would all operate (compute) in parallel.  Whether or not something else (at the OS or hardware level) may impose overheads or serialization is not something that user code can guarantee.

>Assume other bottlenecks in the system are not an issue (like disk
>speed).

>Im just trying to confirm that SERIALIZED will not queue up requests
>for (1 file, multiple connections to that file, read only requests).

No, SERIALIZED serializes requests for CPU access to shared data located in memory to prevent concurrent access to that data by multiple threads.  Once the mutex is obtained any further serialization is an OS concurrency issue handled by the OS.

Be aware, however, that things like using SHARED_CACHE may have extra serialization between connections to the same shared cache to prevent simultaneous access to the cache data structures by different threads running on different connections, even though those connections do not have multiple simultaneous call contention on their own.

So basically, if you executed 8 SELECT statements, each on a separate connection, each on a separate core, with no I/O limitations (that is, the entire database was contained in the OS block cache), and no memory limitations, and each connection using its own cache per connection (no shared cache) you should expect the limiting factor to be CPU only.  You will note that in that scenario there is not really much difference between using one process with 8 threads and one connection per thread, one thread per core each executing one SELECT each, and 8 processes each having one thread executing the same select statements one per process where each process is being dispatched to each core.  Except any overhead that the OS might impose for thread/process handling, that is.





_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users