ADO.NET driver upgrade puzzle

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

ADO.NET driver upgrade puzzle

Roberts, Barry (FINTL)
Hi,

We have a C# application which logs data in real-time to sqlite files. The application is multi-threaded, and typically logs 200-500 data streams simultaneously. The logging rate per stream is 1Hz to 1000Hz, depending on the type of data coming in. Each data stream is logged to its own db3 file. This means that there are generally 200-500 connections active, and we run with connection pooling on.  Each data stream typically buffers for 2-3 seconds then writes the buffer to the respective db3 file. All processing with the application is multi-threaded.

I am currently looking at updating the version of System.Data.SQLite we are running, but running into some issues. The application currently runs 1.0.80.0, and I am testing with 1.0.105.1, the code changes to our app were fairly minimal. The reason we were looking to upgrade is that performance profiling was showing a high degree of lock contention within the old driver code, primarily due to a static lock in the old driver. The newer driver uses locks based on the connection handle, which removes the lock contention we were seeing, which leads to a significant improvement in CPU usage.

When logging with the old driver there are no issues, other than the lock contention.
When logging with the new driver, the system sometimes locks up, or marks one or more of the db3 as malformed.

After much testing, this seems to occur when we prepare statements on the sqlite command. If I disable the IDbCommand.Prepare() calls, the driver appears to be stable. Structurally our code is fairly standard, and makes extensive use of "using" blocks. We only hold the connection and transaction open for the minimal time possible, releasing the connection back to the pool.

Using (connection)
Open connection
Using (begin transaction)
Using (create command)
Prepare insert command
Transfer data
Commit transaction

One of the stability tests I run involves 2 processes logging to the same data sets (db3) files, because we do have multi-process access onto the db3 files. Connections are configured to use journal mode "persist".

Should we not use prepare calls when running connection pooling, or do you have any other recommendations we should take note of.

Kind Regards,
Barry Roberts.

[hidden email]<mailto:[hidden email]> | www.fugro.com


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: ADO.NET driver upgrade puzzle

Clemens Ladisch
Roberts, Barry (FINTL) wrote:
> The application is multi-threaded
>
> the system sometimes locks up, or marks one or more of the db3 as malformed.

Sounds like a threading problem.

You should not have more than one thread accessing the same connection
concurrently.

> Using (connection)
> Open connection

What is the relationship between threads and databases?  Are you really
re-opening the connection for every transfer?


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: ADO.NET driver upgrade puzzle

Darko Volaric
That's not strictly true, it depends on the threading mode: https://sqlite.org/threadsafe.html <https://sqlite.org/threadsafe.html>

It sounds like the driver is trying to manage concurrency on its own and failing. For the purposes of calling sqlite, if the library is used in the default "serialized" mode, then no locks at all should be necessary, but maybe the locks are required for something happening in the driver code. If you check which threading mode sqlite is running in you''ll either need to change the threading mode to serialized, and if that's already the case or changing to that doesn't fix it then the threading problem is in the driver code itself.


> On Sep 25, 2017, at 11:33 AM, Clemens Ladisch <[hidden email]> wrote:
>
> Roberts, Barry (FINTL) wrote:
>> The application is multi-threaded
>>
>> the system sometimes locks up, or marks one or more of the db3 as malformed.
>
> Sounds like a threading problem.
>
> You should not have more than one thread accessing the same connection
> concurrently.
>
>> Using (connection)
>> Open connection
>
> What is the relationship between threads and databases?  Are you really
> re-opening the connection for every transfer?
>
>
> Regards,
> Clemens
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: ADO.NET driver upgrade puzzle

Roberts, Barry (FINTL)
In reply to this post by Roberts, Barry (FINTL)
Hi Clemens,

As per my original post, all C# access code is making extensive use of "using" statements. However we do obviously rely on the connection pool being thread safe, because many threads are writing to different databases (connections) concurrently.

There is no direct relationship between threads and databases. However I do enforce that within a process only one thread can be writing to a specific database (because sqlite does not support parallel writing) at a time. All threads are running on the Pfx thread pool. As data is received from a device it is assigned to a specific data stream. If the buffer for the stream is then full a write action is assigned to the thread pool. When that runs it will call into the ADO.NET code, and send the data to the database for that connection. Here is some test code, that shows how we access the connections normally. Imagine that the "Insert" method is called whenever we have data to flush for a specific database, and there are normally 200-500 database connections active. Obviously at the end of the using statements the connection is effectively handed back to the connection pool, which is outside of my direct control.

public void Insert([NotNull] IEnumerable<T> records)
        {
            ArgumentHelper.AssertNotNull(records, "records");
            var sb = new StringBuilder();

            sb.AppendLine("INSERT INTO");
            sb.AppendLine(m_TableName);
            sb.AppendLine("(identity, time, binary)");
            sb.AppendLine("VALUES");
            sb.AppendLine("(@identity, @time, @binary)");

            using (IDbConnection connection = CreateConnection())
            {
                connection.Open();

                using (IDbTransaction transaction = connection.BeginTransactionSerializable())
                {
                    using (IDbCommand command = CreateCommand(connection))
                    {
                        command.Transaction = transaction;
                        command.CommandText = sb.ToString();
                        command.Parameters.Add(new SQLiteParameter("@identity", DbType.Int32));
                        command.Parameters.Add(new SQLiteParameter("@time",     DbType.Int64));
                        command.Parameters.Add(new SQLiteParameter("@binary",   DbType.Binary));

                        if (m_Factory.Configuration.UsePreparedStatements)
                        {
                            command.Prepare();
                        }

                        var stream = new ByteStream();

                        IDataParameterCollection parameters = command.Parameters;
                        IDataParameter parameter01 = ((IDbDataParameter) parameters[1]);
                        IDataParameter parameter02 = ((IDbDataParameter) parameters[2]);

                        foreach (var record in records)
                        {
                            stream.Clear();
                            record.Serialize(stream);

                            parameter01.Value = record.TimeOfValidity.TaiTicks;
                            parameter02.Value = stream.GetContents();
                            command.ExecuteNonQuery();
                        }
                    }

                    try
                    {
                        transaction.Commit();
                    }
                    catch
                    {
                        transaction.Rollback();
                        throw;
                    }
                }
            }
        }

        [NotNull]
                               public IDbConnection CreateConnection()
                               {
                                   var connection = new SQLiteConnection
                                   {
                                       ConnectionString = m_Builder.ConnectionString
                                   };

                                   return connection;
                               }

        [NotNull]
        public IDbCommand CreateCommand([NotNull] IDbConnection connection)
                               {
                                               IDbCommand command = new SQLiteCommand();
                                               command.Connection = connection.NotNull("connection");
                                               return command;
                               }



Kind Regards,
Barry Roberts.

[hidden email]<mailto:[hidden email]> | www.fugro.com


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: ADO.NET driver upgrade puzzle

Darko Volaric
In reply to this post by Roberts, Barry (FINTL)
If you're accessing a database concurrently from different processes then you may want to look at this: https://sqlite.org/faq.html#q5 <https://sqlite.org/faq.html#q5> as your file system may be causing you problems.


> On Sep 25, 2017, at 11:09 AM, Roberts, Barry (FINTL) <[hidden email]> wrote:
>
> One of the stability tests I run involves 2 processes logging to the same data sets (db3) files, because we do have multi-process access onto the db3 files.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: ADO.NET driver upgrade puzzle

Clemens Ladisch
In reply to this post by Roberts, Barry (FINTL)
Roberts, Barry (FINTL) wrote:
> As per my original post, all C# access code is making extensive use of
> "using" statements. However we do obviously rely on the connection pool
> being thread safe, because many threads are writing to different
> databases (connections) concurrently.
>
> There is no direct relationship between threads and databases. However
> I do enforce that within a process only one thread can be writing to
> a specific database (because sqlite does not support parallel writing)
> at a time.

How exactly are you enforcing that?

> public IDbConnection CreateConnection()
> {
>     var connection = new SQLiteConnection
>     {
>         ConnectionString = m_Builder.ConnectionString
>     };
>
>     return connection;
> }

I would be tempted to replace that ConnectionString variable with the
actual connection object, and not using the connection pool.  (Assuming
that the builder objects are not shared, or properly locked.)


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: ADO.NET driver upgrade puzzle

Barry Smith


> On 26 Sep 2017, at 12:14 am, Clemens Ladisch <[hidden email]> wrote:
>
> Roberts, Barry (FINTL) wrote:
>> As per my original post, all C# access code is making extensive use of
>> "using" statements. However we do obviously rely on the connection pool
>> being thread safe, because many threads are writing to different
>> databases (connections) concurrently.
>>
>> There is no direct relationship between threads and databases. However
>> I do enforce that within a process only one thread can be writing to
>> a specific database (because sqlite does not support parallel writing)
>> at a time.
>
> How exactly are you enforcing that?
>
>> public IDbConnection CreateConnection()
>> {
>>    var connection = new SQLiteConnection
>>    {
>>        ConnectionString = m_Builder.ConnectionString
>>    };
>>
>>    return connection;
>> }
>
> I would be tempted to replace that ConnectionString variable with the
> actual connection object, and not using the connection pool.  (Assuming
> that the builder objects are not shared, or properly locked.)
>
>
> Regards,
> Clemens
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: ADO.NET driver upgrade puzzle

Roberts, Barry (FINTL)
In reply to this post by Roberts, Barry (FINTL)
Hi,

I am aware of the sqlite FAQ, and especially the comments regarding file systems. We only ever use NTFS file system, and never have any FAT or FAT32 file systems. Given that the old 1.0.80.0 driver is totally stable, and the newer one is not, on the same exact hardware, I do not think the issue is hardware related.

I enforce that the writer to a specific database can only run 1 at a time, this is done via the C# Interlocked mechanism which has always proved to be very robust. There is a separate logger object instance per database, and within each of those the writing is controlled with the interlock. This allows the application to buffer for longer if the writing to disk encounters a short term slow down.

As per Clemens suggestion I could replace the use of the connection pool and hold a connection open per database. The code can be structured to do that, because object instances are not shared between database logger instances. However I would have preferred to use the connection pool, because that allows me to open late/close early on my connection usage, which is generally better for maintenance and how the production code is currently strutured. I could restructure the code, but that is not really desirable in production code, simply due to an ADO.NET driver update.

I have been running my test application for days now without issues. This is configured with pooling on, statement preparation off. If I enable statement preparation the system will fail after an hour or two. So it looks like our production code needs to run the same, and remove the IDbCommand.Prepare() calls. The old driver works well but has lots of lock contention, the new one resolves that issue, but is not as stable in some configurations. It is a trade-off, I need the new driver to improve performance, but will trade off the command prepare calls, to enable driver stability.

Kind Regards,
Barry Roberts.
[hidden email]<mailto:[hidden email]> | www.fugro.com


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: ADO.NET driver upgrade puzzle

Barry Smith
Quite strange that Prepare() makes a difference. The system.data.sqlite documentation states that Prepare() does nothing, and a code inspection of the system.data.sqlite source shows that it does nothing but check that the command hasn't been disposed, and that the connection is still valid (which is done at the start of almost every method in the class, and therefore again when you call ExecuteNonQuery.)

> On 27 Sep 2017, at 5:07 am, Roberts, Barry (FINTL) <[hidden email]> wrote:
>
> Hi,
>
> I am aware of the sqlite FAQ, and especially the comments regarding file systems. We only ever use NTFS file system, and never have any FAT or FAT32 file systems. Given that the old 1.0.80.0 driver is totally stable, and the newer one is not, on the same exact hardware, I do not think the issue is hardware related.
>
> I enforce that the writer to a specific database can only run 1 at a time, this is done via the C# Interlocked mechanism which has always proved to be very robust. There is a separate logger object instance per database, and within each of those the writing is controlled with the interlock. This allows the application to buffer for longer if the writing to disk encounters a short term slow down.
>
> As per Clemens suggestion I could replace the use of the connection pool and hold a connection open per database. The code can be structured to do that, because object instances are not shared between database logger instances. However I would have preferred to use the connection pool, because that allows me to open late/close early on my connection usage, which is generally better for maintenance and how the production code is currently strutured. I could restructure the code, but that is not really desirable in production code, simply due to an ADO.NET driver update.
>
> I have been running my test application for days now without issues. This is configured with pooling on, statement preparation off. If I enable statement preparation the system will fail after an hour or two. So it looks like our production code needs to run the same, and remove the IDbCommand.Prepare() calls. The old driver works well but has lots of lock contention, the new one resolves that issue, but is not as stable in some configurations. It is a trade-off, I need the new driver to improve performance, but will trade off the command prepare calls, to enable driver stability.
>
> Kind Regards,
> Barry Roberts.
> [hidden email]<mailto:[hidden email]> | www.fugro.com
>
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users