Can I ignore n lines when using .import on a data file?


Can I ignore n lines when using .import on a data file?

ronggui wong
I want to use .import to import a data file into SQLite, but the first
line contains the variable (field) names, so I want to skip it. Deleting
the first row of the data file is one solution, but the file is large, and
using sed to delete the first row takes a long time:
a|b|c
1|2|3
|3|1
...

RE: Can I ignore n lines when using .import on a data file?

Griggs, Donald
 
Regarding:

I want to use .import ... but the first line is the variable(fields), so I
want to ignore it.
=================================

Possible option #1:

If it wouldn't violate any database-enforced constraints, why not import the
entire file, then delete the first row via SQL?
   DELETE FROM mytable WHERE ROW_ID=1;
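
A minimal sketch of this approach in the sqlite3 shell might look like the
following (the table and file names are only placeholders, and the built-in
row identifier is spelled rowid here):

   sqlite> CREATE TABLE mytable(a, b, c);
   sqlite> .separator "|"
   sqlite> .import data.txt mytable
   sqlite> DELETE FROM mytable WHERE rowid = 1;  -- the header was the first row inserted into the empty table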


Possible option #2:

Are you using Windows?  If not, and if you weren't already using sed in a
piped command (i.e., if you were copying the data via sed to a new file), then
switching to a piped command might reduce the time.
(Under Windows, I believe piped commands still make a full copy to a
temporary file, so I would expect no advantage there.)


Possible option #3:

You could instead modify the source of the command-line tool, of course.



Donald Griggs


Opinions are not necessarily those of Misys Healthcare Systems nor its board
of directors.

Re: Can I ignore n lines when using .import on a data file?

Jay Sprenkle
In reply to this post by ronggui wong
Have you tried "head" or "tail"?

On 1/6/06, ronggui wong <[hidden email]> wrote:
> I want to use .import to import a data file into SQLite.but the first
> line is the variable(fields),so I want to ignore it.delete the first
> row of the data is one solution,But the data file is large ,and try to
> use sed to delete the first row,it takes long time
> a|b|c
> 1|2|3
> |3|1

Re: Can I ignore n lines when using .import on a data file?

ronggui wong
In reply to this post by Griggs, Donald
2006/1/7, Griggs, Donald <[hidden email]>:

>
> Regarding:
>
> I want to use .import ... but the first line is the variable(fields), so I
> want to ignore it.
> =================================
>
> Possible option #1:
>
> If it wouldn't violate any database-enforced constraints, why not import the
> entire file, then delete the first row via SQL?
>   DELETE FROM mytable WHERE ROW_ID=1;

> Possible option #2:
>
> Are you using windows?  If not, and if you weren't using SED in a piped
> command (i.e., if you were copying the data via SED to a new file), then
> changing to a piped command might lessen the time.
> (Under windows, I believe a full file copy to a temporary file occurs with
> piped commands, so I would expect no advantage.)
>

I am using Windows.

> Possible option #3:
>
> You could instead modify the source of the command-line tool, of course.
>
>
>
> Donald Griggs
>
>
> Opinions are not necessarily those of Misys Healthcare Systems nor its board
> of directors.
>

Re: Can I ignore n lines when using .import on a data file?

ronggui wong
In reply to this post by Jay Sprenkle
2006/1/7, Jay Sprenkle <[hidden email]>:
> Have you tried "head" or "tail"?
Do you mean the head command of the OS, say on Linux?
Not yet. Would using "head" or "tail" be faster than sed?

As I see it, many CSV files use \t as the separator,
and the sqlite3 shell does not use \t by default, so I have to use sed (or
other tools) to change the tabs to |. My puzzle is: why doesn't SQLite support \t?


> On 1/6/06, ronggui wong <[hidden email]> wrote:
> > I want to use .import to import a data file into SQLite.but the first
> > line is the variable(fields),so I want to ignore it.delete the first
> > row of the data is one solution,But the data file is large ,and try to
> > use sed to delete the first row,it takes long time
> > a|b|c
> > 1|2|3
> > |3|1
>

Re: Can I ignore n lines when using .import on a data file?

Jay Sprenkle
> > Have you tried "head" or "tail"?
> do you means the head command of the OS,say linux ?
> Not yet,using "head" or "tail" will be faster than sed?

My guess would be yes, though probably not much.
It should be pretty simple to test.
If it's a very large file you may be limited by your I/O speed, not the program.
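
For example, on a Unix-like system you could strip the header with tail and
then import the result (the file and table names are only placeholders):

   $ tail -n +2 data.txt > data_body.txt   # copies everything from line 2 onward
   $ sqlite3 test.db
   sqlite> .separator "|"
   sqlite> .import data_body.txt mytable

Like sed, this still copies the whole file once, so for a very large file the
disk is probably the bottleneck either way.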

Re: Can I ignore n lines when using .import on a data file?

Nemanja Corlija
In reply to this post by ronggui wong
On 1/6/06, ronggui wong <[hidden email]> wrote:
> As I see,many CSV files using \t as separator.
> And SQLite does not use \t,I have to use sed(or other tools) to change
> Tab to |. so My puzzle is why SQLite not support \t?
You can set the separator to anything you want, including \t.
Use this command before importing your \t-delimited file and you can
then skip the sed step:

.separator "\t"
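
For example, a complete import of a tab-delimited file might look like this
(the file and table names are only placeholders):

   sqlite> CREATE TABLE mytable(a, b, c);
   sqlite> .separator "\t"
   sqlite> .import data.tsv mytable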

--
Nemanja Corlija <[hidden email]>

Re: Can I ignore n lines when using .import on a data file?

ronggui wong
In reply to this post by Griggs, Donald
2006/1/7, Griggs, Donald <[hidden email]>:

>
> Regarding:
>
> I want to use .import ... but the first line is the variable(fields), so I
> want to ignore it.
> =================================
>
> Possible option #1:
>
> If it wouldn't violate any database-enforced constraints, why not import the
> entire file, then delete the first row via SQL?
>   DELETE FROM mytable WHERE ROW_ID=1;
I have tried it and it works fine for me.
One more thing: it seems to be ROWID instead of ROW_ID. I am using SQLite 3.2.8.
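
For reference, the statement that worked here was of this form (assuming the
table was empty before the import, so the header line received rowid 1):

   sqlite> DELETE FROM mytable WHERE rowid = 1;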

>
> Possible option #2:
>
> Are you using windows?  If not, and if you weren't using SED in a piped
> command (i.e., if you were copying the data via SED to a new file), then
> changing to a piped command might lessen the time.
> (Under windows, I believe a full file copy to a temporary file occurs with
> piped commands, so I would expect no advantage.)
>
>
> Possible option #3:
>
> You could instead modify the source of the command-line tool, of course.
>
>
>
> Donald Griggs
>
>
> Opinions are not necessarily those of Misys Healthcare Systems nor its board
> of directors.
>

proposal for improving concurrency in SQLite

Shailesh N. Humbad
I wrote an article suggesting a method for improving concurrency in
SQLite.  The main reason I wrote this article is so I can understand
SQLite better, not because my proposal is new or usable.  I am not an
expert in SQLite or databases, so I would appreciate any comments or
criticisms.  Here is the link:

http://www.somacon.com/p369.php

Re: proposal for improving concurrency in SQLite

Andrew Piskorski
On Sat, Jan 07, 2006 at 12:33:08AM -0500, Shailesh N. Humbad wrote:
> I wrote an article suggesting a method for improving concurrency in
> SQLite.  The main reason I wrote this article is so I can understand
> SQLite better, not because my proposal is new or usable.  I am not an
> expert in SQLite or databases, so I would appreciate any comments or
> criticisms.  Here is the link:
>
> http://www.somacon.com/p369.php

Your "Prior SQLite Concurrency Proposals" section is deficient, I
recommend a lot more searching in the SQLite list archives.  Read at
least the threads which contain these messages:

  http://www.mail-archive.com/sqlite-users@.../msg01916.html
  http://www.mail-archive.com/sqlite-users@.../msg02270.html
  http://www.mail-archive.com/sqlite-users@.../msg02267.html

Particularly interesting were Doug Currie's April 2004 ideas for
achieving MVCC semantics with table level locking for writes, by
integrating shadow paging into the BTree layer:

  http://www.mail-archive.com/sqlite-users@.../msg01935.html
  http://www.mail-archive.com/sqlite-users@.../msg01982.html

--
Andrew Piskorski <[hidden email]>
http://www.piskorski.com/

Re: proposal for improving concurrency in SQLite

D. Richard Hipp
In reply to this post by Shailesh N. Humbad
"Shailesh N. Humbad" <[hidden email]> wrote:
> I wrote an article suggesting a method for improving concurrency in
> SQLite.  The main reason I wrote this article is so I can understand
> SQLite better, not because my proposal is new or usable.  I am not an
> expert in SQLite or databases, so I would appreciate any comments or
> criticisms.  Here is the link:
>
> http://www.somacon.com/p369.php
>

This is a well-written paper.  The background sections clearly
articulate the limitations of SQLite and provide a very fair
comparison between SQLite and other "embedded" database engines,
particularly Microsoft Jet.  The paper gives a good and accurate
overview of how SQLite works.

I have some issues with the proposed fine-grain locking
enhancement, however.

The paper proposes a scheme for allowing multiple processes to
write to the same database simultaneously as long as each process
modifies a separate table.  The paper argues that each table in
an SQLite database is a logically distinct and separate entity that
does not interact with other tables.  So multiple writers, each
with their own rollback journal, could write to the file at the
same time as long as there is no overlap in what is written.

This is true as far as it goes.  The problem is that tables
are not quite as distinct and separate as they appear at first
glance.  There are attributes that are shared by all tables
in the database file.  The first process to begin modifying
any table would need to lock all of these shared resources
to prevent any overlap.  And since no other table can be
modified without changes to the same shared resources, you
have therefore essentially locked the entire database.

Resources shared among all tables include:

   *  The freelist of unused pages.  As the size of tables
      increases and decreases, pages are taken from and
      returned to the common freelist.

   *  The size of the database file.  Strange as it may
      seem, the file size of the database file is part of
      the state information of the database.  That information
      is needed, for example, when the freelist is empty and
      it becomes necessary to allocate new pages from the
      end of the file.  Imagine the complications if two
      separate writers attempted to extend the file at the
      same time.

   *  In autovacuum mode, there are index pages scattered
      about through the database file that are shared by
      all tables.

One might argue that the shared resources above only come into
play when the size of a table increases or decreases.  One could,
in theory, implement a scheme where two or more simultaneous
writers could coexist on separate tables as long as no more than
one was adding and removing information and all the others were
merely changing information.  But in practice this does not seem
to buy you very much.  In SQLite, it is difficult to know when
you are adding or removing information versus merely changing
existing information.  SQLite uses variable-length records.
So changing the value of an integer field from 8388607 to 8388608
causes the size of the record to grow by two bytes, for example.
Changing the value back causes the record to shrink again.
If the page on which that record resided was nearly full or
less than half full, then the change might cause the btree balancing
algorithm to try to split the page or merge it with its neighbors.
As another example, consider changing a text field "abc" to
"xyz".  The record size stays the same since the length of the
text is the same.  But if the column is indexed, the index entry
would need to move from one part of the index to another.  This
might cause compaction in the part of the index where the record
was moving from and expansion in the part of the index where the
record is moving to.  And either operation involves allocating
and deallocating pages which in turn requires access to
shared resources.  

We could back off even further and allow multiple database
readers to continue while one process was writing to a separate
table.  This could be done, but only with a substantial increase
in code complexity and size and a performance hit which is
probably noticeable.  And the current locking scheme which holds
writes in memory until the last possible moment comes close
to giving you this already, so the gains are minimal.  And,
as the author of the paper notes, you can already accomplish
the same thing by putting the various tables in separate
database files then using ATTACH to bring them all into the
same database connection.
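
A minimal sketch of that workaround, using placeholder file and table names:

   sqlite> ATTACH DATABASE 'orders.db' AS orders;
   sqlite> ATTACH DATABASE 'log.db' AS log;
   sqlite> INSERT INTO log.entries(msg) VALUES('hello');

Each attached database lives in its own file with its own locks, so writers
that touch different files do not block one another.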

--
D. Richard Hipp <[hidden email]>


Re: proposal for improving concurrency in SQLite

Alexander J. Kozlovsky
Hello!

I think there is another way to build high-concurrency SQLite-based systems.

**It is not entirely universal**, but I hope it can be used for
high-traffic web sites and similar kinds of systems, where each individual
transaction (such as a page retrieval or a form submission) is very
short and can be replayed automatically in the case of transaction
failure.

The main idea is an additional abstraction layer on top of
SQLite. This layer is a high-level transaction management system.
Each high-level optimistic transaction is (almost) fully ACID,
but many high-level transactions execute within the scope of a single
low-level SQLite transaction. This way the number of file-buffer
flushes is decreased significantly.

Below is a very simplified example. Imagine this high-level transaction:

    BEGIN
      SELECT ...
      -- potentially slow processing occurred here
      SELECT ...
      -- potentially slow processing occurred here
      INSERT ...
      UPDATE ...
      DELETE ...
    COMMIT

In many cases this transaction can be split into two independent
transactions:
1. A read-only transaction
2. A read-write optimistic transaction with conflict detection

    BEGIN
      SELECT ...
      -- potentially slow processing occurred here
      SELECT ...
      -- potentially slow processing occurred here
    COMMIT

    BEGIN
      SELECT ... -- additional short SELECTs for conflict detection
      -- ROLLBACK if some conflict was detected
      INSERT ...
      UPDATE ...
      DELETE ...
    COMMIT
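
As one purely illustrative form of that conflict detection (the table and
column names below are invented), the second transaction can re-check a
version counter that the read-only phase already read:

    BEGIN;
      SELECT version FROM account WHERE id = 42;
      -- the application compares this value with the one seen in the
      -- read-only phase; if it differs, ROLLBACK and replay the whole
      -- high-level transaction
      UPDATE account SET balance = balance - 10, version = version + 1
        WHERE id = 42;
    COMMIT;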

Because the first transaction is read-only, it may be performed
in parallel with similar transactions. That is, no locking
conflicts occur in this phase.

Because the second transaction contains optimistic conflict
detection, it can be combined with many similar transactions
into one low-level SQLite transaction. This way, the number of
file-buffer flushes is reduced greatly. This transaction
will be relatively short because no application-specific slow
processing takes place in this phase.

The main trick is how to split the original transaction
automatically. That may be very hard if the original transaction
is expressed directly in SQL. I hope such a transformation can be
performed much more easily if the operations are expressed in terms
of some object-relational mapping system. Such an ORM can begin a
read-only SQLite transaction and then delay all modifications
until the end of the high-level transaction. If the application code
tries to execute a direct SQL data-modification query, then
the read-only transaction degrades smoothly to an unbatched
read-write transaction.

On a typical web site the majority of all transactions are read-only.
If the web site must perform 1000 read-only transactions per second
and 20 read-write transactions per second, then these 20 read-write
transactions can be slightly delayed and then processed in a
batch as a single low-level SQLite transaction.

But what if an optimistic locking conflict is detected in one
of these 20 transactions? Does that mean the other 19 high-level
transactions must be rolled back, as well as the low-level SQLite
transaction? No, it does not. Only the transaction with the conflict
will be excluded from the low-level SQLite transaction, and then
the entire processing of that HTTP request will be retried
by the web server.

In the near future I'll try to implement a SQLite-oriented ORM
which will demonstrate this approach.


Best regards,
 Alexander                            mailto:[hidden email]


Re: proposal for improving concurrency in SQLite

John Stanton-3
Alexander,

I like your general concept.  Some years ago I implemented a somewhat
similar strategy in a product of ours which was SQLite-like in that it
linked into each application process and managed B-Trees.  By
identifying read-only transactions and handling them more simply than
read/write ones, a high degree of concurrency was achieved in
practice, with thousands of concurrent users getting adequate
response times.  As you point out, typical applications are
read-intensive, not write-intensive.

Alexander Kozlovsky wrote:

> I think there is another way to build high-concurrency SQLite-based
> systems. [...]
>


Re: proposal for improving concurrency in SQLite

Eduardo Morras
In reply to this post by D. Richard Hipp
At 15:27 07/01/2006, [hidden email] wrote:
>"Shailesh N. Humbad" <[hidden email]> wrote:
> > I wrote an article suggesting a method for improving concurrency in
> > SQLite.  The main reason I wrote this article is so I can understand
> > SQLite better, not because my proposal is new or usable.  I am not an
.......
>  And, as the author of the paper notes, you can already accomplish
>the same thing by putting the various tables in separate
>database files then using ATTACH to bring them all into the
>same database connection.
>--
>D. Richard Hipp <[hidden email]>

Can any part of the backend (the btree, pager, or OS interface) separate
each table into multiple files automatically? Can those files reside on
two or more disks (memory-based or disk-based)? That way, can this SQLite
problem (cited from "[hidden email]", Subject: "Re: [sqlite] Slowness
with in-memory database", 21/12/2005):
Transactions are slow due to limitations of computer disk hardware.
When SQLite writes to the disk, it has to stop and wait at a couple of
places to for all of the data to actually be written to the disk
surface.  This is necessary in case a power failure or OS crash occurs -
so that the data can be recovered.  It is this stopping and waiting that
takes so long.  If we didn't need to wait for the disk platter to
spin underneath the head, SQLite could easily handle tens of thousands
of transactions per second.  But because each transaction requires at
least two revolutions of the disk platter, SQLite is limited to about
60 transactions per second.  (Do the math:  7200 RPM disk drive means
120 revolutions per second.  2 complete revolutions per transaction ->
60 transactions per second.)

(end cite)

get better? Note I don't mean partitions or slices.


------------------------------------------------------------
Alien.org contacted...    waiting for Ripley


Re: proposal for improving concurrency in SQLite

Jim C. Nasby
On Mon, Jan 09, 2006 at 06:47:04PM +0100, Eduardo wrote:
> of transactions per second.  But because each transaction requires at
> least two revolutions of the disk platter, SQLite is limited to about

Why does a transaction commit require two separate writes?
--
Jim C. Nasby, Sr. Engineering Consultant      [hidden email]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: proposal for improving concurrency in SQLite

Dan Kennedy


--- "Jim C. Nasby" <[hidden email]> wrote:

> On Mon, Jan 09, 2006 at 06:47:04PM +0100, Eduardo wrote:
> > of transactions per second.  But because each transaction requires at
> > least two revolutions of the disk platter, SQLite is limited to about
>
> Why does a transaction commit require two seperate writes?
> --
> Jim C. Nasby, Sr. Engineering Consultant      [hidden email]
> Pervasive Software      http://pervasive.com    work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
>

The short version:

The first write operation writes the parts of the database that
are about to be overwritten to the journal file. If something
goes wrong during the second write, the journal file will be
used to restore the database to its previous state. The second
write is the one that actually modifies the database file.





               


Re: proposal for improving concurrency in SQLite

Jim C. Nasby
On Mon, Jan 09, 2006 at 09:08:10PM -0800, Dan Kennedy wrote:

>
>
> --- "Jim C. Nasby" <[hidden email]> wrote:
>
> > On Mon, Jan 09, 2006 at 06:47:04PM +0100, Eduardo wrote:
> > > of transactions per second.  But because each transaction requires at
> > > least two revolutions of the disk platter, SQLite is limited to about
> >
> > Why does a transaction commit require two seperate writes?
> > --
> > Jim C. Nasby, Sr. Engineering Consultant      [hidden email]
> > Pervasive Software      http://pervasive.com    work: 512-231-6117
> > vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
> >
>
> The short version:
>
> The first write operation writes the parts of the database that
> are about to be overwritten to the journal file. If something
> goes wrong during the second write, the journal file will be
> used to restore the database to it's previous state. The second
> write is the one that actually modifies the database file.

Yes, but the second write (to the data files) isn't time critical.
Because it isn't (or at least shouldn't be) getting fsync'd very
often, it can also be re-ordered by the OS (and possibly the drive).

In any case, it's completely inaccurate to say that each transaction
requires two revolutions of the drive. Also, if SQLite supports it,
putting the log files on a separate set of drives from the tables is
almost always a win.
--
Jim C. Nasby, Sr. Engineering Consultant      [hidden email]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: proposal for improving concurrency in SQLite

D. Richard Hipp
In reply to this post by Shailesh N. Humbad
"Jim C. Nasby" <[hidden email]> wrote:

> On Mon, Jan 09, 2006 at 09:08:10PM -0800, Dan Kennedy wrote:
> >
> > The short version:
> >
> > The first write operation writes the parts of the database that
> > are about to be overwritten to the journal file. If something
> > goes wrong during the second write, the journal file will be
> > used to restore the database to it's previous state. The second
> > write is the one that actually modifies the database file.
>
> Yes, but the second write (to the data files) isn't time critical.
> Because it doesn't (or at least shouldn't) be getting fsync'd very
> often, it can also be re-ordered by the OS (and possibly the drive).
>
> In any case, it's completely inaccurate to say that each transaction
> requires two revolutions of the drive. Also, if SQLite supports it,
> putting the log files on a seperate set of drives from the tables is
> almost always a win.

The second sync has to occur before deleting the rollback
journal.  Otherwise a powerloss could leave you with both
an incomplete write and a missing rollback journal.

The rollback journal must always be located in the same
directory as the original database file so that SQLite
will know where to find it.  If the rollback journal is
located on a separate volume, that volume might not get
mounted after a power failure or system crash, and you'd
then be left with a corrupted database.  Besides that,
how would SQLite know where to look for the rollback
journal if it is off in some other directory someplace?
Remember, SQLite is designed to be robust and easy to
operate which means we cannot control the placement of
the rollback journal using a configuration file.

--
D. Richard Hipp <[hidden email]>


Re: proposal for improving concurrency in SQLite

Axel Mammes (gmail)
How about using a hard-link to store the log file somewhere else? That
should work transparently...


On 1/10/06, [hidden email] <[hidden email]> wrote:

> "Jim C. Nasby" <[hidden email]> wrote:
> > On Mon, Jan 09, 2006 at 09:08:10PM -0800, Dan Kennedy wrote:
> > >
> > > The short version:
> > >
> > > The first write operation writes the parts of the database that
> > > are about to be overwritten to the journal file. If something
> > > goes wrong during the second write, the journal file will be
> > > used to restore the database to it's previous state. The second
> > > write is the one that actually modifies the database file.
> >
> > Yes, but the second write (to the data files) isn't time critical.
> > Because it doesn't (or at least shouldn't) be getting fsync'd very
> > often, it can also be re-ordered by the OS (and possibly the drive).
> >
> > In any case, it's completely inaccurate to say that each transaction
> > requires two revolutions of the drive. Also, if SQLite supports it,
> > putting the log files on a seperate set of drives from the tables is
> > almost always a win.
>
> The second sync has to occur before deleting the rollback
> journal.  Otherwise a powerloss could leave you with both
> an incomplete write and a missing rollback journal.
>
> The rollback journal must always be located in the same
> directory as the original database file so that SQLite
> will know where to find it.  If the rollback journal is
> located on a separate volume, that volume might not get
> mounted after a power failure of system crash, and you'd
> then be left with a corrupted database.  Besides that,
> how would SQLite know where to look for the rollback
> journal if it is off in some other directory someplace?
> Remember, SQLite is designed to be robust and easy to
> operate which means we cannot control the placement of
> the rollback journal using a configuration file.
>
> --
> D. Richard Hipp <[hidden email]>
>
>

Re: proposal for improving concurrency in SQLite

Jim C. Nasby
On Tue, Jan 10, 2006 at 04:29:19PM -0300, Axel Mammes wrote:
> How about using a hard-link to store the log file somewhere else? That
> should work transparently...

Well, you can't hard-link across filesystems, but a symlink is exactly
what I was going to mention.

As for SQLite coming across a missing log file, I would certainly hope
it would just stop dead in its tracks rather than corrupt anything.
--
Jim C. Nasby, Sr. Engineering Consultant      [hidden email]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461