

I know that this question is not strictly related to SQLite.
I want to persist a random-access sequence with SQLite. A random-access
sequence is a queue with random access to its queued elements. Suppose
that a random-access sequence [x_0, ..., x_n] has the following operations:
enqueue([x_0, ..., x_n], x) = [x_0, ..., x_n, x]
dequeue([x_0, ..., x_n]) = [x_1, ..., x_n]
lookup([x_0, ..., x_n], i) = x_i
update([x_0, ..., x_n], i, y) =
[x_0, ..., x_(i - 1), y, x_(i + 1), ..., x_n]
length([x_0, ..., x_n]) = n + 1
A naive list-based implementation would have at least one operation that
is in O(n).
A possible implementation in SQLite with all operations in O(log n)
could be:
CREATE TABLE sequence (
position INTEGER PRIMARY KEY AUTOINCREMENT,
...
);
-- enqueue
INSERT INTO sequence VALUES (...);
-- dequeue
SELECT position, ... FROM sequence ORDER BY position ASC LIMIT 1;
DELETE FROM sequence WHERE position = :position;
-- lookup
SELECT position AS start, ... FROM sequence
ORDER BY position ASC LIMIT 1;
SELECT ... FROM sequence WHERE position = :start + :i;
-- update
SELECT position AS start, ... FROM sequence
ORDER BY position ASC LIMIT 1;
UPDATE sequence SET ... WHERE position = :start + :i;
-- length
SELECT position AS start, ... FROM sequence
ORDER BY position ASC LIMIT 1;
SELECT position AS end, ... FROM sequence
ORDER BY position DESC LIMIT 1;
SELECT :end - :start + 1;
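The scheme above can be sketched end to end with Python's built-in sqlite3 module (the `value` column and the helper names are illustrative assumptions, not part of the original schema):

```python
import sqlite3

# In-memory database for illustration; a file path would make it persistent.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE sequence (
    position INTEGER PRIMARY KEY AUTOINCREMENT,
    value    TEXT
)""")

def enqueue(x):
    con.execute("INSERT INTO sequence (value) VALUES (?)", (x,))

def dequeue():
    row = con.execute("SELECT position, value FROM sequence "
                      "ORDER BY position ASC LIMIT 1").fetchone()
    if row is None:
        return None
    con.execute("DELETE FROM sequence WHERE position = ?", (row[0],))
    return row[1]

def lookup(i):
    # MIN() on an INTEGER PRIMARY KEY is an O(log n) index seek in SQLite.
    start = con.execute("SELECT MIN(position) FROM sequence").fetchone()[0]
    row = con.execute("SELECT value FROM sequence WHERE position = ?",
                      (start + i,)).fetchone()
    return row[0] if row else None

def length():
    lo, hi = con.execute("SELECT MIN(position), MAX(position) "
                         "FROM sequence").fetchone()
    return 0 if lo is None else hi - lo + 1

for v in ("a", "b", "c"):
    enqueue(v)
print(lookup(1), dequeue(), length())  # b a 2
```

Each operation touches at most two B-tree lookups on the primary key, so all of them stay in O(log n).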
Unfortunately, this limits the maximum number of elements that can ever
be inserted during a table's lifetime to 2^63 - 1. While this might be
acceptable in some cases, it is an artificial limitation.
It seems that one way that is not subject to such limitations is to
abuse tables as lists and use algorithms for purely functional data
structures. There is an algorithm that could be used to implement
random-access sequences in SQLite efficiently [1], but it's neither
elegant nor simple.
Does anyone have another idea for elegantly implementing random-access
sequences in O(log n) in SQLite?
- Matthias-Christian
[1] Robert Hood and Robert Melville. Real-time queue operations in pure
LISP.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


On Tue, Mar 1, 2016 at 12:59 PM, Matthias-Christian Ott <[hidden email]>
wrote:
> Unfortunately, this limits the maximum number of elements that can ever
> be inserted during a table's lifetime to 2^63 - 1. While this might be
> acceptable in some cases, it is an artificial limitation.
>
Artificial, yes, but so is "64 bits." You will likely hit other limitations
far before getting anywhere near 2^63 - 1 insertions:
https://www.sqlite.org/limits.html
e.g. point #13:
*Maximum Number Of Rows In A Table*
The theoretical maximum number of rows in a table is 2^64 (18446744073709551616
or about 1.8e+19). This limit is unreachable since the maximum database
size of 140 terabytes will be reached first. A 140 terabytes database can
hold no more than approximately 1e+13 rows, and then only if there are no
indices and if each row contains very little data.

-- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
"Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
those who insist on a perfect world, freedom will have to do." -- Bigby Wolf


In reply to this post by Matthias-Christian Ott
-------- Original message --------
From: Stephan Beal <[hidden email]>
Date: 01/03/2016 12:07 (GMT+00:00)
To: SQLite mailing list <[hidden email]>
Subject: Re: [sqlite] Random-access sequences
> On Tue, Mar 1, 2016 at 12:59 PM, Matthias-Christian Ott <[hidden email]>
wrote:
>> Unfortunately, this limits the maximum number of elements that can ever
>> be inserted during a table's lifetime to 2^63 - 1. While this might be
>> acceptable in some cases, it is an artificial limitation.
>>
> Artificial, yes, but so is "64 bits." You will likely hit other limitations
> far before getting anywhere near 2^63 - 1 insertions:
> https://www.sqlite.org/limits.html
> e.g. point #13:
> *Maximum Number Of Rows In A Table*
I don't think he's bothered about the maximum number of rows at one time, but that he might run out of new rowids. However, this feels like a needless concern: at 1,000 enqueue/dequeues per second, 2^63 IDs will last 292 million years...
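The arithmetic behind that estimate can be checked in a couple of lines:

```python
# Back-of-the-envelope check: consuming 2^63 rowids at 1,000 per second.
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # about 3.156e7
years = 2**63 / 1_000 / SECONDS_PER_YEAR
print(f"{years / 1e6:.0f} million years")  # 292 million years
```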
Graham.


On 01/03/16 12:07, Stephan Beal wrote:
> On Tue, Mar 1, 2016 at 12:59 PM, Matthias-Christian Ott <[hidden email]>
> wrote:
>
>> Unfortunately, this limits the maximum number of elements that can ever
>> be inserted during a table's lifetime to 2^63 - 1. While this might be
>> acceptable in some cases, it is an artificial limitation.
>>
>
> Artificial, yes, but so is "64 bits." You will likely hit other limitations
> far before getting anywhere near 2^63 - 1 insertions:
>
> https://www.sqlite.org/limits.html
>
> e.g. point #13:
>
> *Maximum Number Of Rows In A Table*
>
> The theoretical maximum number of rows in a table is 2^64 (18446744073709551616
> or about 1.8e+19). This limit is unreachable since the maximum database
> size of 140 terabytes will be reached first. A 140 terabytes database can
> hold no more than approximately 1e+13 rows, and then only if there are no
> indices and if each row contains very little data.
I use SQLite as a buffer for small amounts of data, so I'm nowhere near the
limits of the database. However, 2^63 - 1 for autoincrement columns
refers to the lifetime of the table, not the limits of the database. I'm
aware that pushing 2^63 - 1 elements through the queue is not realistic
even in the largest deployments. But saying that your software breaks
after 2^63 - 1 elements feels somewhat like saying that a name can
only be 20 characters because some mainframe software can't handle more
than that. It feels like you have failed as a software developer
to deliver a proper solution and now pretend that your improper solution
is proper because nobody will hit the limits.


On 3/1/16 7:07 AM, Stephan Beal wrote:
> On Tue, Mar 1, 2016 at 12:59 PM, Matthias-Christian Ott <[hidden email]>
> wrote:
>
>> Unfortunately, this limits the maximum number of elements that can ever
>> be inserted during a table's lifetime to 2^63 - 1. While this might be
>> acceptable in some cases, it is an artificial limitation.
>>
> Artificial, yes, but so is "64 bits." You will likely hit other limitations
> far before getting anywhere near 2^63 - 1 insertions:
>
> https://www.sqlite.org/limits.html
>
> e.g. point #13:
>
> *Maximum Number Of Rows In A Table*
>
> The theoretical maximum number of rows in a table is 2^64 (18446744073709551616
> or about 1.8e+19). This limit is unreachable since the maximum database
> size of 140 terabytes will be reached first. A 140 terabytes database can
> hold no more than approximately 1e+13 rows, and then only if there are no
> indices and if each row contains very little data.
>
You can hit 2^63 insertions well before hitting the size limit of the
database if you have also been doing deletions.
A simple repeated pattern of "add record N, remove record N-1" will
eventually use up the limit on record numbers.
The answer here is to detect when the record number is getting near to
'too big' and then go through and renumber the records back in sequence
from 1, resetting the next record number. This might require changing
data elsewhere if record numbers have been saved anywhere, which means
it might make sense not to wait till overflow is imminent, but to do it
proactively when minimal records (ideally no records) are referenced
elsewhere (and only in known locations).
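One way to sketch such a renumbering in SQLite, assuming a single `value` payload column and no external references to the old record numbers, is to copy the rows in order into a fresh table and swap it in:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sequence (position INTEGER PRIMARY KEY "
            "AUTOINCREMENT, value TEXT)")
# Simulate gaps left behind by a long history of inserts and deletes.
con.executemany("INSERT INTO sequence (position, value) VALUES (?, ?)",
                [(10, "a"), (25, "b"), (99, "c")])

def renumber(con):
    # Copy rows in order into a fresh table so positions restart at 1,
    # then replace the old table with the new one.
    con.executescript("""
        BEGIN;
        CREATE TABLE seq_new (position INTEGER PRIMARY KEY AUTOINCREMENT,
                              value TEXT);
        INSERT INTO seq_new (value) SELECT value FROM sequence
            ORDER BY position;
        DROP TABLE sequence;
        ALTER TABLE seq_new RENAME TO sequence;
        COMMIT;
    """)

renumber(con)
print(con.execute("SELECT position, value FROM sequence").fetchall())
# [(1, 'a'), (2, 'b'), (3, 'c')]
```

The copy itself is O(n), which is exactly why it belongs in a scheduled maintenance window rather than on the hot path.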
2^63 is such a big number that you can likely find a time for standard
preventive maintenance when you can do this with the database offline
before things overflow. At one insertion per nanosecond, you have almost
3 centuries before this needs to be done (if I am doing my math right),
so you can do it when you migrate to a new box.
With 32-bit numbers, these limits were readily reachable; 64 bits was a
quantum leap in range, adding 32 doublings to the limit, which will take
a while to catch up with (though we likely will). The 140 terabyte limit is
something that I could see being hit now in some applications.

Richard Damon


In reply to this post by Matthias-Christian Ott
On 01/03/16 11:59, Matthias-Christian Ott wrote:
> I know that this question is not strictly related to SQLite.
>
> I want to persist a random-access sequence with SQLite. A random-access
> sequence is a queue with random access to its queued elements. Suppose
> that a random-access sequence [x_0, ..., x_n] has the following operations:
>
> enqueue([x_0, ..., x_n], x) = [x_0, ..., x_n, x]
> dequeue([x_0, ..., x_n]) = [x_1, ..., x_n]
> lookup([x_0, ..., x_n], i) = x_i
> update([x_0, ..., x_n], i, y) =
> [x_0, ..., x_(i - 1), y, x_(i + 1), ..., x_n]
> length([x_0, ..., x_n]) = n + 1
Thinking a bit more about it, I also need the following operation:
delete([x_0, ..., x_n], i) = [x_0, ..., x_(i - 1), x_(i + 1), ..., x_n]
So implementing the data structure with autoincrement does not work anymore.
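A short Python sketch (with an assumed single `value` column) shows why: after a deletion the positions are no longer contiguous, so the `start + i` arithmetic addresses the wrong row:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sequence (position INTEGER PRIMARY KEY "
            "AUTOINCREMENT, value TEXT)")
con.executemany("INSERT INTO sequence (value) VALUES (?)",
                [("a",), ("b",), ("c",), ("d",)])

# delete(1): remove "b", the element at logical index 1 (position 2).
con.execute("DELETE FROM sequence WHERE position = 2")

# The sequence is now [a, c, d], so lookup(2) should yield "d".
# But start + i computes position 3, which holds "c":
start = con.execute("SELECT MIN(position) FROM sequence").fetchone()[0]
row = con.execute("SELECT value FROM sequence WHERE position = ?",
                  (start + 2,)).fetchone()
print(row)  # ('c',) -- the wrong element
```

An `ORDER BY position LIMIT 1 OFFSET i` query would return the right element, but OFFSET costs O(i), which defeats the O(log n) goal.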
- Matthias-Christian


In RAM, the simple implementation would be a balanced tree, ordered by index, where every node knows how many elements are below it and has pointers to its children. The tree will have O(log N) depth, following a pointer is O(1), and all of your operations involve a small (constant-bounded) number of nodes at each level.
In a typical indexed database, following a pointer (finding the row for a key) takes O(log N) time (rather than O(1) time), so it seems most of your operations would now cost O((log N)^2) time.
You could hack your database's B-tree implementation so that each B-tree page "knows" the number of records "below" the page. At that point you would essentially be putting the RAM implementation into a file (with an optimistic assumption that Read(file, page#) is a constant-time operation in the file system).
In either of the above, autoincrement works fine as long as the total number enqueued() over the lifetime of the database is not too large, but in addition to the autoincrement, you need to update the "parents" of inserted or removed records.
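The in-RAM scheme described above is essentially an order-statistic tree. A minimal sketch of the size-guided index lookup (unbalanced, with rebalancing and size maintenance on insert/delete omitted):

```python
# Node of a binary search tree augmented with subtree sizes.
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right
        # Number of nodes in this subtree; must be maintained on
        # every insert and delete along the affected path.
        self.size = 1 + size(left) + size(right)

def size(node):
    return node.size if node is not None else 0

def select(node, i):
    """Return the element at in-order index i (0-based) in O(depth) steps."""
    while node is not None:
        left = size(node.left)
        if i < left:
            node = node.left
        elif i == left:
            return node.value
        else:
            i -= left + 1
            node = node.right
    raise IndexError(i)

# A small hand-built balanced tree holding [10, 20, 30, 40, 50].
root = Node(30,
            left=Node(20, left=Node(10)),
            right=Node(40, right=Node(50)))
print([select(root, i) for i in range(5)])  # [10, 20, 30, 40, 50]
```

Each `select` step discards one subtree using only the stored sizes, which is exactly the property a counted B-tree page would give you on disk.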
Regards
-----Original Message-----
From: Matthias-Christian Ott [mailto: [hidden email]]
Sent: Tuesday, March 01, 2016 10:38 AM
To: [hidden email]
Subject: Re: [sqlite] Random-access sequences
On 01/03/16 11:59, Matthias-Christian Ott wrote:
> I know that this question is not strictly related to SQLite.
>
> I want to persist a random-access sequence with SQLite. A
> random-access sequence is a queue with random access to its queued
> elements. Suppose that a random-access sequence [x_0, ..., x_n] has the following operations:
>
> enqueue([x_0, ..., x_n], x) = [x_0, ..., x_n, x]
> dequeue([x_0, ..., x_n]) = [x_1, ..., x_n]
> lookup([x_0, ..., x_n], i) = x_i
> update([x_0, ..., x_n], i, y) =
> [x_0, ..., x_(i - 1), y, x_(i + 1), ..., x_n]
> length([x_0, ..., x_n]) = n + 1
Thinking a bit more about it, I also need the following operation:
delete([x_0, ..., x_n], i) = [x_0, ..., x_(i - 1), x_(i + 1), ..., x_n]
So implementing the data structure with autoincrement does not work anymore.
- Matthias-Christian


On Tue, 1 Mar 2016 08:15:25 -0500
Richard Damon <[hidden email]> wrote:
> > The theoretical maximum number of rows in a table is 2^64
> > (18446744073709551616 or about 1.8e+19). This limit is unreachable
> > since the maximum database size of 140 terabytes will be reached
> > first. A 140 terabytes database can hold no more than approximately
> > 1e+13 rows, and then only if there are no indices and if each row
> > contains very little data.
> >
> You can hit 2^63 insertions well before hitting the size limit of the
> database if you have also been doing deletions.
Yes. If you manage 1,000,000 insertions/second, that's 3.15576 *
10^13/year. You would run out of integers in 292,271 years.
To get around that, add an "epoch" column, also integer.
Initially it is always zero. Whenever "position" exceeds 2^63,
increment "epoch" and reset "position" to zero.
That will give you at least twice as many years.
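A sketch of the epoch idea, assuming a composite (epoch, position) primary key and a single `value` payload column (the rollover cap below is symbolic; SQLite's actual rowid limit is 2^63 - 1):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE sequence (
    epoch    INTEGER NOT NULL,
    position INTEGER NOT NULL,
    value    TEXT,
    PRIMARY KEY (epoch, position)
)""")

CAP = 2**63 - 1  # illustrative rollover threshold

def enqueue(value):
    # Take the largest (epoch, position) pair and step it, rolling the
    # epoch forward when position would pass the cap.
    row = con.execute("SELECT epoch, position FROM sequence "
                      "ORDER BY epoch DESC, position DESC LIMIT 1").fetchone()
    if row is None:
        epoch, pos = 0, 1
    elif row[1] >= CAP:
        epoch, pos = row[0] + 1, 1
    else:
        epoch, pos = row[0], row[1] + 1
    con.execute("INSERT INTO sequence VALUES (?, ?, ?)", (epoch, pos, value))

def dequeue():
    # The composite primary key makes this an index-ordered O(log n) seek.
    row = con.execute("SELECT epoch, position, value FROM sequence "
                      "ORDER BY epoch ASC, position ASC LIMIT 1").fetchone()
    if row is None:
        return None
    con.execute("DELETE FROM sequence WHERE epoch = ? AND position = ?",
                (row[0], row[1]))
    return row[2]

enqueue("a"); enqueue("b")
print(dequeue())  # a
```

Note that the indexed lookup(i) by position arithmetic does not carry over directly once epochs roll over; this sketch only covers enqueue/dequeue ordering.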
jkl


On 2016/03/02 2:26 AM, James K. Lowden wrote:
> On Tue, 1 Mar 2016 08:15:25 -0500
> Richard Damon <[hidden email]> wrote:
>
>>> The theoretical maximum number of rows in a table is 2^64
>>> (18446744073709551616 or about 1.8e+19). This limit is unreachable
>>> since the maximum database size of 140 terabytes will be reached
>>> first. A 140 terabytes database can hold no more than approximately
>>> 1e+13 rows, and then only if there are no indices and if each row
>>> contains very little data.
>>>
>> You can hit 2^63 insertions well before hitting the size limit of the
>> database if you have also been doing deletions.
> Yes. If you manage 1,000,000 insertions/second, that's 3.15576 *
> 10^13/year. You would run out of integers in 292,271 years.
>
> To get around that, add an "epoch" column, also integer.
> Initially it is always zero. Whenever "position" exceeds 2^63,
> increment "epoch" and reset "position" to zero.
>
> That will give you at least twice as many years.
No, that will give you 2^63 times as many years... which is many, many times
more than the age of the Universe, and slightly fewer entries than there
are atoms in the known Universe (about 10^80 at last count). Good idea,
I think it will be enough. :)

