R*Tree performance

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

R*Tree performance

Zoltan Demeter
Hello there,

I would like to get an idea, what (insert) performance I could expect
when building a 3-dimensional (float) R*Tree.

I am using SQLite as a static lib, compiled with the following defines:
"SQLITE_ENABLE_RTREE" "SQLITE_OMIT_DEPRECATED" "SQLITE_THREADSAFE=2"
"SQLITE_MAX_EXPR_DEPTH=0"

The test I performed was on Win7 64-bit.
I am using an in-memory DB. The PC has lots of RAM.

I have 3 tables:
points consists of an implicit rowid and 3 real columns
elemenst consists of 4 integer columns, referencing a points rowid each.

Populating the above tables from an array of data goes with a speed of
approx. 1.2 M elements per second. I am happy with this. I have a
transaction wrapped around a for loop where I bind the parameters and
execute the parametrized query, inserting one row at a time.

Now, I created a table for spatial lookups:
create virtual table boundingboxes using rtree (elemID, minX, maxX,
minY, maxY, minZ, maxZ);

I have a rather poor performance of 50 k inserts per second. The data
to be inserted is precalculated and passed to the loop. I am using the
same logic as above, so the loop is wrapped in a transaction and I use
a parametrized statement.

Is this the performance I can expect, compared to more than a million
inserts into a "simple" table? Could I do something better?

thanks in advance
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: R*Tree performance

Richard Hipp-3
On 10/22/18, Zoltan Demeter <[hidden email]> wrote:
>
> I have a rather poor performance of 50 k inserts per second. The data
> to be inserted is precalculated and passed to the loop. I am using the
> same logic as above, so the loop is wrapped in a transaction and I use
> a parametrized statement.
>

Inserting into an ordinary table with indexes is basically just
appending a single record onto the end of the file.  Inserting into
RTree index is a lot more work, involving updates to multple records
scattered all across the database file.  One expects an RTree index
insert to be somewhat slower.

On the other hand, I do not recall ever spending any time trying to
optimize RTree insert, as most of our clients are more concerned with
query performance.  Have you compared the RTree insert performance in
SQLite against other systems?  Is SQLite a serious laggard here?  Do
we need to work on it?

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: R*Tree performance

Dominique Devienne
On Mon, Oct 22, 2018 at 2:03 PM Richard Hipp <[hidden email]> wrote:

> On 10/22/18, Zoltan Demeter <[hidden email]> wrote:
> > I have a rather poor performance of 50 k inserts per second. The data
> > to be inserted is precalculated and passed to the loop. I am using the
> > same logic as above, so the loop is wrapped in a transaction and I use
> > a parametrized statement.
> [...].  Have you compared the RTree insert performance in SQLite against
> other systems?
> Is SQLite a serious laggard here?  Do  we need to work on it?
>

I wrote the wiki-entry below 10 years ago (March 19, 2008 :)),
after evaluating Oracle Spatial 11g (using OCI and C++).
That was a 55x slowdown, while you report a 24x slowdown with SQLite.
I bet things have improved Oracle-side since then, but I don't know.
Still, that's only a 2x difference between the two in terms of slowdown, so
in the same ballpark I'd say. FWIW. --DD
------------------
Issue#1: Number-based
The SDO_GEOMETRY stores all coordinates with the Number SQL type (OCINumber
struct in OCI),
which is 21 bytes long and base-10 based. When binding floating point
values of type float, Oracle tends to add noise around the 6th significant
digit...

Issue#2: Slow inserts with Spatial Indexes
When an SDO_GEOMETRY column has a Spatial index associated to it (usually
the case, otherwise you can't use Spatial operator on it), this can slow
down inserts quite a bit...
For example, inserting 10,000 2D points using an OCI array-bind insert
takes only 0.2 seconds without a Spatial index, but a whopping 11 second
with one.
It appears that the single array-insert if followed by 10,000 individual
inserts into an internal Spatial table, probably later use to update the
index on commit
(which itself adds only around 1 second when processing rows 1,000 at a
time, the default, or just 0.35 second 10,000 at a time).
No news from the Spatial team since reporting this issue.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: R*Tree performance

Zoltan Demeter
In reply to this post by Zoltan Demeter
Hello there,

thanks for the responses. Indeed, I don't have anything to compare the
performance to.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users