storing blobs in a separate table

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

storing blobs in a separate table

Rael Bauer-3
Hi,

I am aware that in sqlite for a given "entity" one should generally
store blobs in a separate table to the other standard fields (such as
text/number etc..)

So for maximum efficiency, in such a blob table, is there a problem
storing multiple blob fields?

E.g.  data1, data2, picture, etc..

or should the blob table only have 1 blob field? (perhaps with a second
field indicating what is stored in the blob, or store different blobs in
different tables?)

Thanks

Rael

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [SPAM?] storing blobs in a separate table

Richard Damon
On 7/31/19 7:49 AM, Rael Bauer wrote:

> Hi,
>
> I am aware that in sqlite for a given "entity" one should generally
> store blobs in a separate table to the other standard fields (such as
> text/number etc..)
>
> So for maximum efficiency, in such a blob table, is there a problem
> storing multiple blob fields?
>
> E.g.  data1, data2, picture, etc..
>
> or should the blob table only have 1 blob field? (perhaps with a
> second field indicating what is stored in the blob, or store different
> blobs in different tables?)
>
> Thanks
>
> Rael
>
The Reason for making a separate Blob table is to speed up fetching
records. Blobs tend to be big, and thus they tend to reduce the number
of records that fit in a page, making searching for a record slower, and
if you need data after the blob, it needs to read through the blob to
get to it

If you have a table with two (or more) blobs, and for some accesses you
want one of them, and for other accesses you want the other, then
putting them into different tables means you don't need to read the
first blob when all you want is the second. If you most of the time want
both of them, then putting them in the same table means once it has
found the first it also has the second. Thus same or different tables
depends on how you will be using them.

Also, this suggestion is for blobs that are somewhat large, whose data
noticeably increases the size of the record, For very small blobs, it
may be better to keep them in the main table.

It is always good to know the WHY behind these rules of thumbs, so you
know how to apply them.


--
Richard Damon

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] storing blobs in a separate table

Hick Gunter
In reply to this post by Rael Bauer-3
SQLite stores rows in a compressed format that requires decoding. To access the nth field, all the fields that come before it need to be decoded. If there is a large blob stoed in a blob field, any field after that will suffer a performance penalty (unless, of course, both fields are required). "Any field" includes another blob field in the previous sentence.

So if you always want to "SELECT data1, data2, picture FROM blob_table WHERE ..." then it is ok, but if you want to "SELECT picture FROM blob_table WHERE ..." then that will be nearly just as slow as the first statement.

Consider adding a blob_type field if you need to store more than one blob per associated record.

CREATE TABLE blob_data (rec_id INTEGER, blob_type INTEGER, blob_data BLOB, PRIMARY KEY (rec_id,blob_type) ) WITHOUT ROWID;

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:[hidden email]] Im Auftrag von Rael Bauer
Gesendet: Mittwoch, 31. Juli 2019 13:49
An: [hidden email]
Betreff: [EXTERNAL] [sqlite] storing blobs in a separate table

Hi,

I am aware that in sqlite for a given "entity" one should generally store blobs in a separate table to the other standard fields (such as text/number etc..)

So for maximum efficiency, in such a blob table, is there a problem storing multiple blob fields?

E.g.  data1, data2, picture, etc..

or should the blob table only have 1 blob field? (perhaps with a second field indicating what is stored in the blob, or store different blobs in different tables?)

Thanks

Rael

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick | Software Engineer | Scientific Games International GmbH | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] storing blobs in a separate table

Jens Alfke-2

> On Jul 31, 2019, at 5:02 AM, Hick Gunter <[hidden email]> wrote:
>
> SQLite stores rows in a compressed format that requires decoding. To access the nth field, all the fields that come before it need to be decoded.

My understanding is that it’s just a matter of a byte-count before each field. So getting to the n’th field just requires n-1 memory reads and pointer additions, a handful of machine instructions. If so, that won’t measurably affect performance.

The increased size of the record does make a big difference, though, for the reasons discussed earlier.

—Jens


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] storing blobs in a separate table

Dominique Devienne
On Thu, Aug 1, 2019 at 5:44 PM Jens Alfke <[hidden email]> wrote:

> > On Jul 31, 2019, at 5:02 AM, Hick Gunter <[hidden email]> wrote:
> > SQLite stores rows in a compressed format that requires decoding. To
> access the nth field, all the fields that come before it need to be decoded.
>
> My understanding is that it’s just a matter of a byte-count before each
> field. So getting to the n’th field just requires n-1 memory reads and
> pointer additions, a handful of machine instructions. If so, that won’t
> measurably affect performance.
>

[DD] I think this assumes the row fits inside a page. Which with (inline)
blobs, may not be the case at all.
[DD] And Pages form a "linked-list", so reading past the blob may mean
reading ("paging") all those blob pages to follow that chain of page-ids,
to "get-past" a large blob.
[DD] There's a special mode where SQLite keeps extra pages to keep track of
pages, and thus can potentially avoid that "page-chain", but it's not often
used I believe.
[DD] But I'm no expert here :). --DD
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] storing blobs in a separate table

David Raymond
https://www.sqlite.org/fileformat2.html
is the page with the nitty gritty for the file format.

Overflow pages are stored in a singly linked list of pages, so you have to traverse through all of the pages to get to the end. So while you may know right away that you want the 20th overflow page for example, you still have to load all 19 pages in between to find out what page that is.


[DD] There's a special mode where SQLite keeps extra pages to keep track of pages, and thus can potentially avoid that "page-chain", but it's not often used I believe.

I think you're referring to the Pointer Map pages which are used for incremental vacuum. If used, then for every single page in the database it stores the "parent" page number. Ie backwards up the btree or backwards in the overflow page list. It's basically there so that if you want to move the contents of page X to somewhere else in the file it gives you the page number Y which has the pointer to it that will need to be changed to the new page number. But it doesn't let you skip ahead in the overflow page chain.

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] storing blobs in a separate table

Richard Hipp-3
On 8/1/19, David Raymond <[hidden email]> wrote:
> [autovacuum] doesn't let you skip ahead in the overflow page chain.

Actually it does.  Sort of.  Sometimes.  There is an optimization that
lets the btree layer "guess" the next page in an overflow chain and
verify its guess using the pointer-map information.  If it guesses
correctly, it can avoid reading a page from disk.  But it has to
fallback to reading the page if it guesses incorrectly.

See the 22 lines of code here:
https://sqlite.org/src/artifact/58d002f71?ln=4626-4648

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users