Blob and Text columns: what's the difference?

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Blob and Text columns: what's the difference?

Tim Streater-3
What is actually the difference between a column declared as TEXT and one declared as BLOB in an SQLite database? What does SQLite do to textual data that I ask it to put into a TEXT column? How does it know not to do that if I want to send some binary data to a Text column?

The reason I'm interested is that I have a table with a TEXT column. For backwards compatibility reasons, I'd like not to change the column's type or even add another column with BLOB type. Up to now I've assumed that the data to go in the column was all ASCII or perhaps UTF-8. Now it seems that it's legal for it to be a mixture of encodings. So I want to treat it as binary and be able to just use the existing TEXT column. Anything I need to look out for?


--
Cheers  --  Tim
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Blob and Text columns: what's the difference?

Ben Asher
I seem to remember that BLOBs cannot be indexed. I can’t find documentation
on that though. Does anyone else recall the same thing and have a link, or
maybe someone can correct me?

Ben

On Tue, Aug 28, 2018 at 6:50 AM Tim Streater <[hidden email]> wrote:

> What is actually the difference between a column declared as TEXT and one
> declared as BLOB in an SQLite database? What does SQLite do to textual data
> that I ask it to put into a TEXT column? How does it know not to do that if
> I want to send some binary data to a Text column?
>
> The reason I'm interested is that I have a table with a TEXT column. For
> backwards compatibility reasons, I'd like not to change the column's type
> or even add another column with BLOB type. Up to now I've assumed that the
> data to go in the column was all ASCII or perhaps UTF-8. Now it seems that
> it's legal for it to be a mixture of encodings. So I want to treat it as
> binary and be able to just use the existing TEXT column. Anything I need to
> look out for?
>
>
> --
> Cheers  --  Tim
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
--
Ben
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Blob and Text columns: what's the difference?

J. King-3
In reply to this post by Tim Streater-3
On 2018-08-28 09:50:01, "Tim Streater" <[hidden email]> wrote:

>What is actually the difference between a column declared as TEXT and
>one declared as BLOB in an SQLite database? What does SQLite do to
>textual data that I ask it to put into a TEXT column? How does it know
>not to do that if I want to send some binary data to a Text column?
>
>
I don't know all the ins and outs, but text columnns are convertible
between UTF-8 and UTF-16 upon request, whereas blobs are just byte
strings.

--
J. King

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Blob and Text columns: what's the difference?

Simon Slavin-3
In reply to this post by Tim Streater-3
On 28 Aug 2018, at 2:50pm, Tim Streater <[hidden email]> wrote:

> What is actually the difference between a column declared as TEXT and one declared as BLOB in an SQLite database? What does SQLite do to textual data that I ask it to put into a TEXT column?

BLOB data is always handled as a block of a certain number of octets, as you'd think.

Internally, some handling of TEXT data assumes that it's UTF-8, or perhaps some other encoding:

<https://sqlite.org/pragma.html#pragma_encoding>

For instance, without testing it yourself or finding a reference, you're not entirely sure whether length() of a string proceeds past a NUL, are you ?  SQlite is written to do the most likely most convenient thing, but it is not completely predictable to a programmer who frequently plays with different encodings.

Also, if it's expecting a number, SQLite will sometimes convert a string to a number.  However, SQLite has no understanding of BLOB data and will never convert it to anything.  See section See section 4.2 of <https://www.sqlite.org/datatype3.html>.

There are also some niggles about meta-operations with BLOBs.  For instance, the "CREATE TABLE ... AS SELECT" construction handles BLOB values a little weirdly.

However, the biggest distinction between these two is not in the SQLite C API but in the countless libraries and shims written to allow languages to make SQLite calls.  String handling in those libraries can do all sorts of things to strings, including terminating them at NUL (as C would expect), parsing them as Unicode before returning a value, or internally labelling them as using a certain Windows code page.  So most of our questions resulting from unexpected string changes result not from the SQLite API but from what happens when a programming language tries to understand the string.  None of this should be done to a BLOB except under explicit programmer control.

> How does it know not to do that if I want to send some binary data to a Text column?

See section 3.4 of <https://www.sqlite.org/datatype3.html>.  Here you'll find demonstrated what happens if you put values of different types into columns of different types.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Blob and Text columns: what's the difference?

Richard Hipp-3
In reply to this post by Ben Asher
On 8/28/18, Ben Asher <[hidden email]> wrote:
> I seem to remember that BLOBs cannot be indexed. I can’t find documentation
> on that though. Does anyone else recall the same thing and have a link, or
> maybe someone can correct me?

You might be remembering the limitations of Oracle.  Other database
engines might also have this limitation too.  But not SQLite.  SQLite
allows any column to be indexed, including BLOB colums.
--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Blob and Text columns: what's the difference?

Ben Asher
Ah great to know. Thanks!

Ben

On Tue, Aug 28, 2018 at 7:29 AM Richard Hipp <[hidden email]> wrote:

> On 8/28/18, Ben Asher <[hidden email]> wrote:
> > I seem to remember that BLOBs cannot be indexed. I can’t find
> documentation
> > on that though. Does anyone else recall the same thing and have a link,
> or
> > maybe someone can correct me?
>
> You might be remembering the limitations of Oracle.  Other database
> engines might also have this limitation too.  But not SQLite.  SQLite
> allows any column to be indexed, including BLOB colums.
> --
> D. Richard Hipp
> [hidden email]
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
--
Ben
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Blob and Text columns: what's the difference?

Keith Medcalf
In reply to this post by Tim Streater-3

On Tuesday, 28 August, 2018 07:50, Tim Streater <[hidden email]> wrote:

>What is actually the difference between a column declared as TEXT and
>one declared as BLOB in an SQLite database?

Not a thing.  You are free to store data of any type in any column in any row.  The "TEXT" declaration only means that the column affinity is TEXT and if you attempt to put something that is not text in that column (such as an integer or a real) then it will be converted to text before being stored, since you have expressed that you would prefer the data in this column to be text wherever possible.

>What does SQLite do to textual data that I ask it to put into a TEXT column?

Nothing that you do not ask it to do.

>How does it know not to do that if I want to send some binary data to a Text
>column?

Simply because you do not request that those things be done.

So, the "things" that may occur depend on the APIs you use to read/write the data, and the "translations" that you might have asked SQLite3 to do.  For example, if your database is created with the 'encoding' set to one of the UTF-16 variants, and you use the sqlite3_bind_text() to put "text" in that column that is not valid UTF-8 that can be translated to UTF-16 then explosions will occur.  Similarly if the database encoding is UTF-8 (the default) and you use sqlite3_bind_text16() to try and store UTF16 data in the database, and that data is not valid UTF-16, then explosions will occur.  

If however, you database encoding is UTF-8 (the default) and you use sqlite_bind_text() to put data in the database, then SQLite3 will not "tamper" with your data and you may put whatever you wish in there whether it is valid UTF-8 or not.  Similarly if the database encoding is UTF-16 and you use sqlite3_bind_text16() you can put whatever you wish in there whether it is valid UTF-16 or not.

If you use the sqlite3_bind_blob() interface to put data in the database, it will be stored as a blob and the "affinity" (text) expressed in the create table statement will be ignored since the "affinity" specified in the table declaration will never attempt to translate a "blob" into something else.

Assuming you managed to get data into the database, then how you ask to retrieve it may also cause explosions to occur.  For example, you might have stored your binary data (that is not valid UTF-8) using sqlite3_bind_text() in the database where the database encoding is UTF-8.  If you try and retrieve it using sqlite3_column_text16() then SQLite3 will, because *YOU* requested it, attempt to translate the data, and explosions will occur.  Mutatis mutandis if *you* request other translations that are invalid (ie, from a UTF-16 encoded database where the data is not valid UTF-16, and you try to retrieve it translated to UTF-8).

However, if you attempt to retrieve the data using sqlite3_column_blob() you will always get back exactly what is in the database without translation.  If you store a blob and ask for some text, explosions may occur.

So in short, just make sure that you store your "blob" of data using the appropriate storage type (sqlite3_bind_blob) when you store it, and always check what storage type it is before retrieving it and use the corresponding sqlite3_column_(type) to get the data out, and you will be fine.  SQLite3 will do *NOTHING* to your data that you have not explicitly asked it to do.

>The reason I'm interested is that I have a table with a TEXT column.
>For backwards compatibility reasons, I'd like not to change the
>column's type or even add another column with BLOB type. Up to now
>I've assumed that the data to go in the column was all ASCII or
>perhaps UTF-8. Now it seems that it's legal for it to be a mixture of
>encodings. So I want to treat it as binary and be able to just use
>the existing TEXT column. Anything I need to look out for?

Unless you explicitly request translations to occur then none will occur.

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.




_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Blob and Text columns: what's the difference?

Tim Streater-3
On 28 Aug 2018, at 15:36, Keith Medcalf <[hidden email]> wrote:

> On Tuesday, 28 August, 2018 07:50, Tim Streater <[hidden email]> wrote:

>>How does it know not to do that if I want to send some binary data to a Text
>>column?
>
> Simply because you do not request that those things be done.
>
> So, the "things" that may occur depend on the APIs you use to read/write the
> data, and the "translations" that you might have asked SQLite3 to do.  For
> example, if your database is created with the 'encoding' set to one of the
> UTF-16 variants, and you use the sqlite3_bind_text() to put "text" in that
> column that is not valid UTF-8 that can be translated to UTF-16 then
> explosions will occur.  Similarly if the database encoding is UTF-8 (the
> default) and you use sqlite3_bind_text16() to try and store UTF16 data in the
> database, and that data is not valid UTF-16, then explosions will occur.  
>
> If however, you database encoding is UTF-8 (the default) and you use
> sqlite_bind_text() to put data in the database, then SQLite3 will not "tamper"
> with your data and you may put whatever you wish in there whether it is valid
> UTF-8 or not.  Similarly if the database encoding is UTF-16 and you use
> sqlite3_bind_text16() you can put whatever you wish in there whether it is
> valid UTF-16 or not.
>
> If you use the sqlite3_bind_blob() interface to put data in the database, it
> will be stored as a blob and the "affinity" (text) expressed in the create
> table statement will be ignored since the "affinity" specified in the table
> declaration will never attempt to translate a "blob" into something else.
>
> Assuming you managed to get data into the database, then how you ask to
> retrieve it may also cause explosions to occur.  For example, you might have
> stored your binary data (that is not valid UTF-8) using sqlite3_bind_text() in
> the database where the database encoding is UTF-8.  If you try and retrieve it
> using sqlite3_column_text16() then SQLite3 will, because *YOU* requested it,
> attempt to translate the data, and explosions will occur.  Mutatis mutandis if
> *you* request other translations that are invalid (ie, from a UTF-16 encoded
> database where the data is not valid UTF-16, and you try to retrieve it
> translated to UTF-8).
>
> However, if you attempt to retrieve the data using sqlite3_column_blob() you
> will always get back exactly what is in the database without translation.  If
> you store a blob and ask for some text, explosions may occur.
>
> So in short, just make sure that you store your "blob" of data using the
> appropriate storage type (sqlite3_bind_blob) when you store it, and always
> check what storage type it is before retrieving it and use the corresponding
> sqlite3_column_(type) to get the data out, and you will be fine.  SQLite3 will
> do *NOTHING* to your data that you have not explicitly asked it to do.

Thanks. I think the above is the clue I needed. I'm not familiar with the C interface; I'm using Xojo for this application. They have an SQLiteBlob class with its own read/write methods. It should therefore suffice for me to ask on their forum what they do.


--
Cheers  --  Tim
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users