built-in functrion suggestion: size of blob

classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

built-in functrion suggestion: size of blob

Lloyd Dupont
I look into the build in function of SQLite and saw there is a function to know the length of a string (in a record).
Great!

But to my disbelief there is (apparently) no way to get the size of a blob (other than loading it :-()
And no, length() doesn't work on Blob.

I think it would be a worthy addition!

Or is there already an (undocumented) such addition?
Reply | Threaded
Open this post in threaded view
|

Re: built-in functrion suggestion: size of blob

Rob Lohman
Isn't this what you are looking for?

http://www.sqlite.org/capi3ref.html#sqlite3_column_bytes

" If the result is a BLOB then the sqlite3_column_bytes() routine returns
the number of bytes in that BLOB. "

Or do you really need it inside an SQL statement?

Rob

----- Original Message -----
From: "Lloyd Dupont" <[hidden email]>
To: <[hidden email]>
Sent: Sunday, October 23, 2005 3:15 PM
Subject: [sqlite] built-in functrion suggestion: size of blob


I look into the build in function of SQLite and saw there is a function to
know the length of a string (in a record).
Great!

But to my disbelief there is (apparently) no way to get the size of a blob
(other than loading it :-()
And no, length() doesn't work on Blob.

I think it would be a worthy addition!

Or is there already an (undocumented) such addition?

Reply | Threaded
Open this post in threaded view
|

Re: built-in functrion suggestion: size of blob

Lloyd Dupont
> Isn't this what you are looking for?
>
> http://www.sqlite.org/capi3ref.html#sqlite3_column_bytes
>
> " If the result is a BLOB then the sqlite3_column_bytes() routine returns
> the number of bytes in that BLOB. "
>
> Or do you really need it inside an SQL statement?
that's right!
I'm not using SQLite C API.
I'm using a .NET wrapper.
I don't see how I could could call this function in a pratical way from the
wrapper...

Reply | Threaded
Open this post in threaded view
|

Re: built-in functrion suggestion: size of blob

Rob Lohman
I assume the wrapper has wrapped this particular function. I'm
currently writing my own wrapper and it has wrapped it as well.

This is my definition (in case your wrapper doesn't have it):
/// <summary>
/// Returns the lengh of data in a single column of the current result row
of a query
/// </summary>
/// <param name="statementhandle">Statement handle</param>
/// <param name="column">Zero based column index. The left-most column has
an index of 0</param>
/// <returns>Column data length. If the SQL statement is not currently
pointing to a valid row, or if the the column index is out of range, the
result is undefined</returns>
[DllImport("sqlite3.dll", CallingConvention=CallingConvention.Cdecl)]
internal static extern Int32 sqlite3_column_bytes(IntPtr statementhandle,
Int32 column);

Rob

----- Original Message -----
From: "Lloyd Dupont" <[hidden email]>
To: <[hidden email]>
Sent: Sunday, October 23, 2005 3:46 PM
Subject: Re: [sqlite] built-in functrion suggestion: size of blob


>> Isn't this what you are looking for?
>>
>> http://www.sqlite.org/capi3ref.html#sqlite3_column_bytes
>>
>> " If the result is a BLOB then the sqlite3_column_bytes() routine returns
>> the number of bytes in that BLOB. "
>>
>> Or do you really need it inside an SQL statement?
> that's right!
> I'm not using SQLite C API.
> I'm using a .NET wrapper.
> I don't see how I could could call this function in a pratical way from
> the wrapper...
>

Reply | Threaded
Open this post in threaded view
|

Re: built-in functrion suggestion: size of blob

Kervin L. Pierre
In reply to this post by Lloyd Dupont
Lloyd Dupont wrote:
> But to my disbelief there is (apparently) no way to get the size of a blob (other than loading it :-()

I'd like to be corrected if I am wrong, but I
don't think there is anyway to do this in
SQLite, besides simply storing the size of the
blob with the blob when you write it into the
database.  SQLite stores BLOBS 'in-row' so it
has to read the entire BLOB into memory before
it figures out the size.  I believe even the
'column_bytes' function 'suffers' from this.
Tried to find out the feasibility of 'out-of-row'
BLOB in SQLite once, but I don't think there was
much interest in that.

Regards,
Kervin


Reply | Threaded
Open this post in threaded view
|

Re: built-in functrion suggestion: size of blob

Dennis Cote
Kervin L. Pierre wrote:

> Lloyd Dupont wrote:
>
>> But to my disbelief there is (apparently) no way to get the size of a
>> blob (other than loading it :-()
>
>
> I'd like to be corrected if I am wrong, but I
> don't think there is anyway to do this in
> SQLite, besides simply storing the size of the
> blob with the blob when you write it into the
> database.  SQLite stores BLOBS 'in-row' so it
> has to read the entire BLOB into memory before
> it figures out the size.  I believe even the
> 'column_bytes' function 'suffers' from this.
> Tried to find out the feasibility of 'out-of-row'
> BLOB in SQLite once, but I don't think there was
> much interest in that.
>
> Regards,
> Kervin
>
>
>
Kervin,

You can do the out-of-row blob storage yourself by simply storing the
blob data in a separate table and joining it to one that stores info
about the blobs, such as their size. You can then get the size first
without reading the blob into memory, then use a join to get the blob data.

create table blob_info (id integer primary key, size integer);
create table blob_data (id integer primary key references blob_info(id),
data blob);

HTH
Dennis Cote
Reply | Threaded
Open this post in threaded view
|

Re: built-in functrion suggestion: size of blob

developir@yahoo.com
In reply to this post by Lloyd Dupont
The built-in Sqlite length() function works just fine on blobs:

  sqlite> select length(X'0000000000');
  5

Why do you think it doesn't?

--- Lloyd Dupont <[hidden email]> wrote:

> I look into the build in function of SQLite and saw there is a function to know the length of a
> string (in a record).
> Great!
>
> But to my disbelief there is (apparently) no way to get the size of a blob (other than loading
> it :-()
> And no, length() doesn't work on Blob.
>
> I think it would be a worthy addition!
>
> Or is there already an (undocumented) such addition?



       
               
__________________________________
Yahoo! Mail - PC Magazine Editors' Choice 2005
http://mail.yahoo.com
Reply | Threaded
Open this post in threaded view
|

Re: built-in functrion suggestion: size of blob

Nuno Lucas-2
On 10/25/05, Joe Wilson <[hidden email]> wrote:
> The built-in Sqlite length() function works just fine on blobs:
>
>   sqlite> select length(X'0000000000');
>   5
>
> Why do you think it doesn't?

I remember a few months ago noticing it would count UTF-8 chars, not
bytes, so it would not return the right length for blobs in all cases
(a '\0' is a valid UTF-8 char, but try with the '(c)' [copyright] sign,
which is 2 bytes).

I don't have the code in front of me, and can be just my memory, but I
don't see that behaviour changing or would break a lot of SQL during
normal text manipulation.

Regards,
~Nuno Lucas
Reply | Threaded
Open this post in threaded view
|

Re: built-in functrion suggestion: size of blob

developir@yahoo.com


--- Nuno Lucas <[hidden email]> wrote:

> On 10/25/05, Joe Wilson <[hidden email]> wrote:
> > The built-in Sqlite length() function works just fine on blobs:
> >
> >   sqlite> select length(X'0000000000');
> >   5
> >
> > Why do you think it doesn't?
>
> I remember a few months ago noticing it would count UTF-8 chars, not
> bytes, so it would not return the right length for blobs in all cases
> (a '\0' is a valid UTF-8 char, but try with the '(c)' [copyright] sign,
> which is 2 bytes).
>
> I don't have the code in front of me, and can be just my memory, but I
> don't see that behaviour changing or would break a lot of SQL during
> normal text manipulation.
>
> Regards,
> ~Nuno Lucas

You are mistaken. Text and blobs are different.
TEXT is of type "SQLITE_TEXT". BLOB is of type "SQLITE_BLOB".
length() has always correctly returned the size of a blob - look at the code:

/*
** Implementation of the length() function
*/
static void lengthFunc(
  sqlite3_context *context,
  int argc,
  sqlite3_value **argv
){
  int len;
 
  assert( argc==1 );
  switch( sqlite3_value_type(argv[0]) ){
    case SQLITE_BLOB:
    case SQLITE_INTEGER:
    case SQLITE_FLOAT: {
      sqlite3_result_int(context, sqlite3_value_bytes(argv[0]));
      break;
    }
    case SQLITE_TEXT: {
      const char *z = sqlite3_value_text(argv[0]);
      for(len=0; *z; z++){ if( (0xc0&*z)!=0x80 ) len++; }
      sqlite3_result_int(context, len);
      break;
    }
    default: {
      sqlite3_result_null(context);
      break;
    }
  }
}



               
__________________________________
Start your day with Yahoo! - Make it your home page!
http://www.yahoo.com/r/hs
Reply | Threaded
Open this post in threaded view
|

Re: built-in functrion suggestion: size of blob

Nuno Lucas-2
On 10/28/05, Joe Wilson <[hidden email]> wrote:

> --- Nuno Lucas <[hidden email]> wrote:
> > On 10/25/05, Joe Wilson <[hidden email]> wrote:
> > > The built-in Sqlite length() function works just fine on blobs:
> > >
> > >   sqlite> select length(X'0000000000');
> > >   5
> > >
> > > Why do you think it doesn't?
> >
> > I remember a few months ago noticing it would count UTF-8 chars, not
> > bytes, so it would not return the right length for blobs in all cases
> > (a '\0' is a valid UTF-8 char, but try with the '(c)' [copyright] sign,
> > which is 2 bytes).
> >
> > I don't have the code in front of me, and can be just my memory, but I
> > don't see that behaviour changing or would break a lot of SQL during
> > normal text manipulation.
> >
> > Regards,
> > ~Nuno Lucas
>
> You are mistaken. Text and blobs are different.
> TEXT is of type "SQLITE_TEXT". BLOB is of type "SQLITE_BLOB".
> length() has always correctly returned the size of a blob - look at the code:
>

lucas@ubu:~/src/sqlite$ sqlite3 test.db3
SQLite version 3.2.1
Enter ".help" for instructions
sqlite> create table x ( a TEXT, b BLOB );
sqlite> insert into x values ( 'Não', 'Não' );
sqlite> select * from x;
Não|Não
sqlite> select length(a), length(b) from x;
3|3
sqlite> .q
lucas@ubu:~/src/sqlite$ hexdump -c test.db3
0000000   S   Q   L   i   t   e       f   o   r   m   a   t       3  \0
0000010 004  \0 001 001  \0   @          \0  \0  \0 002  \0  \0  \0  \0
0000020  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0 001  \0  \0  \0 001
0000030  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0 001  \0  \0  \0  \0
0000040  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0000060  \0  \0  \0  \0  \r  \0  \0  \0 001 003 317  \0 003 317  \0  \0
0000070  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
00003c0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0   /
00003d0 001 006 027 017 017 001   O   t   a   b   l   e   x   x 002   C
00003e0   R   E   A   T   E       T   A   B   L   E       x       (
00003f0   a       T   E   X   T   ,       b       B   L   O   B       )
0000400  \r  \0  \0  \0 001 003 363  \0 003 363  \0  \0  \0  \0  \0  \0
0000410  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
00007f0  \0  \0  \0  \v 001 003 025 025   N 303 243   o   N 303 243   o
0000800

As you see from the hexdump, "Não" is encoded as UTF-8.

It's true it works if you insert a literal BLOB, like with
x'00112233', but it only means sqlite lacks a byte counting function
for the general case.

I don't complain about it, as I only use the SQLite C API, but it
means there is no coherency if different programs are used to
manipulate the database (that could insert the data in different
ways).

Also note that I've seen a lot of sqlite wrappers/managers in the past
that don't care the encoding used (still many people think they can
live with only the first 127 ASCII chars), inserting Latin-1 or others
literally, making the length function returning the wrong value when
accented chars or other symbols - like the (c) symbol - are found. It
would be nice if we could make a quick sanity check on the DB by
comparing the byte count against the string length.

For me, it's just a low priority feature request, but I believe there
are others who may think otherwise.


Regards,
~Nuno Lucas


> /*
> ** Implementation of the length() function
> */
> static void lengthFunc(
>   sqlite3_context *context,
>   int argc,
>   sqlite3_value **argv
> ){
>   int len;
>
>   assert( argc==1 );
>   switch( sqlite3_value_type(argv[0]) ){
>     case SQLITE_BLOB:
>     case SQLITE_INTEGER:
>     case SQLITE_FLOAT: {
>       sqlite3_result_int(context, sqlite3_value_bytes(argv[0]));
>       break;
>     }
>     case SQLITE_TEXT: {
>       const char *z = sqlite3_value_text(argv[0]);
>       for(len=0; *z; z++){ if( (0xc0&*z)!=0x80 ) len++; }
>       sqlite3_result_int(context, len);
>       break;
>     }
>     default: {
>       sqlite3_result_null(context);
>       break;
>     }
>   }
> }
Reply | Threaded
Open this post in threaded view
|

Re: built-in functrion suggestion: size of blob

developir@yahoo.com
Length() is perfectly consistant with the value's type. As you've
demonstrated, Sqlite does not respect the column types in tables
as declared in the CREATE statement.
Length(text_value) always returns the number of characters and
length(blob_value) always returns the number of bytes. The type
of the value depends solely on its type at the time of INSERT
or UPDATE. I believe the Sqlite author calls this feature
"manifest typing".

When coding in C you can easily get the number of bytes in a TEXT
or BLOB value with appropriate manipulations of sqlite3_value_type(),
sqlite3_result_int() and sqlite3_result_text(). So no problem there.

I do not think that Sqlite SQL writers (i.e., non-C progammers) care
to know the number of bytes their objects take as long as their
assignments and comparison operations work within the framework of
the SQL language. 'Bytes' is more of a low-level C concept. But if
for some reason you absolutely must need to know the number of bytes
that a text or blob value takes from within the Sqlite SQL language,
just refer to the documentation of typeof(), CAST, CASE and quote()
and you can come up with a pure Sqlite SQL solution if you really
wanted to.


--- Nuno Lucas <[hidden email]> wrote:

> On 10/28/05, Joe Wilson <[hidden email]> wrote:
> > --- Nuno Lucas <[hidden email]> wrote:
> > > On 10/25/05, Joe Wilson <[hidden email]> wrote:
> > > > The built-in Sqlite length() function works just fine on blobs:
> > > >
> > > >   sqlite> select length(X'0000000000');
> > > >   5
> > > >
> > > > Why do you think it doesn't?
> > >
> > > I remember a few months ago noticing it would count UTF-8 chars, not
> > > bytes, so it would not return the right length for blobs in all cases
> > > (a '\0' is a valid UTF-8 char, but try with the '(c)' [copyright] sign,
> > > which is 2 bytes).
> > >
> > > I don't have the code in front of me, and can be just my memory, but I
> > > don't see that behaviour changing or would break a lot of SQL during
> > > normal text manipulation.
> > >
> > > Regards,
> > > ~Nuno Lucas
> >
> > You are mistaken. Text and blobs are different.
> > TEXT is of type "SQLITE_TEXT". BLOB is of type "SQLITE_BLOB".
> > length() has always correctly returned the size of a blob - look at the code:
> >
>
> lucas@ubu:~/src/sqlite$ sqlite3 test.db3
> SQLite version 3.2.1
> Enter ".help" for instructions
> sqlite> create table x ( a TEXT, b BLOB );
> sqlite> insert into x values ( 'N?o', 'N?o' );
> sqlite> select * from x;
> N?o|N?o
> sqlite> select length(a), length(b) from x;
> 3|3
> sqlite> .q
> lucas@ubu:~/src/sqlite$ hexdump -c test.db3
> 0000000   S   Q   L   i   t   e       f   o   r   m   a   t       3  \0
> 0000010 004  \0 001 001  \0   @          \0  \0  \0 002  \0  \0  \0  \0
> 0000020  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0 001  \0  \0  \0 001
> 0000030  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0 001  \0  \0  \0  \0
> 0000040  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
> *
> 0000060  \0  \0  \0  \0  \r  \0  \0  \0 001 003 317  \0 003 317  \0  \0
> 0000070  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
> *
> 00003c0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0   /
> 00003d0 001 006 027 017 017 001   O   t   a   b   l   e   x   x 002   C
> 00003e0   R   E   A   T   E       T   A   B   L   E       x       (
> 00003f0   a       T   E   X   T   ,       b       B   L   O   B       )
> 0000400  \r  \0  \0  \0 001 003 363  \0 003 363  \0  \0  \0  \0  \0  \0
> 0000410  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
> *
> 00007f0  \0  \0  \0  \v 001 003 025 025   N 303 243   o   N 303 243   o
> 0000800
>
> As you see from the hexdump, "N?o" is encoded as UTF-8.
>
> It's true it works if you insert a literal BLOB, like with
> x'00112233', but it only means sqlite lacks a byte counting function
> for the general case.
>
> I don't complain about it, as I only use the SQLite C API, but it
> means there is no coherency if different programs are used to
> manipulate the database (that could insert the data in different
> ways).
>
> Also note that I've seen a lot of sqlite wrappers/managers in the past
> that don't care the encoding used (still many people think they can
> live with only the first 127 ASCII chars), inserting Latin-1 or others
> literally, making the length function returning the wrong value when
> accented chars or other symbols - like the (c) symbol - are found. It
> would be nice if we could make a quick sanity check on the DB by
> comparing the byte count against the string length.
>
> For me, it's just a low priority feature request, but I believe there
> are others who may think otherwise.
>
>
> Regards,
> ~Nuno Lucas
>
>
> > /*
> > ** Implementation of the length() function
> > */
> > static void lengthFunc(
> >   sqlite3_context *context,
> >   int argc,
> >   sqlite3_value **argv
> > ){
> >   int len;
> >
> >   assert( argc==1 );
> >   switch( sqlite3_value_type(argv[0]) ){
> >     case SQLITE_BLOB:
> >     case SQLITE_INTEGER:
> >     case SQLITE_FLOAT: {
> >       sqlite3_result_int(context, sqlite3_value_bytes(argv[0]));
> >       break;
> >     }
> >     case SQLITE_TEXT: {
> >       const char *z = sqlite3_value_text(argv[0]);
> >       for(len=0; *z; z++){ if( (0xc0&*z)!=0x80 ) len++; }
> >       sqlite3_result_int(context, len);
> >       break;
> >     }
> >     default: {
> >       sqlite3_result_null(context);
> >       break;
> >     }
> >   }
> > }
>



       
               
__________________________________
Yahoo! Mail - PC Magazine Editors' Choice 2005
http://mail.yahoo.com
Reply | Threaded
Open this post in threaded view
|

Re: built-in functrion suggestion: size of blob

developir@yahoo.com
Oops - the second paragraph should have read:
"with appropriate manipulations of sqlite3_value_type(),
sqlite3_value_bytes() and sqlite3_value_text()."

--- Joe Wilson <[hidden email]> wrote:

> Length() is perfectly consistant with the value's type. As you've
> demonstrated, Sqlite does not respect the column types in tables
> as declared in the CREATE statement.
> Length(text_value) always returns the number of characters and
> length(blob_value) always returns the number of bytes. The type
> of the value depends solely on its type at the time of INSERT
> or UPDATE. I believe the Sqlite author calls this feature
> "manifest typing".
>
> When coding in C you can easily get the number of bytes in a TEXT
> or BLOB value with appropriate manipulations of sqlite3_value_type(),
> sqlite3_result_int() and sqlite3_result_text(). So no problem there.
>
> I do not think that Sqlite SQL writers (i.e., non-C progammers) care
> to know the number of bytes their objects take as long as their
> assignments and comparison operations work within the framework of
> the SQL language. 'Bytes' is more of a low-level C concept. But if
> for some reason you absolutely must need to know the number of bytes
> that a text or blob value takes from within the Sqlite SQL language,
> just refer to the documentation of typeof(), CAST, CASE and quote()
> and you can come up with a pure Sqlite SQL solution if you really
> wanted to.
>



               
__________________________________
Yahoo! FareChase: Search multiple travel sites in one click.
http://farechase.yahoo.com
Reply | Threaded
Open this post in threaded view
|

Re: built-in functrion suggestion: size of blob

Nuno Lucas-2
In reply to this post by developir@yahoo.com
On 10/31/05, Joe Wilson <[hidden email]> wrote:
> [...] But if
> for some reason you absolutely must need to know the number of bytes
> that a text or blob value takes from within the Sqlite SQL language,
> just refer to the documentation of typeof(), CAST, CASE and quote()
> and you can come up with a pure Sqlite SQL solution if you really
> wanted to.

Good tip! Didn't remember that as CAST is relatively recent - v3.2.3 -
and it's not in my system, yet.

Anyway that seems to take care of a generic way to obtain the byte
count of a field, so no need for a builtin function.

Now the only thing left is a way to get part of blob without having to
read the entire row into memory, but that probably would not be
easy...


Regards,
~Nuno Lucas
Reply | Threaded
Open this post in threaded view
|

Re: built-in functrion suggestion: size of blob

Kervin L. Pierre

 From my understanding of Length(), it still reads the
entire Blob into memory before getting its size ( please
correct me if I am wrong ).

Many have suggested, that the best approach is to have
your application restrict the size of the blobs that it
writes to a size that is good for you. You can then read
your blob back in chunks as needed.

Nuno Lucas wrote:
 >
 > Now the only thing left is a way to get part of blob without having to
 > read the entire row into memory, but that probably would not be
 > easy...
 >

Would be nice :)  Looks to me that blob would have to be
stored out-of-row probably, with the first 'x' bytes kept
in-row, with  'x' being a user-defineable number.

Regards,
Kervin