No datasize field - why?

classic Classic list List threaded Threaded
30 messages Options
12
Reply | Threaded
Open this post in threaded view
|

No datasize field - why?

Igor Korot
Hi, ALL,

[code]
SQLite version 3.9.2 2015-11-02 18:31:45
Enter ".help" for usage hints.
sqlite> PRAGMA table_info(league);
sqlite> PRAGMA table_info(leagues);
0|id|integer|0||1
1|name|varchar(100)|0||0
2|drafttype|integer(1)|0||0
3|scoringtype|integer(1)|0||0
4|roundvalues|integer(1)|0||0
5|leaguetype|char(5)|0||0
6|salary|integer|0||0
7|benchplayers|integer(1)|0||0
sqlite>
[/code]

The field type is set to be "varchar(100)" for the name field.
Wouldn't it be more logical to have it as "varchar" and have another field
for data size?

Thank you.

P.S.: I don't know how most databases interpret this, I know ODBC does it
this way....
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Dominique Devienne
On Fri, Apr 22, 2016 at 3:09 PM, Igor Korot <[hidden email]> wrote:

> [code]
> SQLite version 3.9.2 2015-11-02 18:31:45
> Enter ".help" for usage hints.
> sqlite> PRAGMA table_info(leagues);
> 0|id|integer|0||1
> 1|name|varchar(100)|0||0
> 2|drafttype|integer(1)|0||0
> 3|scoringtype|integer(1)|0||0
> 4|roundvalues|integer(1)|0||0
> 5|leaguetype|char(5)|0||0
> sqlite>
> [/code]
>
> The field type is set to be "varchar(100)" for the name field.
> Wouldn't it be more logical to have it as "varchar" and have another field
> for data size?
>

Sometimes there's the precision too, for non-integer numbers, so that would
be two "fields" for type-related "sizes".
Given that SQLite doesn't care about sizes, and has no limits (beside
https://www.sqlite.org/limits.html), it only shows you what you specified
textually I guess.
I'd welcome a size, and precision field too, but I doubt they'd come.

But that brings up another interesting question I never thought about. Is
there a way to determine the type affinity of a column?

PRAGMA table_info obviously doesn't show it. There's typeof() for values,
but what about the actual column's affinity?
There's doc in https://www.sqlite.org/datatype3.html section 2.1, but is
there no actual "programmatic" way to get it?

Thanks, --DD
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Simon Slavin-3
In reply to this post by Igor Korot

On 22 Apr 2016, at 2:09pm, Igor Korot <[hidden email]> wrote:

> The field type is set to be "varchar(100)" for the name field.
> Wouldn't it be more logical to have it as "varchar" and have another field
> for data size?

What you are seeing there is SQLite just repeating back the type that the CREATE TABLE command used.  It is not the type of data SQLite is actually storing.

There is no 'varchar' datatype in SQLite.  And it never truncates strings.

I think that the best thing for you may be to read this page

<https://www.sqlite.org/datatype3.html>

and get back to us if you have further questions.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Igor Korot
Hi, Simon,

On Fri, Apr 22, 2016 at 9:23 AM, Simon Slavin <[hidden email]> wrote:

>
> On 22 Apr 2016, at 2:09pm, Igor Korot <[hidden email]> wrote:
>
>> The field type is set to be "varchar(100)" for the name field.
>> Wouldn't it be more logical to have it as "varchar" and have another field
>> for data size?
>
> What you are seeing there is SQLite just repeating back the type that the CREATE TABLE command used.  It is not the type of data SQLite is actually storing.
>
> There is no 'varchar' datatype in SQLite.  And it never truncates strings.
>
> I think that the best thing for you may be to read this page
>
> <https://www.sqlite.org/datatype3.html>

I know SQLite goes by the affinity and not by the type. Something like
scripting language - nothing is typed.

However, what I'm saying is that most DBMS out there do store the size
of the field.

Even though I issue something like:

CREATE TABLE test( id INTEGER, PRIMARY KEY, test1 VARCHAR(25), test2
INTEGER(3), test3 FLOAT(25,2) );

if I ask for the columns information, I will get:

id            INTEGER
test1       VARCHAR      25      0
test2        INTEGER         3      0
test2        FLOAT           25      2

So instead of me parsing this info, it can be parsed by SQLite engine.

I understand that for the reason of backward compatibility it will not
be changed,
but it may be something to consider for the next major release (SQLite4).

Thank you.

>
> and get back to us if you have further questions.
>
> Simon.
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Darren Duncan
In reply to this post by Igor Korot
The general case of a data type definition is an arbitrarily complex predicate
expression whose parts vary on the base type and other factors.  Given this, if
component details of type definitions were split out into their own table_info()
columns, you'd have a large number of columns where most are inapplicable in any
given case, eg some only apply to strings, some only to numbers, etc.  And so,
just expressing the type definition as a SQL fragment like table_info()
currently does provides a compact generic representation with all the details,
same as in CREATE TABLE. -- Darren Duncan

On 2016-04-22 6:09 AM, Igor Korot wrote:

> Hi, ALL,
>
> [code]
> SQLite version 3.9.2 2015-11-02 18:31:45
> Enter ".help" for usage hints.
> sqlite> PRAGMA table_info(league);
> sqlite> PRAGMA table_info(leagues);
> 0|id|integer|0||1
> 1|name|varchar(100)|0||0
> 2|drafttype|integer(1)|0||0
> 3|scoringtype|integer(1)|0||0
> 4|roundvalues|integer(1)|0||0
> 5|leaguetype|char(5)|0||0
> 6|salary|integer|0||0
> 7|benchplayers|integer(1)|0||0
> sqlite>
> [/code]
>
> The field type is set to be "varchar(100)" for the name field.
> Wouldn't it be more logical to have it as "varchar" and have another field
> for data size?
>
> Thank you.
>
> P.S.: I don't know how most databases interpret this, I know ODBC does it
> this way....

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Simon Slavin-3
In reply to this post by Igor Korot

On 23 Apr 2016, at 2:46am, Igor Korot <[hidden email]> wrote:

> However, what I'm saying is that most DBMS out there do store the size
> of the field.
>
> Even though I issue something like:
>
> CREATE TABLE test( id INTEGER, PRIMARY KEY, test1 VARCHAR(25), test2
> INTEGER(3), test3 FLOAT(25,2) );
>
> if I ask for the columns information, I will get:
>
> id            INTEGER
> test1       VARCHAR      25      0
> test2        INTEGER         3      0
> test2        FLOAT           25      2

My argument is that SQLite makes no use of the number of characters.  It completely ignores both the '(25)' and the '(25,2)' just like it ignores the difference between 'CHAR' and 'VARCHAR'.  Because of that there is no point in it carefully separating those numbers out and reporting them to the programmer.

In SQLite it would make more sense to report both the type specified in the table definition and how SQLite understood it.  So you might get something like

> test1       VARCHAR(25)      TEXT

but that isn't really necessary either.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Keith Medcalf
In reply to this post by Igor Korot
> > What you are seeing there is SQLite just repeating back the type that
> the CREATE TABLE command used.  It is not the type of data SQLite is
> actually storing.

> > There is no 'varchar' datatype in SQLite.  And it never truncates
> strings.

> > I think that the best thing for you may be to read this page

> > <https://www.sqlite.org/datatype3.html>
 
> I know SQLite goes by the affinity and not by the type. Something like
> scripting language - nothing is typed.

No SQLite does not "go by" the affinity or anything else, nor is "nothing is typed".  Everything is typed and is quite ducky, thank-you.  (If you do not understand, look up ducky typing).  The only enforced type is that INTEGER PRIMARY KEY (rowid) must be an integer.  Other than that, all types are quite ducky, and the affinity is merely a suggestion as to the colour to paint the duck, if possible, before storing it away.
 
> However, what I'm saying is that most DBMS out there do store the size
> of the field.

then maybe you want to use one of those?

> Even though I issue something like:
>
> CREATE TABLE test( id INTEGER, PRIMARY KEY, test1 VARCHAR(25), test2
> INTEGER(3), test3 FLOAT(25,2) );
>
> if I ask for the columns information, I will get:
>
> id            INTEGER
> test1       VARCHAR      25      0
> test2        INTEGER         3      0
> test2        FLOAT           25      2
>
> So instead of me parsing this info, it can be parsed by SQLite engine.

In SQLite you get the above?  Pray tell, how?
 
> I understand that for the reason of backward compatibility it will not
> be changed,
> but it may be something to consider for the next major release (SQLite4).

I do not think it has anything to do with backward compatibility.  It has to do with something called design.  The design of SQLite does not do the things those other DBMSes of which you speak do, because it was DESIGNED not too.  Those things that those other DBMSes do are holdovers to maintain backwards compatibility with the good old days when dinosaurs ruled the earth, 1KB of RAM was an earth-shattering amount of memory (and cost more than six dozen modern "coders" will earn in their lifetimes).  SQLite is not backwards compatible with those dinosaurs (though it will certainly run on their descendants, which now are quite feisty, run quite quickly, and have memory measured in GB or TB.  SQLite started with bitty-boxen already ruling the earth and therefore does not have to maintain the same backwards compatibility as your other DBMSes.





_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

James K. Lowden
In reply to this post by Darren Duncan
On Fri, 22 Apr 2016 19:11:46 -0700
Darren Duncan <[hidden email]> wrote:

> The general case of a data type definition is an arbitrarily complex
> predicate expression whose parts vary on the base type and other
> factors.  Given this, if component details of type definitions were
> split out into their own table_info() columns, you'd have a large
> number of columns where most are inapplicable in any given case, eg
> some only apply to strings, some only to numbers, etc.  And so, just
> expressing the type definition as a SQL fragment like table_info()
> currently does provides a compact generic representation with all the
> details, same as in CREATE TABLE.

INFORMATION_SCHEMA.  

It might not be "a compact generic representation", but it is something
CREATE TABLE is not: a tabular representation.  Something that can be
queried instead of parsed.  

--jkl
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

James K. Lowden
In reply to this post by Keith Medcalf
On Sat, 23 Apr 2016 08:56:14 -0400
"Keith Medcalf" <[hidden email]> wrote:

> Those things that those other DBMSes do are holdovers to maintain
> backwards compatibility with the good old days when dinosaurs ruled
> the earth

As amusing as your rant is, it's not accurate.  Treating columns as
types is a "holdover" from mathematics and logic.  It has nothing to do
with maintaining backwards compatibility, or the cost of RAM.  Many new
DBMS engines have been written in recent years (notably columnar
stores) and, of those that support SQL, none abandoned strict column
types.  

Granted, duck-typing is/was a design choice.  The good news is it makes
simple things simple.  But it's also the source of quite a bit of
confusion and perplexity, not to mention errors and complexity:
*inserting* is easy, but selection is made more complicated.  The
application has to be prepared to cope with inconsistencies (and
errors) in the data that are prevented by type-checking.  In SQLite,
the "dba" has to exercise care, in a very reduntant way, to effect
type-enforcement of a kind he gets "for free" with a standard DBMS.  

--jkl
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Igor Korot
In reply to this post by Keith Medcalf
Hi, Keith,

On Sat, Apr 23, 2016 at 5:56 AM, Keith Medcalf <[hidden email]> wrote:

>> > What you are seeing there is SQLite just repeating back the type that
>> the CREATE TABLE command used.  It is not the type of data SQLite is
>> actually storing.
>
>> > There is no 'varchar' datatype in SQLite.  And it never truncates
>> strings.
>
>> > I think that the best thing for you may be to read this page
>
>> > <https://www.sqlite.org/datatype3.html>
>
>> I know SQLite goes by the affinity and not by the type. Something like
>> scripting language - nothing is typed.
>
> No SQLite does not "go by" the affinity or anything else, nor is "nothing is typed".  Everything is typed and is quite ducky, thank-you.  (If you do not understand, look up ducky typing).  The only enforced type is that INTEGER PRIMARY KEY (rowid) must be an integer.  Other than that, all types are quite ducky, and the affinity is merely a suggestion as to the colour to paint the duck, if possible, before storing it away.

One good exmple will be: try to compare Pascal, C, C++, Java with
Python, Perl, Ruby, etc...
While former has strict types for all variables, the latter does not.
You don't even have to declare a variable prior to use.
(I hope you do know more languages, than just SQL. ;-) ).

This was what I was pointing to.

>
>> However, what I'm saying is that most DBMS out there do store the size
>> of the field.
>
> then maybe you want to use one of those?

You don't know anything about the software trying to writre - so how
can you propose a solution?

>
>> Even though I issue something like:
>>
>> CREATE TABLE test( id INTEGER, PRIMARY KEY, test1 VARCHAR(25), test2
>> INTEGER(3), test3 FLOAT(25,2) );
>>
>> if I ask for the columns information, I will get:
>>
>> id            INTEGER
>> test1       VARCHAR      25      0
>> test2        INTEGER         3      0
>> test2        FLOAT           25      2
>>
>> So instead of me parsing this info, it can be parsed by SQLite engine.
>
> In SQLite you get the above?  Pray tell, how?

No, that is the problem. I'm NOT getting this info.
This information is given by, e.g. MS SQL Server (just an example).

Unfortunately you removed the explanation of what is produced by SQLite. ;-)

>
>> I understand that for the reason of backward compatibility it will not
>> be changed,
>> but it may be something to consider for the next major release (SQLite4).
>
> I do not think it has anything to do with backward compatibility.  It has to do with something called design.  The design of SQLite does not do the things those other DBMSes of which you speak do, because it was DESIGNED not too.  Those things that those other DBMSes do are holdovers to maintain backwards compatibility with the good old days when dinosaurs ruled the earth, 1KB of RAM was an earth-shattering amount of memory (and cost more than six dozen modern "coders" will earn in their lifetimes).  SQLite is not backwards compatible with those dinosaurs (though it will certainly run on their descendants, which now are quite feisty, run quite quickly, and have memory measured in GB or TB.  SQLite started with bitty-boxen already ruling the earth and therefore does not have to maintain the same backwards compatibility as your other DBMSes.

[rant start]
Well, you don't have to give me all this.
I merely tried to ask why in SQLite one can't have an information
about the table field size - that is all.
Whether it is for historical reasons, to maintain backward
compatibility, because the author of SQLite designed the engine this
way - this is what I'm after. Because all those reasons will help me
better understand why SQLite differes from, e.g. ODBC.

I also want to say that as user of the engine I can ask question and
provide the suggestions. This is what the open-source is for.
Whether the suggestion is good or bad, whether it will be implemented
or not is up to the core devs/maintainters.
I also believe that you can do the same and your suggestion and
question can also be eiter approved or rejected and the question
answered in a polite and understandable way.

If people don't understand the question they come back and ask for
more info or provide a reference, like Simon.
[/rant end]

Thank you.

> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Keith Medcalf
In reply to this post by James K. Lowden

> On Sat, 23 Apr 2016 08:56:14 -0400
> "Keith Medcalf" <[hidden email]> wrote:
>
> > Those things that those other DBMSes do are holdovers to maintain
> > backwards compatibility with the good old days when dinosaurs ruled
> > the earth
 
> As amusing as your rant is, it's not accurate.  Treating columns as
> types is a "holdover" from mathematics and logic.  It has nothing to do
> with maintaining backwards compatibility, or the cost of RAM.  Many new
> DBMS engines have been written in recent years (notably columnar
> stores) and, of those that support SQL, none abandoned strict column
> types.
 
> Granted, duck-typing is/was a design choice.  The good news is it makes
> simple things simple.  But it's also the source of quite a bit of
> confusion and perplexity, not to mention errors and complexity:
> *inserting* is easy, but selection is made more complicated.  The
> application has to be prepared to cope with inconsistencies (and
> errors) in the data that are prevented by type-checking.  In SQLite,
> the "dba" has to exercise care, in a very reduntant way, to effect
> type-enforcement of a kind he gets "for free" with a standard DBMS.

The specification of "Length/precision" as well as the use of "varchar", "varbinary" and the like are holdovers from the days when files had to have fixed record lengths so BDAM would work.  It is only since the bitty-boxen (70's) that you have been able to do variable length records worth a darn, and the bitty-boxen started to treat "files" as a bag-o-bytes rather than collection-of-records.

When a file is a bag-o-bytes rather than a collection-of-records (fixed length records at that), there is no longer any need to keep the records the same length, hence no need to specify the "reserved length" of things (contrary to common mis-belief, varchar(100) does not mean a "variable length character field with a length up to 100 characters", it means a character field with a length of 100 into which a smaller amount of data may be stored that will be padded to 100 characters with special discarded-upon-retrieval padding characters" -- that is, the difference between CHAR(100) and VARCHAR(100) is that in the former case data less than 100 characters is padded with spaces, and in the latter case it is padded with nulls.  The column is still, however, 100 characters wide.  Note that there are a few DBMS which were developed after the dinosaurs died on bag-o-byte machines rather than collection-o-record machines.  For those specific DBMS, varchar may mean that the field length is v
 ariable and that the data is prefixed with the length (in pascal style), and that a "file" is therefore interpreted as a bag-o-bytes.

Note that the advent of VSAM style I/O made variable length blocks a bit more efficient, but not much.  It was not until the bag-o-bytes file came along that variable length database records became possible.

Non-duck-typing is simply a holdover of the days when everything had a type.  There was no such thing as a "variant" or an "object" and pointers had to know what it was they were pointing at.

In order for Codd and Date to conceive of a world wherein duck-typing existed, a duck-typed language would have to exist first.  Since such was inconceivable before its invention, it is no wonder that it never occurred to anyone that a database column could be ducky too.




_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Richard Hipp-3
In reply to this post by Igor Korot
On 4/23/16, Igor Korot <[hidden email]> wrote:
> I merely tried to ask why in SQLite one can't have an information
> about the table field size - that is all.

Answer:  That information would consume memory.

We are under constant pressure to reduce the (already small) memory
footprint of SQLite even further.  Adding new fields to the Column
object to record the size of VARCHARs and whatnot, is moving in the
wrong direction.
--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Igor Korot
On Sat, Apr 23, 2016 at 3:22 PM, Richard Hipp <[hidden email]> wrote:
> On 4/23/16, Igor Korot <[hidden email]> wrote:
>> I merely tried to ask why in SQLite one can't have an information
>> about the table field size - that is all.
>
> Answer:  That information would consume memory.

Thank you.
That was that simple. ;-)

>
> We are under constant pressure to reduce the (already small) memory
> footprint of SQLite even further.  Adding new fields to the Column
> object to record the size of VARCHARs and whatnot, is moving in the
> wrong direction.
> --
> D. Richard Hipp
> [hidden email]
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Scott Robison-2
In reply to this post by Keith Medcalf
On Sat, Apr 23, 2016 at 12:50 PM, Keith Medcalf <[hidden email]> wrote:

>
> When a file is a bag-o-bytes rather than a collection-of-records (fixed
> length records at that), there is no longer any need to keep the records
> the same length, hence no need to specify the "reserved length" of things
> (contrary to common mis-belief, varchar(100) does not mean a "variable
> length character field with a length up to 100 characters", it means a
> character field with a length of 100 into which a smaller amount of data
> may be stored that will be padded to 100 characters with special
> discarded-upon-retrieval padding characters" -- that is, the difference
> between CHAR(100) and VARCHAR(100) is that in the former case data less
> than 100 characters is padded with spaces, and in the latter case it is
> padded with nulls.  The column is still, however, 100 characters wide.


That's not necessarily true in the case of VARCHAR. It could be done that
way, but it could just as easily and probably much more efficiently be done
in a truly variable length fashion. Microsoft SQL Server (as far as I
understand, being the only other SQL engine I've spent much time with)
stores VARCHAR columns as the actual number of bytes in the field plus two
bytes of overhead. So a 100 byte string stored in a VARCHAR(10000) column
would only need 102 bytes of storage, not 10000 or 10002.

My source (because I haven't used MSSQL for a few years) is
https://msdn.microsoft.com/en-us/library/ms176089.aspx to refresh my memory.

For any SQL datastore, the way the data is stored is completely an
implementation detail. The SQL engine would be free to serialize all values
to text and store them in 4K pages if it wanted to, then deserialize them
on the way back. I certainly don't know of any that do that, but the
impetus for the creation of VARCHAR fields (I imagine) was specifically to
avoid storing padding for data that did not require it.

--
Scott Robison
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Simon Slavin-3

On 24 Apr 2016, at 12:58am, Scott Robison <[hidden email]> wrote:

> For any SQL datastore, the way the data is stored is completely an
> implementation detail. The SQL engine would be free to serialize all values
> to text and store them in 4K pages if it wanted to, then deserialize them
> on the way back. I certainly don't know of any that do that, but the
> impetus for the creation of VARCHAR fields (I imagine) was specifically to
> avoid storing padding for data that did not require it.

Speed plays a part as well as storage space.  Back in the days of mainframes and minicomputers, it was far more efficient to store fixed-length records than variable-length records.  To look up row 7463 in a file you would just multiply 7463 by the number of bytes in a row, then start reading from that byte.  Think about how much more processing and access SQLite has to do just to read a row from a database file.

So if you could make your table up of integers, floats, and text with character limits on them you could get fixed-length rows, which might reduce your access time by 60% or more.  Such a decrease in access time could mean the difference between being able to update a database live or being able to update only during an overnight run.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Darren Duncan
In reply to this post by Scott Robison-2
On 2016-04-23 4:58 PM, Scott Robison wrote:

> On Sat, Apr 23, 2016 at 12:50 PM, Keith Medcalf <[hidden email]> wrote:
>>
>> When a file is a bag-o-bytes rather than a collection-of-records (fixed
>> length records at that), there is no longer any need to keep the records
>> the same length, hence no need to specify the "reserved length" of things
>> (contrary to common mis-belief, varchar(100) does not mean a "variable
>> length character field with a length up to 100 characters", it means a
>> character field with a length of 100 into which a smaller amount of data
>> may be stored that will be padded to 100 characters with special
>> discarded-upon-retrieval padding characters" -- that is, the difference
>> between CHAR(100) and VARCHAR(100) is that in the former case data less
>> than 100 characters is padded with spaces, and in the latter case it is
>> padded with nulls.  The column is still, however, 100 characters wide.
>
> That's not necessarily true in the case of VARCHAR. It could be done that
> way, but it could just as easily and probably much more efficiently be done
> in a truly variable length fashion. Microsoft SQL Server (as far as I
> understand, being the only other SQL engine I've spent much time with)
> stores VARCHAR columns as the actual number of bytes in the field plus two
> bytes of overhead. So a 100 byte string stored in a VARCHAR(10000) column
> would only need 102 bytes of storage, not 10000 or 10002.
>
> My source (because I haven't used MSSQL for a few years) is
> https://msdn.microsoft.com/en-us/library/ms176089.aspx to refresh my memory.
>
> For any SQL datastore, the way the data is stored is completely an
> implementation detail. The SQL engine would be free to serialize all values
> to text and store them in 4K pages if it wanted to, then deserialize them
> on the way back. I certainly don't know of any that do that, but the
> impetus for the creation of VARCHAR fields (I imagine) was specifically to
> avoid storing padding for data that did not require it.

I know that in Postgres, behind the scenes, all character string types are
implemented as the variable unlimited length "text" type even when they are
declared as char(n) or varchar(n).  Therefore, there is no performance benefit
to explicitly declaring fields with a maximum length.  This also fits in with a
proper clean data model, where you just declare details significant to the
business model and not to dictate how resources are managed.  You should only
declare a length if that is actually significant to the validity of the data,
and then its functionally just like a CHECK constraint. -- Darren Duncan

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Scott Robison-2
In reply to this post by Simon Slavin-3
On Apr 23, 2016 6:21 PM, "Simon Slavin" <[hidden email]> wrote:
>
>
> On 24 Apr 2016, at 12:58am, Scott Robison <[hidden email]> wrote:
>
> > For any SQL datastore, the way the data is stored is completely an
> > implementation detail. The SQL engine would be free to serialize all
values
> > to text and store them in 4K pages if it wanted to, then deserialize
them
> > on the way back. I certainly don't know of any that do that, but the
> > impetus for the creation of VARCHAR fields (I imagine) was specifically
to
> > avoid storing padding for data that did not require it.
>
> Speed plays a part as well as storage space.  Back in the days of
mainframes and minicomputers, it was far more efficient to store
fixed-length records than variable-length records.  To look up row 7463 in
a file you would just multiply 7463 by the number of bytes in a row, then
start reading from that byte.  Think about how much more processing and
access SQLite has to do just to read a row from a database file.
>
> So if you could make your table up of integers, floats, and text with
character limits on them you could get fixed-length rows, which might
reduce your access time by 60% or more.  Such a decrease in access time
could mean the difference between being able to update a database live or
being able to update only during an overnight run.

This was particularly true in the case of media like 9 track mag tape
(effective 1 dimensional access) vs modern hard drive (effective 3
dimensional access).
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Carlos Milon Silva
But, with very fast CPUs and RAM memory buffers for the directory
entries in the disks,
  the variable length records would probably result in gain for much
less I/O for the data.

On 23/04/2016 21:22, Scott Robison wrote:

> On Apr 23, 2016 6:21 PM, "Simon Slavin" <[hidden email]> wrote:
>>
>> On 24 Apr 2016, at 12:58am, Scott Robison <[hidden email]> wrote:
>>
>>> For any SQL datastore, the way the data is stored is completely an
>>> implementation detail. The SQL engine would be free to serialize all
> values
>>> to text and store them in 4K pages if it wanted to, then deserialize
> them
>>> on the way back. I certainly don't know of any that do that, but the
>>> impetus for the creation of VARCHAR fields (I imagine) was specifically
> to
>>> avoid storing padding for data that did not require it.
>> Speed plays a part as well as storage space.  Back in the days of
> mainframes and minicomputers, it was far more efficient to store
> fixed-length records than variable-length records.  To look up row 7463 in
> a file you would just multiply 7463 by the number of bytes in a row, then
> start reading from that byte.  Think about how much more processing and
> access SQLite has to do just to read a row from a database file.
>> So if you could make your table up of integers, floats, and text with
> character limits on them you could get fixed-length rows, which might
> reduce your access time by 60% or more.  Such a decrease in access time
> could mean the difference between being able to update a database live or
> being able to update only during an overnight run.
>
> This was particularly true in the case of media like 9 track mag tape
> (effective 1 dimensional access) vs modern hard drive (effective 3
> dimensional access).
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

Simon Slavin-3

On 24 Apr 2016, at 1:51pm, Carlos <[hidden email]> wrote:

> But, with very fast CPUs and RAM memory buffers for the directory entries in the disks,
> the variable length records would probably result in gain for much less I/O for the data.

Agreed.  Which is one reason why fixed-length string columns are less important and less used now.  When the bottleneck is the speed of the backing store, storing fewer characters can mean the difference between having to write one sector or two.

Another reason is that we use Unicode not ASCII/SIXBIT/EBCDIC, and in Unicode different characters take different numbers of bytes.  So even if you're storing a fixed number of bytes the convenience of always knowing exactly how many characters to display no longer exists.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: No datasize field - why?

James K. Lowden
In reply to this post by Keith Medcalf
On Sat, 23 Apr 2016 14:50:45 -0400
"Keith Medcalf" <[hidden email]> wrote:

> > On Sat, 23 Apr 2016 08:56:14 -0400
> > "Keith Medcalf" <[hidden email]> wrote:
> >
> > > Those things that those other DBMSes do are holdovers to maintain
> > > backwards compatibility with the good old days when dinosaurs
> > > ruled the earth
>  
> > As amusing as your rant is, it's not accurate.  Treating columns as
> > types is a "holdover" from mathematics and logic.  It has nothing
> > to do with maintaining backwards compatibility, or the cost of
> > RAM.  
>
> The specification of "Length/precision" as well as the use of
> "varchar", "varbinary" and the like are holdovers from the days when
> files had to have fixed record lengths so BDAM would work.  

They might have originated in that way on System R.  Ingres was
developed contemporaneously on Unix, which then and now had no
record-based file types.  As I pointed out,

> > Many new DBMS engines have been written in recent years
> > (notably columnar stores) and, of those that support SQL, none
> > abandoned strict column types.

> (contrary to common mis-belief, varchar(100) does not mean a
> "variable length character field with a length up to 100 characters",

SQL does *not* define implementation.  It defines semantics: user
provides X, system replies Y.  It makes no statement about how a column
is stored.  You know that, surely.  Why paint yourself into a corner
with an argument you know is lost before it begins?  

I guess I should remind you that length-limited character strings have
lots of utilty irrespective of storage concerns.  Some strings *are*
fixed length, for example cusip, ssn, drivers licence, employee id,
phone number.  Length-checking is a simple aspect of validation.  

There are also more banal concerns about external representation.   An
address might be limited to 60 characters so that it fits in the
billing envelope window.  Maybe 60 characters is arbitrary, but we both
know that 6000 characters will be too many.  Length limits help keep
the ruby on the rails.  

> In order for Codd and Date to conceive of a world wherein duck-typing
> existed, a duck-typed language would have to exist first.  Since such
> was inconceivable before its invention, it is no wonder that it never
> occurred to anyone that a database column could be ducky too.

You are seriously underestimating them.  You're misinterpreting the
meaning and purpose of column types.  Your rant about VSAM is, as
physicists sometimes say, not even wrong.  

If the theory seems abstruse, you also utterly ignore observed
detrimental effects of the lack of type enforcement, namely the
complexity that arises at SELECT time, when the application has to cope
with whatever detritus got parked in the database.  On a SQLite scale,
many times that's not a problem because writers are highly
constrained.  But in general it's a central concern, and was one of the
motivations for the invention of the relational model.  

--jkl
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
12