Good morning.
I wrote a little tool to dump certain data sets into a SQLite database. A user suggested that I should not use type TEXT but rather type VARCHAR for character columns, due to some issue with Excel/VBA. See the comments in: http://yetanothermathprogrammingconsultant.blogspot.com/2013/10/gdx2sqlite.html I could not find a reference to this using Google. So my question is: Is this a known problem? I would like to understand a little bit better what this is about. Thanks, Erwin ---------------------------------------------------------------- Erwin Kalvelagen Amsterdam Optimization Modeling Group [hidden email] http://amsterdamoptimization.com. ---------------------------------------------------------------- _______________________________________________ sqlite-users mailing list [hidden email] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users |
On 2 Dec 2015, at 4:34pm, Erwin Kalvelagen <[hidden email]> wrote: > I could not find a reference to this using Google. So my question is: Is > this a known problem? I would like to understand a little bit better what > this is about. The bug, if there is a bug, must be in Excel/VBA. SQLite will accept either 'TEXT' or 'VARCHAR' and do identical things no matter which you use. This is because SQLite doesn't have a VARCHAR type, it just pretends it saw 'TEXT' instead. <https://www.sqlite.org/datatype3.html> I'm not sure what problem Excel/VBA has with it. Maybe someone else is, or maybe another list would be a better place to ask. Or perhaps the problem is with the program they used "SQLite Database Browser" which is not part of SQLite and may have its own bugs. Simon. _______________________________________________ sqlite-users mailing list [hidden email] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users |
In reply to this post by erwink
On 12/2/15, Erwin Kalvelagen <[hidden email]> wrote:
> Good morning. > > I wrote a little tool to dump certain data sets into a SQLite database. A > user suggested that I should not use type TEXT but rather type VARCHAR for > character columns, due to some issue with Excel/VBA. See the comments in: > http://yetanothermathprogrammingconsultant.blogspot.com/2013/10/gdx2sqlite.html > > > I could not find a reference to this using Google. So my question is: Is > this a known problem? I would like to understand a little bit better what > this is about. > SQLite handles TEXT and VARCHAR equally well. There are no issues. Perhaps a 3rd-party GUI shell wrapper around SQLite is having problems with TEXT? -- D. Richard Hipp [hidden email] _______________________________________________ sqlite-users mailing list [hidden email] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users |
In reply to this post by erwink
On Wed, Dec 2, 2015 at 2:34 PM, Erwin Kalvelagen
<[hidden email]> wrote: > A user suggested that I should not use type TEXT but rather type VARCHAR for character columns, due to some issue with Excel/VBA. If he or she turns out to be correct, do it. As Slavin and Hipp mentioned, SQLite won't care about the change. However, it is easier on the eyes (at least mine) to see TEXT used with SQLite queries. -- Bernardo Sulzbach _______________________________________________ sqlite-users mailing list [hidden email] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users |
In reply to this post by erwink
On 2015/12/02 6:34 PM, Erwin Kalvelagen wrote: > Good morning. > > I wrote a little tool to dump certain data sets into a SQLite database. A > user suggested that I should not use type TEXT but rather type VARCHAR for > character columns, due to some issue with Excel/VBA. See the comments in: > http://yetanothermathprogrammingconsultant.blogspot.com/2013/10/gdx2sqlite.html > > > I could not find a reference to this using Google. So my question is: Is > this a known problem? I would like to understand a little bit better what > this is about. The reason you are not finding a reference, is because it isn't true. SQLite (as others have mentioned) is ambivalent to those types - either will do. Personally I use VARCHAR(Len) in table column definitions - simply because my schema is then directly interchangeable with MySQL/PostGres and the SQLite query planner sometimes notes that length when considering data shape - but for data purposes, SQLite doesn't care and neither do any wrappers I know of. Also, I can't imagine Excel would have an issue, whether referencing a data object via VB Script or OLE DB table import etc, I have never seen it matter in Excels 2003 through 2013. My guess is your friend probably heard of some peculiarity and then possibly misheard or misinterpreted it to be related to Varchar vs. Text. Note that in MySQL/PostGres/MSSQL/Oracle there is a very big difference. Text (and its sub-types such as mediumtext, longtext etc.) usually is stored as a kind of character blob while Varchar(n) is more like a string with a length constraint. It's perhaps also prudent to note that in SQLite, Varchar(10) is just a Text type, and won't actually limit data entries into that field to only 10 characters. Good luck! Ryan _______________________________________________ sqlite-users mailing list [hidden email] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users |
> and the SQLite query planner sometimes notes that length when considering
data shape In what situations does that happen? RBS On Thu, Dec 3, 2015 at 12:29 AM, R Smith <[hidden email]> wrote: > > > On 2015/12/02 6:34 PM, Erwin Kalvelagen wrote: > >> Good morning. >> >> I wrote a little tool to dump certain data sets into a SQLite database. A >> user suggested that I should not use type TEXT but rather type VARCHAR for >> character columns, due to some issue with Excel/VBA. See the comments in: >> >> http://yetanothermathprogrammingconsultant.blogspot.com/2013/10/gdx2sqlite.html >> >> >> I could not find a reference to this using Google. So my question is: Is >> this a known problem? I would like to understand a little bit better what >> this is about. >> > > The reason you are not finding a reference, is because it isn't true. > SQLite (as others have mentioned) is ambivalent to those types - either > will do. > > Personally I use VARCHAR(Len) in table column definitions - simply because > my schema is then directly interchangeable with MySQL/PostGres and the > SQLite query planner sometimes notes that length when considering data > shape - but for data purposes, SQLite doesn't care and neither do any > wrappers I know of. Also, I can't imagine Excel would have an issue, > whether referencing a data object via VB Script or OLE DB table import etc, > I have never seen it matter in Excels 2003 through 2013. > > My guess is your friend probably heard of some peculiarity and then > possibly misheard or misinterpreted it to be related to Varchar vs. Text. > Note that in MySQL/PostGres/MSSQL/Oracle there is a very big difference. > Text (and its sub-types such as mediumtext, longtext etc.) usually is > stored as a kind of character blob while Varchar(n) is more like a string > with a length constraint. > > It's perhaps also prudent to note that in SQLite, Varchar(10) is just a > Text type, and won't actually limit data entries into that field to only 10 > characters. > > Good luck! > Ryan > > _______________________________________________ > sqlite-users mailing list > [hidden email] > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > sqlite-users mailing list [hidden email] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users |
In reply to this post by R Smith
On Wed, Dec 2, 2015 at 4:29 PM, R Smith <[hidden email]> wrote:
> > Personally I use VARCHAR(Len) in table column definitions - simply because > my schema is then directly interchangeable with MySQL/PostGres and the > SQLite query planner sometimes notes that length when considering data > shape - but for data purposes, SQLite doesn't care and neither do any > wrappers I know of. Also, I can't imagine Excel would have an issue, > whether referencing a data object via VB Script or OLE DB table import etc, > I have never seen it matter in Excels 2003 through 2013. I discourage this kind of usage because it means that in some distant future when someone has to make things work with a different database engine, they have to grind through and check every weirdo VARCHAR(73) and MEDIUMBIGINT declaration someone put in, because none of them have ever been tested with range enforcement enabled. So where someone meant "VARCHAR(256)" but actually typed "VARCHAR(6)", will your code suddenly start throwing clean errors which immediately suggest where to look, or will it just muddle through corrupting your data? There can certainly be subtle issues in any type, but my experience is that when you're trying to decode someone's code, it's easiest when the code says what is actually happening, not what it wishes were happening! Of course, if you are _currently_ writing cross-platform code, absolutely write the cross-platform code! And I will agree that the above position has some issues when faced with things such as INT being 32 bits on some platforms, which I guess would argue for using BIGINT and BIGTEXT. Unless you need VARCHAR(MAX) or LONGTEXT, but maybe MEDIUMTEXT is more portable ... and pretty soon you give up and just circle back to not decorating with unused/unchecked type annotations. -scott _______________________________________________ sqlite-users mailing list [hidden email] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users |
In reply to this post by Bart Smissaert
On 12/2/15, Bart Smissaert <[hidden email]> wrote:
>> and the SQLite query planner sometimes notes that length when considering > data shape > > In what situations does that happen? > CREATE TABLE ex1(a INTEGER, b VARCHAR(5), c VARCHAR(5000)); CREATE INDEX ex1b ON ex1(b); CREATE INDEX ex1c ON ex1(c); SELECT * FROM ex1 WHERE b=?1 AND c=?2; The query planner is faced with the decision of whether to use the ex1b or ex1c index. Statistics gathered by ANALYZE would normally break this tie, but suppose ANALYZE has not been run, or suppose both indexes are equally selective. In that case, SQLite would choose ex1b since it guesses the keys would be shorter and will compare faster and the fanout will be greater, and hence extb can be searched using fewer CPU cycles. -- D. Richard Hipp [hidden email] _______________________________________________ sqlite-users mailing list [hidden email] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users |
So, it will be quite a rare occurrence then that this could be of any
benefit. Still nice to know this. RBS On 3 Dec 2015 1:33 am, "Richard Hipp" <[hidden email]> wrote: > On 12/2/15, Bart Smissaert <[hidden email]> wrote: > >> and the SQLite query planner sometimes notes that length when > considering > > data shape > > > > In what situations does that happen? > > > > CREATE TABLE ex1(a INTEGER, b VARCHAR(5), c VARCHAR(5000)); > CREATE INDEX ex1b ON ex1(b); > CREATE INDEX ex1c ON ex1(c); > > SELECT * FROM ex1 WHERE b=?1 AND c=?2; > > The query planner is faced with the decision of whether to use the > ex1b or ex1c index. Statistics gathered by ANALYZE would normally > break this tie, but suppose ANALYZE has not been run, or suppose both > indexes are equally selective. In that case, SQLite would choose ex1b > since it guesses the keys would be shorter and will compare faster and > the fanout will be greater, and hence extb can be searched using fewer > CPU cycles. > > > > -- > D. Richard Hipp > [hidden email] > _______________________________________________ > sqlite-users mailing list > [hidden email] > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > sqlite-users mailing list [hidden email] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users |
In reply to this post by Scott Hess
On 2015/12/03 3:04 AM, Scott Hess wrote: > I discourage this kind of usage because it means that in some distant > future when someone has to make things work with a different database > engine, they have to grind through and check every weirdo VARCHAR(73) and > MEDIUMBIGINT declaration someone put in, because none of them have ever > been tested with range enforcement enabled. So where someone meant > "VARCHAR(256)" but actually typed "VARCHAR(6)", will your code suddenly > start throwing clean errors which immediately suggest where to look, or > will it just muddle through corrupting your data? There can certainly be > subtle issues in any type, but my experience is that when you're trying to > decode someone's code, it's easiest when the code says what is actually > happening, not what it wishes were happening! I posit that a column declared as: col VARCHAR(32) NOT NULL says a whole lot more about what is actually happening than: col TEXT NOT NULL And sure, I agree a mistyped word can be hard to decode by a future programmer, but that is a statistical probability in any case. > Of course, if you are _currently_ writing cross-platform code, absolutely > write the cross-platform code! And I will agree that the above position > has some issues when faced with things such as INT being 32 bits//.... etc. I *ALWAYS* write cross-platform code as far as SQL is concerned. I even think in this day and age every programmer should, or is there a case for the opposite? _______________________________________________ sqlite-users mailing list [hidden email] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users |
On 3 Dec 2015, at 12:49pm, R Smith <[hidden email]> wrote: > I *ALWAYS* write cross-platform code as far as SQL is concerned. I even think in this day and age every programmer should, or is there a case for the opposite? If cross-platform code worked identically cross-platform I'd rest more easily. The case in question is a good example. Technically if you provide a long string for a SQL column VARCHAR(6) the SQL engine should silently truncate it to 6 characters before storing it. SQL engines which actually support VARCHAR do this correctly. SQLite doesn't. So although your code executes without error messages in both SQLite and PostgreSQL, it will do different things if your software passes along a seven character string. There are numerous other examples of this in SQL, including when constraint checking is done, the results of errors triggering ROLLBACK, how values are sorted if you put numbers into string columns, and how NULLs are handled in sorting and comparisons. There are arguments for and against what you're doing and I don't intend to take a stance. Just to keep readers aware of the problems. Simon. _______________________________________________ sqlite-users mailing list [hidden email] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users |
In reply to this post by R Smith
On Thu, Dec 3, 2015 at 4:49 AM, R Smith <[hidden email]> wrote:
> On 2015/12/03 3:04 AM, Scott Hess wrote: > >> I discourage this kind of usage because it means that in some distant >> future when someone has to make things work with a different database >> engine, they have to grind through and check every weirdo VARCHAR(73) and >> MEDIUMBIGINT declaration someone put in, because none of them have ever >> been tested with range enforcement enabled. So where someone meant >> "VARCHAR(256)" but actually typed "VARCHAR(6)", will your code suddenly >> start throwing clean errors which immediately suggest where to look, or >> will it just muddle through corrupting your data? There can certainly be >> subtle issues in any type, but my experience is that when you're trying to >> decode someone's code, it's easiest when the code says what is actually >> happening, not what it wishes were happening! >> > > I posit that a column declared as: > col VARCHAR(32) NOT NULL > says a whole lot more about what is actually happening than: > col TEXT NOT NULL > I'm saying that: col TEXT NOT NULL fairly describes what is actually happening, but: col VARCHAR(32) NOT NULL implies that things are happening which are not happening. CHAR is kind of happening, in that it is character data, VAR is not happening in that it can store more data than a 1-byte prefix can describe, and 32 is definitely not happening. And sure, I agree a mistyped word can be hard to decode by a future > programmer, but that is a statistical probability in any case. > For most engines if you type VRCHAR(32) you'll get an error. In SQLite you'll get a field with TEXT affinity. If you typed VRCHR(32) SQLite will give you a field with no affinity, which will work just fine for the most part. If you develop code with VARCHAR(6) on another database, you'll notice that your new code isn't storing your 60-character strings pretty quickly, but if it's three years later and you're porting, you might _not_ notice the problem unless you have a good test suite in place. [I'd _love_ something like SQLITE_ENABLE_PEDANTIC and "PRAGMA pedantic = true" to provide an extra layer of checking on these things. When you find that someone made a mistake in shipping code, you have to decide whether to risk fixing it, or just to leave it be, and I'd rather have stronger assertions about this kind of thing up front.] Of course, if you are _currently_ writing cross-platform code, absolutely >> write the cross-platform code! And I will agree that the above position >> has some issues when faced with things such as INT being 32 bits//.... >> etc. >> > > I *ALWAYS* write cross-platform code as far as SQL is concerned. I even > think in this day and age every programmer should, or is there a case for > the opposite? My experience has always been that cross-platform code written to run on a single platform turns out to not be very cross-platform at all. Changing TEXT to VARCHAR as appropriate is the kind of thing which is generally pretty easy to do, the hard parts will be the things that nobody even realized had platform issues, like INT storing 32 bits rather than 64 bits, or subtle differences in treatment of NULL values. -scott _______________________________________________ sqlite-users mailing list [hidden email] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users |
On 2015/12/03 8:49 PM, Scott Hess wrote: > On Thu, Dec 3, 2015 at 4:49 AM, R Smith <[hidden email]> wrote: > >> On 2015/12/03 3:04 AM, Scott Hess wrote: >> >> I posit that a column declared as: col VARCHAR(32) NOT NULL says a >> whole lot more about what is actually happening than: col TEXT NOT NULL > I'm saying that: > col TEXT NOT NULL > fairly describes what is actually happening, but: > col VARCHAR(32) NOT NULL > implies that things are happening which are not happening. CHAR is kind of > happening, in that it is character data, VAR is not happening in that it > can store more data than a 1-byte prefix can describe, and 32 is definitely > not happening. Ok, I see now more clear what you meant, but I don't agree. I write SQL to follow in form and function what *I MYSELF* intend for it to do - I do not adjust it to better describe what the specific SQL flavour engine actually ends up doing with the code. If I write: col VARCHAR(32) PRIMARY KEY It is only because it is my specific intent for that column to contain no more than 32 characters and have no multiple null values. Some engines may understand what I mean, some may not, but my code must always reflect what I intended. That is to me the very best future-proofing and future-understanding concept. Tomorrow, SQLite might improve on its interpretation... or add a PRAGMA strict_sql=ON mode as most of us wish for, and then how sad will you be that your entire code base contains "TEXT" everywhere where you could really have specified actual lengths the way you first intended? Not to mention... as a bonus my schema will compute correctly when moved to PostGres/MSSQL/MySQL/etc. To recode my SQL so it only says what the specific /SQL engine du jour/ can interpret because I /might/ make a spelling error and then cause myself later headaches... I don't know, that seems like a double cop-out. Besides, my OCD will never allow things like VRCHAR(6) when I intended VARCHAR(60). (And no, I'm not special, I assume that is true for 90% of the people here). Shape the code to what you INTEND. Don't make mistakes. Double check your code. I see your point, but I'm definitely sticking to my way this time. :) Cheers, Ryan _______________________________________________ sqlite-users mailing list [hidden email] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users |
I understand Smith's point. But when I am sure that some schema is
going to stay in SQLite for the foreseeable future, I like using text. It is simple, adequate, and expresses my intent completely: TEXT. That's what the column has, TEXT, not 40, not 60, not less than 30, just freaking text. Lastly, reading VARCHAR(40) makes my mind expect a programmatic limit (not on the business logic) that does not exist. This tingles. Everyone that commented on this secondary issue (about using TEXT or VARCHAR for clarity of intent) has solid arguments, and any decision then is mostly a matter of taste. > Don't make mistakes. I try not to. But it happens. _______________________________________________ sqlite-users mailing list [hidden email] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users |
Free forum by Nabble | Edit this page |