VTable Column Affinity Question and Change Request

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

VTable Column Affinity Question and Change Request

Keith Medcalf

It seems that "column affinities" are not respected in Virtual Table implementations -- that is the value that is returned is the datatype provided by the the vtab_cursor sqlite3_result_* function and the "column affinity" from the vtab declaration is not applied.  In effect the column affinity specified in the vtab declaration seems to be ignored (or treated as none/blob) no matter what the declaration.

Somehow, I don't think this was always the case but I could be wrong.  In any case, what is the point in specifying the column affinity in the vtab declaration if it is just going to be ignored?

Example, using the current tip of trunk and the ext\misc\csv.c extension with the following input file:

a,b,c,d
1,2,3,4
2,3 or 4,4,5
3,4,5,6
4,5,6,7

SQLite version 3.32.0 2020-02-05 16:13:24
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create virtual table t using csv(filename='t.csv', header=on);
sqlite> .mode col
sqlite> .head on

sqlite> select typeof(a), a, typeof(b), b, typeof(c), c, typeof(d), d from t;

typeof(a)   a           typeof(b)   b           typeof(c)   c           typeof(d)   d
----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------
text        1           text        2           text        3           text        4
text        2           text        3 or 4      text        4           text        5
text        3           text        4           text        5           text        6
text        4           text        5           text        6           text        7

sqlite> pragma table_xinfo(t);

cid         name        type        aff         coll        notnull     dflt_value  pk          rowid       autoinc     hidden
----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------
-1                                  INTEGER                 0                       1           1           0           1
0           a           TEXT        TEXT                    0                       0           0           0           0
1           b           TEXT        TEXT                    0                       0           0           0           0
2           c           TEXT        TEXT                    0                       0           0           0           0
3           d           TEXT        TEXT                    0                       0           0           0           0

sqlite> drop table t;
sqlite> create virtual table t using csv(filename='t.csv', header=off, schema='create table t(a numeric, b numeric, c numeric, d numeric)');

sqlite> select typeof(a), a, typeof(b), b, typeof(c), c, typeof(d), d from t;

typeof(a)   a           typeof(b)   b           typeof(c)   c           typeof(d)   d
----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------
text        a           text        b           text        c           text        d
text        1           text        2           text        3           text        4
text        2           text        3 or 4      text        4           text        5
text        3           text        4           text        5           text        6
text        4           text        5           text        6           text        7

sqlite> pragma table_xinfo(t);

cid         name        type        aff         coll        notnull     dflt_value  pk          rowid       autoinc     hidden
----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------
-1                                  INTEGER                 0                       1           1           0           1
0           a           numeric     NUMERIC                 0                       0           0           0           0
1           b           numeric     NUMERIC                 0                       0           0           0           0
2           c           numeric     NUMERIC                 0                       0           0           0           0
3           d           numeric     NUMERIC                 0                       0           0           0           0

(note that the pragma table_xinfo is my slightly modified version that shows some additional information from the schema object)

If I put the rows generated by the virtual table into a similarly declared temp table, I get the expected result:

sqlite> create temporary table u(a numeric, b numeric, c numeric, d numeric);
sqlite> insert into u select * from t;

sqlite> select typeof(a), a, typeof(b), b, typeof(c), c, typeof(d), d from u;

typeof(a)   a           typeof(b)   b           typeof(c)   c           typeof(d)   d
----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------
text        a           text        b           text        c           text        d
integer     1           integer     2           integer     3           integer     4
integer     2           text        3 or 4      integer     4           integer     5
integer     3           integer     4           integer     5           integer     6
integer     4           integer     5           integer     6           integer     7

sqlite> pragma table_xinfo(u);

cid         name        type        aff         coll        notnull     dflt_value  pk          rowid       autoinc     hidden
----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------
-1                                  INTEGER                 0                       1           1           0           1
0           a           numeric     NUMERIC                 0                       0           0           0           0
1           b           numeric     NUMERIC                 0                       0           0           0           0
2           c           numeric     NUMERIC                 0                       0           0           0           0
3           d           numeric     NUMERIC                 0                       0           0           0           0

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.




_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: VTable Column Affinity Question and Change Request

Jens Alfke-2

> On Feb 5, 2020, at 9:58 AM, Keith Medcalf <[hidden email]> wrote:
>
> It seems that "column affinities" are not respected in Virtual Table implementations -- that is the value that is returned is the datatype provided by the the vtab_cursor sqlite3_result_* function and the "column affinity" from the vtab declaration is not applied.

The vtab implementation is responsible for generating the CREATE TABLE statement and passing it to sqlite3_declare_vtab(). It’s also responsible for returning column values. So I think the assumption is that it’s up to the implementation to be self-consistent, i.e. returning column values that match the declaration.

I haven’t used the CSV vtable. It looks as though its CREATE VIRTUAL TABLE statement takes a ‘schema’ parameter containing the SQL table declaration, that it then passes straight through to sqlite3_declare_vtab(). It probably doesn’t parse that declaration or figure out from it what the declared types of the columns are.

In other words this looks like a limitation of the CSV implementation, which is perhaps unusual in that it is not in control of the table schema it declares.

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: VTable Column Affinity Question and Change Request

Keith Medcalf

On Wednesday, 5 February, 2020 18:10, Jens Alfke <[hidden email]>:

>> On Feb 5, 2020, at 9:58 AM, Keith Medcalf <[hidden email]> wrote:

>> It seems that "column affinities" are not respected in Virtual Table
>> implementations -- that is the value that is returned is the datatype
>> provided by the the vtab_cursor sqlite3_result_* function and the "column
>> affinity" from the vtab declaration is not applied.

> The vtab implementation is responsible for generating the CREATE TABLE
> statement and passing it to sqlite3_declare_vtab(). It’s also responsible
> for returning column values. So I think the assumption is that it’s up to
> the implementation to be self-consistent, i.e. returning column values
> that match the declaration.

That would make perfect sense except that the documentation for the sqlite_vtab_declare function specifically states that only the column name and type affinity are used, and that other things (ie, constraints, defaults, etc) that may be supplied in a vtab declaration are ignored.  Why go to all the bother of parsing the affinity and allowing it to be supplied if it is not used?

> I haven’t used the CSV vtable. It looks as though its CREATE VIRTUAL
> TABLE statement takes a ‘schema’ parameter containing the SQL table
> declaration, that it then passes straight through to
> sqlite3_declare_vtab(). It probably doesn’t parse that declaration or
> figure out from it what the declared types of the columns are.

Actually it does, and this is documented for the sqlite3_declare_vtab function -- and in fact the column names and affinities are parsed and stored in the internal Table schema.  The column names are used but the executed VDBE program does not "apply affinity".

>In other words this looks like a limitation of the CSV implementation,
>which is perhaps unusual in that it is not in control of the table schema
>it declares.

Yes it is, but easily fixed.  As far as I can tell it should not affect vtabs that are part of the SQLite3 distribution, but I do not know if there are third-party virtual tables that specify column affinities in the vtab declarations rather than just use blob (or no affinity).

I would expect that most vtab writers write consistent code and this is would not be an issue at all.  The CSV vtab is kind of a special case where it is returning arbitrary external data over which it has no control so the application of affinity is probably worthwhile.  In cases where it does not matter what the actual prefered value affinity is the vtab can still be declared as blob affinity (ie, no affinity applied).

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: VTable Column Affinity Question and Change Request

Jens Alfke-2


> On Feb 5, 2020, at 6:56 PM, Keith Medcalf <[hidden email]> wrote:
>
>> It probably doesn’t parse that declaration or
>> figure out from it what the declared types of the columns are.
>
> Actually it does, and this is documented for the sqlite3_declare_vtab function -- and in fact the column names and affinities are parsed and stored in the internal Table schema.  

Sorry, I meant that the _CSV extension_ doesn't parse the 'CREATE TABLE…' declaration to figure out what column affinities the caller desires.

SQLite probably stores the column affinities so they can be returned from APIs that request them; IIRC isn't there a C API call that tells you the affinity of a column?

> The column names are used but the executed VDBE program does not "apply affinity".

I suspect this is for the same reason that led to the immediate objections to your proposal: it could hurt performance.

I think your proposal makes sense given that it doesn't slow anything down if the extension didn't specify any column affinities. My virtual table doesn't.

—Jens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users