Generated columns in primary keys, indexes, and constraints

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Generated columns in primary keys, indexes, and constraints

Darren Duncan
Per https://www.sqlite.org/draft/gencol.html thank you very much for adding this
generated columns feature!

I consider this to be a very powerful foundation for a variety of useful
features, in particular because it empowers generalized solutions to several
classes of problem.

In particular, this provides a generalized foundation for arbitrary unique
constraints and indexes.

Some SQL DBMSs have complicated syntax for declaring indexes or keys, say for
example case-insensitive uniqueness or indexing only a prefix of a character
string or various other things.

When generalized generated columns are supported as SQLite is going to do, the
syntax for key or constraint declaration can simply reference a column as a
whole and use the entire pristine column value; by default this means key or
index in a case-sensitive etc manner, but if there is a generated column that is
a lowercased version of a regular text field, one can put the unique constraint
on that column instead to get the case-insensitive uniqueness without
complicating the key/index declaration syntax to do so.

There is a lot of power here from such a basic foundation.

Related to this, I strongly encourage the SQLite developers to relax the
constraint on generated columns being used in a PRIMARY KEY sooner rather than
later.

Ideally a PRIMARY KEY would have no restrictions that a UNIQUE constraint lacks;
they are conceptually the same thing, a subset of the columns of the row that
uniquely identifies the row in the table, and designating one as PRIMARY is
completely arbitrary in that sense.

The benefits I ascribed to generated columns as a foundation would be greatly
weakened if a PRIMARY KEY can't use them.

Thank you.

-- Darren Duncan
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Generated columns in primary keys, indexes, and constraints

Richard Hipp-3
On 10/30/19, Darren Duncan <[hidden email]> wrote:
>
> Ideally a PRIMARY KEY would have no restrictions that a UNIQUE constraint
> lacks;
> they are conceptually the same thing, a subset of the columns of the row
> that
> uniquely identifies the row in the table, and designating one as PRIMARY is
> completely arbitrary in that sense.
>

For rowid-tables in SQLite, the PRIMARY KEY is really just a UNIQUE
constraint.  But that is kind of a bug that is retained for historical
compatibility - not something to be proud of.  Let's ignore that case
for the moment.

In a WITHOUT ROWID table, the PRIMARY KEY really is the key used by
the B-Tree algorithm to locate records on disk.  Hence, the PRIMARY
KEY really does need to be stored on disk.  Theoretically, one could
have the B-Tree algorithm itself compute the values of keys
on-the-fly.  But that is not possible in SQLite without significantly
rearchitecting the system.  Expressions are evaluated using multiple
opcodes in the bytecode engine, but B-Tree search is a single opcode.
It is unclear how the B-Tree engine could be convinced to run multiple
byte-code operations in response to each key comparison.

All that said, it wouldn't really be a problem use a STORED generated
column as a PRIMARY KEY, since the value of the STORED column is
sitting there on disk and is hence readily available to b-tree.  In
fact, that might just work if I merely remove the restriction.  But
using a VIRTUAL generated column as a PRIMARY KEY would be an issue.

--
D. Richard Hipp
[hidden email]
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Generated columns in primary keys, indexes, and constraints

Darren Duncan
On 2019-10-30 3:12 a.m., Richard Hipp wrote:

> On 10/30/19, Darren Duncan <[hidden email]> wrote:
>>
>> Ideally a PRIMARY KEY would have no restrictions that a UNIQUE constraint
>> lacks;
>> they are conceptually the same thing, a subset of the columns of the row
>> that
>> uniquely identifies the row in the table, and designating one as PRIMARY is
>> completely arbitrary in that sense.
>>
>
> For rowid-tables in SQLite, the PRIMARY KEY is really just a UNIQUE
> constraint.  But that is kind of a bug that is retained for historical
> compatibility - not something to be proud of.  Let's ignore that case
> for the moment.
>
> In a WITHOUT ROWID table, the PRIMARY KEY really is the key used by
> the B-Tree algorithm to locate records on disk.  Hence, the PRIMARY
> KEY really does need to be stored on disk.  Theoretically, one could
> have the B-Tree algorithm itself compute the values of keys
> on-the-fly.  But that is not possible in SQLite without significantly
> rearchitecting the system.  Expressions are evaluated using multiple
> opcodes in the bytecode engine, but B-Tree search is a single opcode.
> It is unclear how the B-Tree engine could be convinced to run multiple
> byte-code operations in response to each key comparison.
>
> All that said, it wouldn't really be a problem use a STORED generated
> column as a PRIMARY KEY, since the value of the STORED column is
> sitting there on disk and is hence readily available to b-tree.  In
> fact, that might just work if I merely remove the restriction.  But
> using a VIRTUAL generated column as a PRIMARY KEY would be an issue.

Thank you.

If I wasn't clear, it was specifically the STORED variant I was advocating
supporting with PRIMARY KEY, that provides all the benefits I looked for, which
I felt in theory be as easy to implement as a normal column PRIMARY KEY because
the value would be calculated before insertion.

So I request that the draft document be amended like as follows.

Take this line:

"Generated columns may not be used as part of the PRIMARY KEY. (Future versions
of SQLite might relax this constraint for STORED columns.)"

Replace with this:

"VIRTUAL generated columns may not be used as part of the PRIMARY KEY."

... with optional comment about possibly being relaxed in the future.

-- Darren Duncan
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Generated columns in primary keys, indexes, and constraints

Simon Slavin-3
On 30 Oct 2019, at 6:56pm, Darren Duncan <[hidden email]> wrote:

> "Generated columns may not be used as part of the PRIMARY KEY. (Future versions of SQLite might relax this constraint for STORED columns.)"
>
> Replace with this:
>
> "VIRTUAL generated columns may not be used as part of the PRIMARY KEY."

What happens if the generated column is based on a calculation including rowid ?  At the time the values are calculated, rowid might not yet be known.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Generated columns in primary keys, indexes, and constraints

Darren Duncan
On 2019-10-30 12:02 p.m., Simon Slavin wrote:
> On 30 Oct 2019, at 6:56pm, Darren Duncan <[hidden email]> wrote:
>
>> "Generated columns may not be used as part of the PRIMARY KEY. (Future versions of SQLite might relax this constraint for STORED columns.)"
>>
>> Replace with this:
>>
>> "VIRTUAL generated columns may not be used as part of the PRIMARY KEY."
>
> What happens if the generated column is based on a calculation including rowid ?  At the time the values are calculated, rowid might not yet be known.

I suggest one option is adding a restriction that rowid may not be used directly
or indirectly in a STORED generated column and only in a VIRTUAL one.

Alternately, a longer term solution would be generate rowid as an independent
prior step.  Like explicitly calling a sequence generator in one step and then
using that as a then-constant input in composing the row.

While I can think of numerous uses of a stored generated column based on
user-supplied values, I can't think of any use case for a generated column based
on a rowid.

One way or another I believe this is surmountable and stored generated columns
in a primary key is worth it.

-- Darren Duncan
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Generated columns in primary keys, indexes, and constraints

Keith Medcalf

On Wednesday, 30 October, 2019 13:23, Darren Duncan <[hidden email]> wrote:

>On 2019-10-30 12:02 p.m., Simon Slavin wrote:
>> On 30 Oct 2019, at 6:56pm, Darren Duncan <[hidden email]> wrote:

>>> "Generated columns may not be used as part of the PRIMARY KEY. (Future
>>> versions of SQLite might relax this constraint for STORED columns.)"
>>>
>>> Replace with this:
>>>
>>> "VIRTUAL generated columns may not be used as part of the PRIMARY
>>> KEY."

>> What happens if the generated column is based on a calculation
>> including rowid ?  At the time the values are calculated, rowid might not
>> yet be known.

> I suggest one option is adding a restriction that rowid may not be used
> directly or indirectly in a STORED generated column and only in a VIRTUAL one.

The rowid value is always known so this is not an issue.

> One way or another I believe this is surmountable and stored generated
> columns in a primary key is worth it.

In a rowid table, the primary key is the implicit or explicit rowid.  "primary key(...)" in such tables is merely and alternate spelling of unique(...), and you can specify an index on a generated column computed from the (explicit) rowid.  But since the rowid is already the primary key you cannot specify another one and if you do then your spelling "primary key" is automatically translated into the spelling "unique".  However, you must actually spell "primary key" as "unique" to generate a unique index on a generated column in a rowid table even if you have not explicitly declared the rowid column.

The restriction is that you cannot use a generated column as the primary key of a WITHOUT ROWID table, however, for those tables there is no rowid so something has to be the actual primary key since it cannot be the rowid.  However, in order to use a generated column as a primary key in a WITHOUT ROWID table, the value would have to be stored.  You can still create other indexes using the generated columns (whether virtual or stored) but you still need a stored primary key.  Lifting the restriction on the use of a stored generated column as the primary key for a without rowid table is the only restriction that could be lifted.

--
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Generated columns in primary keys, indexes, and constraints

Darren Duncan
On 2019-10-30 12:52 p.m., Keith Medcalf wrote:

>
> On Wednesday, 30 October, 2019 13:23, Darren Duncan <[hidden email]> wrote:
>
>> On 2019-10-30 12:02 p.m., Simon Slavin wrote:
>>> On 30 Oct 2019, at 6:56pm, Darren Duncan <[hidden email]> wrote:
>
>>>> "Generated columns may not be used as part of the PRIMARY KEY. (Future
>>>> versions of SQLite might relax this constraint for STORED columns.)"
>>>>
>>>> Replace with this:
>>>>
>>>> "VIRTUAL generated columns may not be used as part of the PRIMARY
>>>> KEY."
>
>>> What happens if the generated column is based on a calculation
>>> including rowid ?  At the time the values are calculated, rowid might not
>>> yet be known.
>
>> I suggest one option is adding a restriction that rowid may not be used
>> directly or indirectly in a STORED generated column and only in a VIRTUAL one.
>
> The rowid value is always known so this is not an issue.
>
>> One way or another I believe this is surmountable and stored generated
>> columns in a primary key is worth it.
>
> In a rowid table, the primary key is the implicit or explicit rowid.  "primary key(...)" in such tables is merely and alternate spelling of unique(...), and you can specify an index on a generated column computed from the (explicit) rowid.  But since the rowid is already the primary key you cannot specify another one and if you do then your spelling "primary key" is automatically translated into the spelling "unique".  However, you must actually spell "primary key" as "unique" to generate a unique index on a generated column in a rowid table even if you have not explicitly declared the rowid column.
>
> The restriction is that you cannot use a generated column as the primary key of a WITHOUT ROWID table, however, for those tables there is no rowid so something has to be the actual primary key since it cannot be the rowid.  However, in order to use a generated column as a primary key in a WITHOUT ROWID table, the value would have to be stored.  You can still create other indexes using the generated columns (whether virtual or stored) but you still need a stored primary key.  Lifting the restriction on the use of a stored generated column as the primary key for a without rowid table is the only restriction that could be lifted.

Okay, I think we're on the same page, so to be clear the scenario I want to have
supported is that a STORED generated column can be used in the PRIMARY KEY of a
WITHOUT ROWID table.

-- Darren Duncan
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Generated columns in primary keys, indexes, and constraints

Dominique Devienne
In reply to this post by Richard Hipp-3
On Wed, Oct 30, 2019 at 11:12 AM Richard Hipp <[hidden email]> wrote:

> [...] But using a VIRTUAL generated column as a PRIMARY KEY would be an
> issue.
>

FWIW, I depend on this feature in Oracle, and it works. Both my PRIMARY and
FOREIGN keys
are VIRTUAL columns, which combine two stored columns. Oracle even allows
to ALTER TABLE
to go from a mode where PKs and FKs are scalar and using stored columns,
and another where
those PKs and FKs are switched to (still scalar) constraints but using
these VIRTUAL columns instead.
The big advantage of this is that the upgrade (one way or another)
"theoretically" writes nothing on disk
since the stored columns do not change at all, only VIRTUALs and
CONSTRAINTS are added/removed.

An alternative to this design would be to use composite PKs and FKs, but
our app depends heavily
on efficient handling of SELECT ... WHERE pk_col in (:1),  with :1 being
bound to a collection/array
of PK values, and I could not figure out a way to do the equivalent with a
composite PK. (and that's
also one reason why I've requested several times for a way to bind
collections in SQLite, in a way
that's not dependent on carray() which I'm not fond of since does not
handle lifetime)

So maybe it's not the best reason for VIRTUAL columns being allowed as PKs,
but that's my $0.02. --DD
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users