Non-unique columns in unique index

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
25 messages Options
12
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Non-unique columns in unique index

Deon Brewis
Is there way to add non-unique columns in a unique index?

I would like to use the same index to enforce unique constraints, as well as giving a covered result for other queries.

Something like an 'INCLUDE' would also work (actually even better). E.g.

CREATE UNIQUE INDEX indx  ON Foo(UniqueCol) INCLUDE (ExtraCol)


If not, is there a way to efficiently implement a UNIQUE constraint in a different way? (Trigger maybe?)

-Deon
 
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

Simon Slavin-3

On 1 Mar 2017, at 5:00pm, Deon Brewis <[hidden email]> wrote:

> Is there way to add non-unique columns in a unique index?

I don’t know of anything that operates in that way.

> If not, is there a way to efficiently implement a UNIQUE constraint in a different way? (Trigger maybe?)

Not efficiently.  You could make a non-UNIQUE index which has the columns you want UNIQUE at the beginning.  Then you could include a check for uniqueness in your own code.  This would probably tempt SQLite to use that index when checking uniqueness.  But this strikes me as less efficient than just letting SQLite do its own thing.

But it would be far more efficient just to create two indexes, one for the UNIQUE and one for coverage.  Or just create the UNIQUE index and let SQLite get the other columns itself.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

Clemens Ladisch
In reply to this post by Deon Brewis
Deon Brewis wrote:
> Is there way to add non-unique columns in a unique index?
>
> I would like to use the same index to enforce unique constraints, as well as giving a covered result for other queries.
>
> CREATE UNIQUE INDEX indx  ON Foo(UniqueCol) INCLUDE (ExtraCol)

CREATE TABLE Foo (
  UniqueCol PRIMARY KEY,
  ExtraCol,
  [...]
) WITHOUT ROWID;


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

Deon Brewis
Thanks, thought about it. But I have a ROWID requirement as well due to foreign keys referring back to this table via the ROWID.

I could in theory put that just the RowId in an additional index, but then the RowId index will just carry the entire UniqueCol again - back to the same problem, just reversed.

To throw the specific design out there:

This is a frequent read, infrequent update table:

CREATE TABLE Blah(
    Id Integer Primary Key,
    UniqueCol blob, // 20 bytes fixed
    ExtraCol blob, // 12 bytes fixed
    UNIQUE(UniqueCol)
);  // ~36 bytes/row

CREATE INDEX sqlite_autoindex_Resource_1 on Blah  ( // implicit index
    UniqueCol,   // 20 bytes
     // Id -- implicit
)  // ~24 bytes/row

CREATE INDEX blahIndex on Blah ( // actual needed index for workload
    UniqueCol, // 20 bytes
    ExtraCol,  // 12 bytes
     // Id -- implicit
)  // ~36 bytes/row

So this 3rd index is exactly as big as the original table. I would love to get rid of at least the 2 vs. 3 redundancy here.

- Deon

-----Original Message-----
From: sqlite-users [mailto:[hidden email]] On Behalf Of Clemens Ladisch
Sent: Wednesday, March 1, 2017 9:58 AM
To: [hidden email]
Subject: Re: [sqlite] Non-unique columns in unique index

Deon Brewis wrote:
> Is there way to add non-unique columns in a unique index?
>
> I would like to use the same index to enforce unique constraints, as well as giving a covered result for other queries.
>
> CREATE UNIQUE INDEX indx  ON Foo(UniqueCol) INCLUDE (ExtraCol)

CREATE TABLE Foo (
  UniqueCol PRIMARY KEY,
  ExtraCol,
  [...]
) WITHOUT ROWID;


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

James K. Lowden
In reply to this post by Deon Brewis
On Wed, 1 Mar 2017 17:00:16 +0000
Deon Brewis <[hidden email]> wrote:

> If not, is there a way to efficiently implement a UNIQUE constraint
> in a different way? (Trigger maybe?)

I'll defer to others about how SQLite will respond, but I'd be tempted
to do this:

        create table Foo(UniqueCol primary key, ExtraCol, ... );
        create index i_cover_foo on Foo(UniqueCol, ExtraCol);

I expect how useful/expensive that is would depend on cardinality.  

--jkl


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

David Raymond
In reply to this post by Deon Brewis
The trigger version you asked about would look something like the below I believe. More risky than having the two indexes, but should work. (Famous last words)

create table Blah
(
  Id integer primary key,
  UniqueCol blob,
  ExtraCol blob
);

create index blahIndex on Blah (UniqueCol, ExtraCol);

create trigger trg_Blah_UniqueCol_insert
before insert on Blah
for each row
when exists (select 1 from Blah where UniqueCol = new.UniqueCol)
begin select raise(abort, 'UNIQUE constraint failed: Blah.UniqueCol'); end;

create trigger trg_Blah_UniqueCol_update
before update of UniqueCol on Blah
for each row
when exists (select 1 from Blah where UniqueCol = new.UniqueCol)
begin select raise(abort, 'UNIQUE constraint failed: Blah.UniqueCol'); end;


-----Original Message-----
From: sqlite-users [mailto:[hidden email]] On Behalf Of Deon Brewis
Sent: Wednesday, March 01, 2017 1:34 PM
To: SQLite mailing list
Subject: Re: [sqlite] Non-unique columns in unique index

Thanks, thought about it. But I have a ROWID requirement as well due to foreign keys referring back to this table via the ROWID.

I could in theory put that just the RowId in an additional index, but then the RowId index will just carry the entire UniqueCol again - back to the same problem, just reversed.

To throw the specific design out there:

This is a frequent read, infrequent update table:

CREATE TABLE Blah(
    Id Integer Primary Key,
    UniqueCol blob, // 20 bytes fixed
    ExtraCol blob, // 12 bytes fixed
    UNIQUE(UniqueCol)
);  // ~36 bytes/row

CREATE INDEX sqlite_autoindex_Resource_1 on Blah  ( // implicit index
    UniqueCol,   // 20 bytes
     // Id -- implicit
)  // ~24 bytes/row

CREATE INDEX blahIndex on Blah ( // actual needed index for workload
    UniqueCol, // 20 bytes
    ExtraCol,  // 12 bytes
     // Id -- implicit
)  // ~36 bytes/row

So this 3rd index is exactly as big as the original table. I would love to get rid of at least the 2 vs. 3 redundancy here.

- Deon

-----Original Message-----
From: sqlite-users [mailto:[hidden email]] On Behalf Of Clemens Ladisch
Sent: Wednesday, March 1, 2017 9:58 AM
To: [hidden email]
Subject: Re: [sqlite] Non-unique columns in unique index

Deon Brewis wrote:
> Is there way to add non-unique columns in a unique index?
>
> I would like to use the same index to enforce unique constraints, as well as giving a covered result for other queries.
>
> CREATE UNIQUE INDEX indx  ON Foo(UniqueCol) INCLUDE (ExtraCol)

CREATE TABLE Foo (
  UniqueCol PRIMARY KEY,
  ExtraCol,
  [...]
) WITHOUT ROWID;


Regards,
Clemens
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

Simon Slavin-3

On 1 Mar 2017, at 8:21pm, David Raymond <[hidden email]> wrote:

> The trigger version you asked about would look something like the below I believe. More risky than having the two indexes, but should work. (Famous last words)

I have no doubt it would work.  But look how much space and processing time it would take up.  Far simpler and clearer just to create the two indexes.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

Deon Brewis
"But look how much space and processing time it would take up"

Can you clarify what you mean by "space" ?

The processing time argument I understand.


I think this is one of those things that if the database engine doesn't internally support it, it can't really be emulated.

Sure would be nice to have INCLUDE columns support (here and in other places).

- Deon

-----Original Message-----
From: sqlite-users [mailto:[hidden email]] On Behalf Of Simon Slavin
Sent: Wednesday, March 1, 2017 12:57 PM
To: SQLite mailing list <[hidden email]>
Subject: Re: [sqlite] Non-unique columns in unique index


On 1 Mar 2017, at 8:21pm, David Raymond <[hidden email]> wrote:

> The trigger version you asked about would look something like the below I believe. More risky than having the two indexes, but should work. (Famous last words)

I have no doubt it would work.  But look how much space and processing time it would take up.  Far simpler and clearer just to create the two indexes.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

Simon Slavin-3

On 1 Mar 2017, at 9:11pm, Deon Brewis <[hidden email]> wrote:

> "But look how much space and processing time it would take up"
>
> Can you clarify what you mean by "space" ?

Your triggers have to be stored.  Every time they’re activated (each time you insert a row) they have to be run.  That requires SQLite to grab enough memory and/or disk space to run the triggers and to construct any temporary indexes they require.

Also, it’s a nightmare to debug in comparison with just creating one extra index.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

Deon Brewis
Yeah ok, but that is paltry compared with the gb's of diskspace that the actual second index takes up. But thanks for clarifying.

-Deon

-----Original Message-----
From: sqlite-users [mailto:[hidden email]] On Behalf Of Simon Slavin
Sent: Wednesday, March 1, 2017 1:38 PM
To: SQLite mailing list <[hidden email]>
Subject: Re: [sqlite] Non-unique columns in unique index


On 1 Mar 2017, at 9:11pm, Deon Brewis <[hidden email]> wrote:

> "But look how much space and processing time it would take up"
>
> Can you clarify what you mean by "space" ?

Your triggers have to be stored.  Every time they’re activated (each time you insert a row) they have to be run.  That requires SQLite to grab enough memory and/or disk space to run the triggers and to construct any temporary indexes they require.

Also, it’s a nightmare to debug in comparison with just creating one extra index.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

Simon Slavin-3

On 1 Mar 2017, at 9:41pm, Deon Brewis <[hidden email]> wrote:

> Yeah ok, but that is paltry compared with the gb's of diskspace that the actual second index takes up. But thanks for clarifying.

Ah.  If it’s really GBs of disk space then I can see why you’d look for alternative solutions.

But I have a 43 GB database file which could be 20 GB without an extra index.  I could have written that extra check in code, and reduced the file size, but I decided not to.  Because once I had developed procedures to handle a 20 GB file, I might was well be dealing with a 43 GB file anyway.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

Paul Sanderson
As a bit of an off the wall suggestion you could try an MD5 (or even a
partial MD5 - half of the bytes)

CREATE table hashes (
    hash integer primary key; // just the first 64 bits of the hash of
uniquecol and extracol
)

as an integer primary key the hash would be an alias of the rowid and so
storage would be 8 bytes plus admin

the chance of a random colliison based on a 64 bit hash would be (I think)
1:9,223,372,036,854,775,808

MD5 is broken but would work OK for this

use a trigger to abort the insert into blah if the insert into hashes fails.


Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence

On 1 March 2017 at 21:54, Simon Slavin <[hidden email]> wrote:

>
> On 1 Mar 2017, at 9:41pm, Deon Brewis <[hidden email]> wrote:
>
> > Yeah ok, but that is paltry compared with the gb's of diskspace that the
> actual second index takes up. But thanks for clarifying.
>
> Ah.  If it’s really GBs of disk space then I can see why you’d look for
> alternative solutions.
>
> But I have a 43 GB database file which could be 20 GB without an extra
> index.  I could have written that extra check in code, and reduced the file
> size, but I decided not to.  Because once I had developed procedures to
> handle a 20 GB file, I might was well be dealing with a 43 GB file anyway.
>
> Simon.
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

Paul Sanderson
Hmm - a bit of google foo and refreshing of chances of collions means my
1:9,223,372,036,854,775,808 is way off

That is the chance of any two hashes colliding - you'll have lot sof hashes.

The basic idea might be sound though even if you stick with a full MD5 it
should save a chunk of storage

Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence

On 1 March 2017 at 22:13, Paul Sanderson <[hidden email]>
wrote:

> As a bit of an off the wall suggestion you could try an MD5 (or even a
> partial MD5 - half of the bytes)
>
> CREATE table hashes (
>     hash integer primary key; // just the first 64 bits of the hash of
> uniquecol and extracol
> )
>
> as an integer primary key the hash would be an alias of the rowid and so
> storage would be 8 bytes plus admin
>
> the chance of a random colliison based on a 64 bit hash would be (I think)
> 1:9,223,372,036,854,775,808
>
> MD5 is broken but would work OK for this
>
> use a trigger to abort the insert into blah if the insert into hashes
> fails.
>
>
> Paul
> www.sandersonforensics.com
> skype: r3scue193
> twitter: @sandersonforens
> Tel +44 (0)1326 572786 <+44%201326%20572786>
> http://sandersonforensics.com/forum/content.php?195-SQLite-
> Forensic-Toolkit -Forensic Toolkit for SQLite
> email from a work address for a fully functional demo licence
>
> On 1 March 2017 at 21:54, Simon Slavin <[hidden email]> wrote:
>
>>
>> On 1 Mar 2017, at 9:41pm, Deon Brewis <[hidden email]> wrote:
>>
>> > Yeah ok, but that is paltry compared with the gb's of diskspace that
>> the actual second index takes up. But thanks for clarifying.
>>
>> Ah.  If it’s really GBs of disk space then I can see why you’d look for
>> alternative solutions.
>>
>> But I have a 43 GB database file which could be 20 GB without an extra
>> index.  I could have written that extra check in code, and reduced the file
>> size, but I decided not to.  Because once I had developed procedures to
>> handle a 20 GB file, I might was well be dealing with a 43 GB file anyway.
>>
>> Simon.
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

Deon Brewis
Right, you need to use the birthday attack algorithm for determining collision risk.

I wouldn't mind hashing - but there is an additional complication - the query I want the covered field in the index for is this:

SELECT ExtraCol FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end

So I would need a hashing algorithm that's usably small and doesn't collide, yet preserves the properties that if  a < b then  hash(a) < hash(b).

- Deon

-----Original Message-----
From: sqlite-users [mailto:[hidden email]] On Behalf Of Paul Sanderson
Sent: Wednesday, March 1, 2017 2:19 PM
To: [hidden email]
Cc: SQLite mailing list <[hidden email]>
Subject: Re: [sqlite] Non-unique columns in unique index

Hmm - a bit of google foo and refreshing of chances of collions means my
1:9,223,372,036,854,775,808 is way off

That is the chance of any two hashes colliding - you'll have lot sof hashes.

The basic idea might be sound though even if you stick with a full MD5 it should save a chunk of storage

Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence

On 1 March 2017 at 22:13, Paul Sanderson <[hidden email]>
wrote:

> As a bit of an off the wall suggestion you could try an MD5 (or even a
> partial MD5 - half of the bytes)
>
> CREATE table hashes (
>     hash integer primary key; // just the first 64 bits of the hash of
> uniquecol and extracol
> )
>
> as an integer primary key the hash would be an alias of the rowid and
> so storage would be 8 bytes plus admin
>
> the chance of a random colliison based on a 64 bit hash would be (I
> think)
> 1:9,223,372,036,854,775,808
>
> MD5 is broken but would work OK for this
>
> use a trigger to abort the insert into blah if the insert into hashes
> fails.
>
>
> Paul
> www.sandersonforensics.com
> skype: r3scue193
> twitter: @sandersonforens
> Tel +44 (0)1326 572786 <+44%201326%20572786>
> http://sandersonforensics.com/forum/content.php?195-SQLite-
> Forensic-Toolkit -Forensic Toolkit for SQLite email from a work
> address for a fully functional demo licence
>
> On 1 March 2017 at 21:54, Simon Slavin <[hidden email]> wrote:
>
>>
>> On 1 Mar 2017, at 9:41pm, Deon Brewis <[hidden email]> wrote:
>>
>> > Yeah ok, but that is paltry compared with the gb's of diskspace
>> > that
>> the actual second index takes up. But thanks for clarifying.
>>
>> Ah.  If it’s really GBs of disk space then I can see why you’d look
>> for alternative solutions.
>>
>> But I have a 43 GB database file which could be 20 GB without an
>> extra index.  I could have written that extra check in code, and
>> reduced the file size, but I decided not to.  Because once I had
>> developed procedures to handle a 20 GB file, I might was well be dealing with a 43 GB file anyway.
>>
>> Simon.
>> _______________________________________________
>> sqlite-users mailing list
>> [hidden email]
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

David Raymond
In reply to this post by Simon Slavin-3
The extra space would just be the few bytes to store the text, which is nothing compared to the "extra" index size. And since it's "infrequent update" then it wouldn't be too bad for the extra time. The normal way would have an extra index to update, the trigger way would be less efficient. On an in-memory test of 10,000,000 updates, the trigger way took 14% more time, and only 2% more when I did it on disk. The "extra" unique index added 36% to the page count.

I will say that the "debugging nightmare" comment is 100% absolutely true though, and is why I'd personally bite the bullet and go with the unique keyword. I'm just pointing out how it would be done.



--just testing output below this line--



SQLite version 3.17.0 2017-02-13 16:02:40
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.

sqlite> create table foo
   ...> (
   ...>   Id integer primary key,
   ...>   UniqueCol int unique,
   ...>   ExtraCol int
   ...> );
Run Time: real 0.000 user 0.000000 sys 0.000000

sqlite> create index fooIndex on foo (UniqueCol, ExtraCol);
Run Time: real 0.000 user 0.000000 sys 0.000000

sqlite> explain query plan insert into foo values (1, 1, 1);
Run Time: real 0.000 user 0.000000 sys 0.000000

sqlite> explain insert into foo values (1, 1, 1);
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     28    0                    00  Start at 28
1     OpenWrite      0     2     0     3              00  root=2 iDb=0; foo
2     OpenWrite      1     4     0     k(3,,,)        00  root=4 iDb=0; fooIndex
3     OpenWrite      2     3     0     k(2,,)         00  root=3 iDb=0; sqlite_autoindex_foo_1
4     Integer        1     1     0                    00  r[1]=1
5     NotNull        1     7     0                    00  if r[1]!=NULL goto 7
6     NewRowid       0     1     0                    00  r[1]=rowid
7     MustBeInt      1     0     0                    00
8     SoftNull       2     0     0                    00  r[2]=NULL
9     Integer        1     3     0                    00  r[3]=1
10    Integer        1     4     0                    00  r[4]=1
11    NotExists      0     13    1                    00  intkey=r[1]
12    Halt           1555  2     0     foo.Id         02
13    Affinity       2     3     0     DDD            00  affinity(r[2..4])
14    SCopy          3     6     0                    00  r[6]=r[3]; UniqueCol
15    SCopy          4     7     0                    00  r[7]=r[4]; ExtraCol
16    IntCopy        1     8     0                    00  r[8]=r[1]; rowid
17    MakeRecord     6     3     5                    00  r[5]=mkrec(r[6..8]); for fooIndex
18    SCopy          3     10    0                    00  r[10]=r[3]; UniqueCol
19    IntCopy        1     11    0                    00  r[11]=r[1]; rowid
20    MakeRecord     10    2     9                    00  r[9]=mkrec(r[10..11]); for sqlite_autoindex_foo_1
21    NoConflict     2     23    10    1              00  key=r[10]
22    Halt           2067  2     0     foo.UniqueCol  02
23    IdxInsert      1     5     6     3              10  key=r[5]
24    IdxInsert      2     9     10    2              10  key=r[9]
25    MakeRecord     2     3     12                   00  r[12]=mkrec(r[2..4])
26    Insert         0     12    1     foo            31  intkey=r[1] data=r[12]
27    Halt           0     0     0                    00
28    Transaction    0     1     2     0              01  usesStmtJournal=0
29    Goto           0     1     0                    00
Run Time: real 0.031 user 0.000000 sys 0.000000



sqlite> create table bar
   ...> (
   ...>   Id integer primary key,
   ...>   UniqueCol int,
   ...>   ExtraCol int
   ...> );
Run Time: real 0.000 user 0.000000 sys 0.000000

sqlite> create index barIndex on bar (UniqueCol, ExtraCol);
Run Time: real 0.000 user 0.000000 sys 0.000000

sqlite> create trigger bar_trg_UniqueCol_insert
   ...> before insert on bar
   ...> for each row
   ...> when exists (select 1 from bar where UniqueCol = new.UniqueCol)
   ...> begin select raise(abort, 'UNIQUE constraint failed: bar.UniqueCol'); end;
Run Time: real 0.000 user 0.000000 sys 0.000000

sqlite> create trigger bar_trg_UniqueCol_update
   ...> before update of UniqueCol on bar
   ...> for each row
   ...> when exists (select 1 from bar where UniqueCol = new.UniqueCol)
   ...> begin select raise(abort, 'UNIQUE constraint failed: bar.UniqueCol'); end;
Run Time: real 0.000 user 0.000000 sys 0.000000

sqlite> explain insert into bar values (1, 1, 1);
addr  opcode         p1    p2    p3    p4             p5  comment
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     33    0                    00  Start at 33
1     OpenWrite      0     5     0     3              00  root=5 iDb=0; bar
2     OpenWrite      1     6     0     k(3,,,)        00  root=6 iDb=0; barIndex
3     Integer        1     9     0                    00  r[9]=1
4     NotNull        9     6     0                    00  if r[9]!=NULL goto 6
5     Integer        -1    9     0                    00  r[9]=-1
6     MustBeInt      9     0     0                    00
7     Integer        1     10    0                    00  r[10]=1
8     Copy           10    13    0                    00  r[13]=r[10]
9     Integer        1     11    0                    00  r[11]=1
10    Copy           11    14    0                    00  r[14]=r[11]
11    Integer        1     12    0                    00  r[12]=1
12    Copy           12    15    0                    00  r[15]=r[12]
13    Affinity       10    3     0     DDD            00  affinity(r[10..12])
14    Program        5     32    16    program        01  Call: bar_trg_UniqueCol_insert.default
15    Copy           13    1     0                    00  r[1]=r[13]
16    NotNull        1     18    0                    00  if r[1]!=NULL goto 18
17    NewRowid       0     1     0                    00  r[1]=rowid
18    MustBeInt      1     0     0                    00
19    SoftNull       2     0     0                    00  r[2]=NULL
20    Copy           14    3     0                    00  r[3]=r[14]
21    Copy           15    4     0                    00  r[4]=r[15]
22    NotExists      0     24    1                    00  intkey=r[1]
23    Halt           1555  2     0     bar.Id         02
24    Affinity       2     3     0     DDD            00  affinity(r[2..4])
25    SCopy          3     6     0                    00  r[6]=r[3]; UniqueCol
26    SCopy          4     7     0                    00  r[7]=r[4]; ExtraCol
27    IntCopy        1     8     0                    00  r[8]=r[1]; rowid
28    MakeRecord     6     3     5                    00  r[5]=mkrec(r[6..8]); for barIndex
29    IdxInsert      1     5     6     3              10  key=r[5]
30    MakeRecord     2     3     17                   00  r[17]=mkrec(r[2..4])
31    Insert         0     17    1     bar            31  intkey=r[1] data=r[17]
32    Halt           0     0     0                    00
33    Transaction    0     1     6     0              01  usesStmtJournal=1
34    Goto           0     1     0                    00
0     Init           0     1     0     -- TRIGGER bar_trg_UniqueCol_insert  00  Start at 1; Start: bar_trg_UniqueCol_insert.default (BEFORE INSERT ON bar)
1     Once           0     13    0                    00
2     Integer        0     2     0                    00  r[2]=0; Init EXISTS result
3     Integer        1     3     0                    00  r[3]=1; LIMIT counter
4     OpenRead       1     6     0     k(3,,,)        02  root=6 iDb=0; barIndex
5     Param          6     4     0                    00  new.UniqueCol -> $4
6     IsNull         4     13    0                    00  if r[4]==NULL goto 13
7     Affinity       4     1     0     D              00  affinity(r[4])
8     SeekGE         1     13    4     1              00  key=r[4]
9       IdxGT          1     13    4     1              00  key=r[4]
10      Integer        1     2     0                    00  r[2]=1
11      DecrJumpZero   3     13    0                    00  if (--r[3])==0 goto 13
12    Next           1     9     0                    00
13    IfNot          2     15    1                    00
14    Halt           1811  2     0     UNIQUE constraint failed: bar.UniqueCol  00
15    Halt           0     0     0                    00  End: bar_trg_UniqueCol_insert.default
Run Time: real 0.063 user 0.000000 sys 0.015600


--saved to disk and vacuumed here

sqlite> with recursive x (Id, UniqueCol, ExtraCol) as (values (1, 1, 1) union all select Id + 1, UniqueCol + 1, ExtraCol + 1 from x limit 10000000) insert into foo select * from x;
--EQP-- 3,0,0,SCAN TABLE x
--EQP-- 1,0,0,COMPOUND SUBQUERIES 0 AND 0 (UNION ALL)
--EQP-- 0,0,0,SCAN SUBQUERY 1
Run Time: real 26.020 user 21.762140 sys 1.232408

sqlite> with recursive x (Id, UniqueCol, ExtraCol) as (values (1, 1, 1) union all select Id + 1, UniqueCol + 1, ExtraCol + 1 from x limit 10000000) insert into bar select * from x;
--EQP-- 3,0,0,SCAN TABLE x
--EQP-- 1,0,0,COMPOUND SUBQUERIES 0 AND 0 (UNION ALL)
--EQP-- 0,0,0,SCAN SUBQUERY 1
Run Time: real 26.535 user 23.914953 sys 1.388409

sqlite> analyze;
Run Time: real 3.557 user 2.652017 sys 0.904806

sqlite> vacuum;
Run Time: real 41.371 user 12.043277 sys 8.408454

--ran sqlite3_analyzer
*** Page counts for all tables with their indices *****************************

FOO............................................... 112646      57.7%
BAR............................................... 82440       42.3%
SQLITE_MASTER..................................... 1            0.0%
SQLITE_STAT1...................................... 1            0.0%

*** Page counts for all tables and indices separately *************************

BAR............................................... 42030       21.5%
FOO............................................... 42030       21.5%
BARINDEX.......................................... 40410       20.7%
FOOINDEX.......................................... 40410       20.7%
SQLITE_AUTOINDEX_FOO_1............................ 30206       15.5%
SQLITE_MASTER..................................... 1            0.0%
SQLITE_STAT1...................................... 1            0.0%


-----Original Message-----
From: sqlite-users [mailto:[hidden email]] On Behalf Of Simon Slavin
Sent: Wednesday, March 01, 2017 4:38 PM
To: SQLite mailing list
Subject: Re: [sqlite] Non-unique columns in unique index


On 1 Mar 2017, at 9:11pm, Deon Brewis <[hidden email]> wrote:

> "But look how much space and processing time it would take up"
>
> Can you clarify what you mean by "space" ?

Your triggers have to be stored.  Every time they’re activated (each time you insert a row) they have to be run.  That requires SQLite to grab enough memory and/or disk space to run the triggers and to construct any temporary indexes they require.

Also, it’s a nightmare to debug in comparison with just creating one extra index.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

Keith Medcalf
In reply to this post by Deon Brewis

On Wednesday, 1 March, 2017 14:12, Deon Brewis <[hidden email]> wrote:

> "But look how much space and processing time it would take up"
 
> Can you clarify what you mean by "space" ?
 
> The processing time argument I understand.

I doubt there is any "space" requirement at all.  In fact, since you are not carrying an extra index, the space requirement will be significantly reduced.  

Your only real concern would be that on INSERT and UPDATE operations the _prepare would take a little longer since it would have to retrieve and generate the trigger code for the statement.  You pay the price during statement prepare, but I doubt that the actual statement execution time will be much affected (the execution would have to do a "lookup" in the unique index anyway to determine whether the operation meets the unique requirement) as you are merely substitution one operation for another like operation.  Of course, the internal unique check is done entirely inline, whereas the trigger method is done by running additional VBDE code -- though of course the overall time to run the trigger may approximate the time taken to maintain an the extra index -- in which case you would save the space used by the extra index and have no (or a minor) execution time penalty on insert/update operations.

> I think this is one of those things that if the database engine doesn't
> internally support it, it can't really be emulated.
>
> Sure would be nice to have INCLUDE columns support (here and in other
> places).
>
> - Deon
>
> -----Original Message-----
> From: sqlite-users [mailto:[hidden email]]
> On Behalf Of Simon Slavin
> Sent: Wednesday, March 1, 2017 12:57 PM
> To: SQLite mailing list <[hidden email]>
> Subject: Re: [sqlite] Non-unique columns in unique index
>
>
> On 1 Mar 2017, at 8:21pm, David Raymond <[hidden email]> wrote:
>
> > The trigger version you asked about would look something like the below
> I believe. More risky than having the two indexes, but should work.
> (Famous last words)
>
> I have no doubt it would work.  But look how much space and processing
> time it would take up.  Far simpler and clearer just to create the two
> indexes.
>
> Simon.
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

R Smith
In reply to this post by Deon Brewis
Any Index that starts with a Unique column will by definition be Unique.
Of course in your case you want the Uniqueness of only the first column
to be enforced, but you want to lookup also using the second column
(either or both). Why insist on having it in a covering Index though?

Why not just make one Unique index, and one other Index on the other
column by itself? Like so:

+CREATE TABLE Blah(
     Id Integer Primary Key,
     UniqueCol blob, // 20 bytes fixed
     ExtraCol blob, // 12 bytes fixed
     UNIQUE(UniqueCol)
);  // ~36 bytes/row

CREATE INDEX Blah_ExtraIdx on Blah(ExtraCol);


This will be close to the same size of one covering Index.
The Query planner is clever enough to figure out it can use only the second Index to look up some queries, it can also use the Unique Index if needed for a query and it will use it to enforce uniqueness.

You get all the speed and a little extra overhead space, but much less than a covering Index.

This Query (for instance) will be exceedingly fast:
SELECT ExtraCol FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end

So would this:
SELECT * FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end AND ExtraCol = xxx;

And this:
SELECT * FROM Foo WHERE ExtraCol = xxx;

All of these will be super fast and your Indices will take up the least possible space.

There is some saving in cycles if you can read the bytes directly out of a covering Index rather than a big table, but this is no big table, it should be real quick. If you really really really need those few cycles saved, invest the MBs and make the covering Index additional. If space is a problem, use only the single Unique index.

You cannot optimize for everything, pick your favourite thing and optimize for that.

Cheers,
Ryan

On 2017/03/01 7:00 PM, Deon Brewis wrote:

> Is there way to add non-unique columns in a unique index?
>
> I would like to use the same index to enforce unique constraints, as well as giving a covered result for other queries.
>
> Something like an 'INCLUDE' would also work (actually even better). E.g.
>
> CREATE UNIQUE INDEX indx  ON Foo(UniqueCol) INCLUDE (ExtraCol)
>
>
> If not, is there a way to efficiently implement a UNIQUE constraint in a different way? (Trigger maybe?)
>
> -Deon
>  
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

Deon Brewis
"Any Index that starts with a Unique column will by definition be Unique."
Sorry, yet, I admit the title of the thread is confusing. What I meant to say is that it's TOO unique :). Adding the additional columns will allow duplicates on the columns where duplicates should not be allowed.


"This Query (for instance) will be exceedingly fast:
SELECT ExtraCol FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end"

No, that's not covered. I've tried that before, that query is too slow when it isn't covered - the table is many GB's and the result is needed in low milliseconds. Also, I don't need an index directly on ExtraCol like specified below for the ( WHERE ExtraCol = ...) case . It's not useful a useful lookup by itself - only a useful result. So the only reason to include it in an index is to make it covered.


"You cannot optimize for everything, pick your favourite thing and optimize for that."

This is a bit of a call to support INCLUDE columns in indexes. This optimization is very easy in SQL Server using that.

- Deon

-----Original Message-----
From: sqlite-users [mailto:[hidden email]] On Behalf Of R Smith
Sent: Thursday, March 2, 2017 2:50 AM
To: [hidden email]
Subject: Re: [sqlite] Non-unique columns in unique index

Any Index that starts with a Unique column will by definition be Unique.
Of course in your case you want the Uniqueness of only the first column to be enforced, but you want to lookup also using the second column (either or both). Why insist on having it in a covering Index though?

Why not just make one Unique index, and one other Index on the other column by itself? Like so:

+CREATE TABLE Blah(
     Id Integer Primary Key,
     UniqueCol blob, // 20 bytes fixed
     ExtraCol blob, // 12 bytes fixed
     UNIQUE(UniqueCol)
);  // ~36 bytes/row

CREATE INDEX Blah_ExtraIdx on Blah(ExtraCol);


This will be close to the same size of one covering Index.
The Query planner is clever enough to figure out it can use only the second Index to look up some queries, it can also use the Unique Index if needed for a query and it will use it to enforce uniqueness.

You get all the speed and a little extra overhead space, but much less than a covering Index.

This Query (for instance) will be exceedingly fast:
SELECT ExtraCol FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end

So would this:
SELECT * FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end AND ExtraCol = xxx;

And this:
SELECT * FROM Foo WHERE ExtraCol = xxx;

All of these will be super fast and your Indices will take up the least possible space.

There is some saving in cycles if you can read the bytes directly out of a covering Index rather than a big table, but this is no big table, it should be real quick. If you really really really need those few cycles saved, invest the MBs and make the covering Index additional. If space is a problem, use only the single Unique index.

You cannot optimize for everything, pick your favourite thing and optimize for that.

Cheers,
Ryan

On 2017/03/01 7:00 PM, Deon Brewis wrote:

> Is there way to add non-unique columns in a unique index?
>
> I would like to use the same index to enforce unique constraints, as well as giving a covered result for other queries.
>
> Something like an 'INCLUDE' would also work (actually even better). E.g.
>
> CREATE UNIQUE INDEX indx  ON Foo(UniqueCol) INCLUDE (ExtraCol)
>
>
> If not, is there a way to efficiently implement a UNIQUE constraint in
> a different way? (Trigger maybe?)
>
> -Deon
>  
> _______________________________________________
> sqlite-users mailing list
> [hidden email]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

R Smith


On 2017/03/02 2:29 PM, Deon Brewis wrote:
> "This Query (for instance) will be exceedingly fast:
> SELECT ExtraCol FROM Foo WHERE UniqueCol > range_start AND UniqueCol < range_end"
>
> No, that's not covered. I've tried that before, that query is too slow when it isn't covered - the table is many GB's and the result is needed in low milliseconds. Also, I don't need an index directly on ExtraCol like specified below for the ( WHERE ExtraCol = ...) case . It's not useful a useful lookup by itself - only a useful result. So the only reason to include it in an index is to make it covered.

Ah yes, so you /do/ need every bit of work cycle saving you can get....

> "You cannot optimize for everything, pick your favourite thing and optimize for that."
>
> This is a bit of a call to support INCLUDE columns in indexes. This optimization is very easy in SQL Server using that.

Well yes, so in SQLite you can optimize either for the speed or for the
space, not for both. In MSSQL you can optimize for both using INCLUDE,
and I can see how it is useful in specifically your case.

That said, I doubt it would make it into development code for SQLite at
this point for the simple reason that it adds weight to the engine
(adding INCLUDE) for a very fringe-case optimization.
Perhaps there are other reasons to add INCLUDE that are less fringe-case?


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Non-unique columns in unique index

Hick Gunter
In reply to this post by Deon Brewis
I tried to create a test table and two indices thus:

>create temp table test (id integer primary key, name text unique, bs integer, data text);
>create unique index plain on test(name);
>create unique index cover on test(name,bs);

NB: The field name has a unique constraint

As long as the query mentions the additional field, sqlite chooses the covering index.

>explain query plan select name,bs from test where name='test' and bs=1;
>sele  order          from  deta
>----  -------------  ----  ----
>0     0              0     SEARCH TABLE test USING COVERING INDEX cover (name=? AND bs=?) (~1 rows)

unfortunately if thsi field is dropped from the where clause but remains in the select list, sqlite misses the covering index

>explain query plan select name,bs from test where name='test';
>sele  order          from  deta
>----  -------------  ----  ----
>0     0              0     SEARCH TABLE test USING INDEX plain (name=?) (~1 rows)

not having an index at alls causes sqlite to create an automatic index

>drop index cover;
>drop index plain;
>explain query plan select name,bs from test where name='test';
>sele  order          from  deta
>----  -------------  ----  ----
>0     0              0     SEARCH TABLE test USING INDEX sqlite_autoindex_test_1 (name=?) (~1 rows)

just the same if the index is not unique (even though the first field is)

>create index plain on test(name);
>create index cover on test(name,bs);
>explain query plan select name,bs from test where name='test';
>sele  order          from  deta
>----  -------------  ----  ----
>0     0              0     SEARCH TABLE test USING INDEX sqlite_autoindex_test_1 (name=?) (~1 rows)

I guess such a schema would be quite unusual, so this optimization opportunity is missed

-----Ursprüngliche Nachricht-----
>"Any Index that starts with a Unique column will by definition be Unique."
>Sorry, yet, I admit the title of the thread is confusing. What I meant to say is that it's TOO unique :). Adding the additional columns will allow duplicates on the columns where duplicates should not be allowed.

I don't think so. There can only be one entry for the unique field, and the extra column captures the corresponding value of the one and only matching record.



___________________________________________
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: [hidden email]

This communication (including any attachments) is intended for the use of the intended recipient(s) only and may contain information that is confidential, privileged or legally protected. Any unauthorized use or dissemination of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the sender by return e-mail message and delete all copies of the original communication. Thank you for your cooperation.


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
12
Loading...