UPSERT

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

UPSERT

Robert M. Münch
Hi, is there an elegant way to simulate a dynamic UPSERT in SQLite since it doesn’t has an UPSERT?

So, if I have a table with 30 columns and my code updates sub-sets out of these columns, I don’t want to write queries that manually retrieve the old values one by one.

insert or replace into Book (ID, Name, TypeID, Level, Seen) values (
        (select ID from Book where Name = "SearchName"),
   "SearchName",
    5,
    6,
    (select Seen from Book where Name = "SearchName"));

So, for every column I don’t want to change I need to add a sub-select statement. If I need to build this statement dynamically, IMO it would be better to handle this code directly in code:

if(record-exists?){
        UPDATE …
} else {
        INSERT …
}

Any suggestions / feedback?

Viele Grüsse.

--

Robert M. Münch, CEO
M: +41 79 65 11 49 6

Saphirion AG
smarter | better | faster

http://www.saphirion.com
http://www.nlpp.ch

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

signature.asc (537 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [EXTERNAL] UPSERT

Hick Gunter
No. Insert or replace will delete any and all records that violate constraints, and then insert a new record. This may cause ON DELETE CASCADE foreign key constraints to fire.

Do it in code as you suggested yourself.

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:[hidden email]] Im Auftrag von Robert M. Münch
Gesendet: Freitag, 16. März 2018 11:25
An: SQLite mailing list <[hidden email]>
Betreff: [EXTERNAL] [sqlite] UPSERT

Hi, is there an elegant way to simulate a dynamic UPSERT in SQLite since it doesn’t has an UPSERT?

So, if I have a table with 30 columns and my code updates sub-sets out of these columns, I don’t want to write queries that manually retrieve the old values one by one.

insert or replace into Book (ID, Name, TypeID, Level, Seen) values (
        (select ID from Book where Name = "SearchName"),
   "SearchName",
    5,
    6,
    (select Seen from Book where Name = "SearchName"));

So, for every column I don’t want to change I need to add a sub-select statement. If I need to build this statement dynamically, IMO it would be better to handle this code directly in code:

if(record-exists?){
        UPDATE …
} else {
        INSERT …
}

Any suggestions / feedback?

Viele Grüsse.

--

Robert M. Münch, CEO
M: +41 79 65 11 49 6

Saphirion AG
smarter | better | faster

http://www.saphirion.com
http://www.nlpp.ch


___________________________________________
 Gunter Hick | Software Engineer | Scientific Games International GmbH | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: UPSERT

Simon Slavin-3
In reply to this post by Robert M. Münch


On 16 Mar 2018, at 10:24am, Robert M. Münch <[hidden email]> wrote:

> for every column I don’t want to change I need to add a sub-select statement. If I need to build this statement dynamically, IMO it would be better to handle this code directly in code:
>
> if(record-exists?){
> UPDATE …
> } else {
> INSERT …
> }

You can use INSERT OR IGNORE for this.  First, do an INSERT OR IGNORE command which will add a dummy entry, with the right key, if one doesn't already exist.  Then use an UPDATE command to update the row which now definitely exists.  If you wrap the two commands in a transaction then even if you get power-failure or crash, you will never get a dummy row with no data in the database:

    BEGIN
        INSERT OR IGNORE <get the key values correct here>
        UPDATE <set the data values here>
    COMMIT

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: UPSERT

Rowan Worth-2
In reply to this post by Robert M. Münch
On 16 March 2018 at 18:24, Robert M. Münch <[hidden email]>
wrote:

> Hi, is there an elegant way to simulate a dynamic UPSERT in SQLite since
> it doesn’t has an UPSERT?
>
> So, if I have a table with 30 columns and my code updates sub-sets out of
> these columns, I don’t want to write queries that manually retrieve the old
> values one by one.
>
> insert or replace into Book (ID, Name, TypeID, Level, Seen) values (
>         (select ID from Book where Name = "SearchName"),
>    "SearchName",
>     5,
>     6,
>     (select Seen from Book where Name = "SearchName"));
>
> So, for every column I don’t want to change I need to add a sub-select
> statement. If I need to build this statement dynamically, IMO it would be
> better to handle this code directly in code:
>
> if(record-exists?){
>         UPDATE …
> } else {
>         INSERT …
> }
>

Doing it with an if means you always run two queries - the first to
determine whether a row exists:

SELECT EXISTS (SELECT ID from Book where Name = "SearchName")

There's two approaches which reduce the best case to a single query:

1. Immediately try to INSERT using the ON ABORT conflict strategy (ie.the
default). If the query fails with SQLITE_CONSTRAINT you know the row is
already present, so run the UPDATE.
2. Immediately try to UPDATE the existing row. Then call sqlite3_changes to
determine how many rows were updated - if zero then you know the row didn't
exist, so run the INSERT.

Whether this makes a significant difference in practice I don't know :)
-Rowan
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: UPSERT

Olivier Mascia
In reply to this post by Simon Slavin-3
> Le 16 mars 2018 à 11:35, Simon Slavin <[hidden email]> a écrit :
>
>> for every column I don’t want to change I need to add a sub-select statement. If I need to build this statement dynamically, IMO it would be better to handle this code directly in code:
>>
>> if(record-exists?){
>> UPDATE …
>> } else {
>> INSERT …
>> }
>
> You can use INSERT OR IGNORE for this.  First, do an INSERT OR IGNORE command which will add a dummy entry, with the right key, if one doesn't already exist.  Then use an UPDATE command to update the row which now definitely exists.  If you wrap the two commands in a transaction then even if you get power-failure or crash, you will never get a dummy row with no data in the database:
>
>    BEGIN
>        INSERT OR IGNORE <get the key values correct here>
>        UPDATE <set the data values here>
>    COMMIT

I often do so, slightly differently, passing in all the appropriate values at the INSERT OR IGNORE stage, and testing its outcome through sqlite3_changes() to learn if the insert did insert or ignore. Allows to run the update only if the insert did nothing.

SQLite is an engine in a library and not a black-box-server to which you could only "talk" through pure SQL, so I always like to think of the database processing as a collaborative merge of the host code and SQLite features.

The syntax UPDATE table set (X,Y,Z) = (V1,V2,V3) greatly helps in helper layers to prepare the UPDATE in a syntactic way close to the INSERT INTO table(X,Y,Z) values(V1,V2,V3). Only have to account for the WHERE clause of the UPDATE.

--
Best Regards, Meilleures salutations, Met vriendelijke groeten,
Olivier Mascia


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: UPSERT

Robert M. Münch
In reply to this post by Simon Slavin-3
On 16 Mar 2018, at 11:35, Simon Slavin wrote:

> You can use INSERT OR IGNORE for this.  First, do an INSERT OR IGNORE command which will add a dummy entry, with the right key, if one doesn't already exist.
> …

Hi, yes, I thought about this too but really don’t like it and this assumes that you don’t use AUTOINCREMENT on a table otherwise this strategy will fail.

Viele Grüsse.

--

Robert M. Münch, CEO
M: +41 79 65 11 49 6

Saphirion AG
smarter | better | faster

http://www.saphirion.com
http://www.nlpp.ch

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

signature.asc (537 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: UPSERT

Simon Slavin-3
On 16 Mar 2018, at 12:43pm, Robert M. Münch <[hidden email]> wrote:

> On 16 Mar 2018, at 11:35, Simon Slavin wrote:
>
>> You can use INSERT OR IGNORE for this.  First, do an INSERT OR IGNORE command which will add a dummy entry, with the right key, if one doesn't already exist.
>> …
>
> Hi, yes, I thought about this too but really don’t like it and this assumes that you don’t use AUTOINCREMENT on a table otherwise this strategy will fail.

There's no problem with AUTOINCREMENT.  Your UPDATE command should be using your own key values, not a rowid generated automatically.

But yes, if you have other objections against using INSERT OR IGNORE then there are alternatives.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: UPSERT

Robert M. Münch
In reply to this post by Rowan Worth-2
On 16 Mar 2018, at 11:41, Rowan Worth wrote:

> Doing it with an if means you always run two queries - the first to
> determine whether a row exists:
>
> SELECT EXISTS (SELECT ID from Book where Name = "SearchName")

Hi, yes that’s true. Not critical in my case but if you have a slow FFI in between this is not ideal.

> There's two approaches which reduce the best case to a single query:
>
> 1. Immediately try to INSERT using the ON ABORT conflict strategy (ie.the
> default). If the query fails with SQLITE_CONSTRAINT you know the row is
> already present, so run the UPDATE.

Yes. In my case we crash hard on any SQLite problems as such a constraint violation shouldn’t happen at all and if indicates an invalid application state.

> 2. Immediately try to UPDATE the existing row. Then call sqlite3_changes to
> determine how many rows were updated - if zero then you know the row didn't
> exist, so run the INSERT.

That’s a nice one. I’m going to try this. At least you don’t run two queries but you need two calls, replacing the query for the #of-rows-changed query, which should be quicker.

Viele Grüsse.

--

Robert M. Münch, CEO
M: +41 79 65 11 49 6

Saphirion AG
smarter | better | faster

http://www.saphirion.com
http://www.nlpp.ch

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

signature.asc (537 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: UPSERT

Paul
In reply to this post by Robert M. Münch
A few years back I've been asking the same question. To be honest, there's no more  
efficient alternative, than the one that can be implemented within library itself.
Both performance-wise and productivity-wise.

Doing hacks with INSERT + UPDATE or UPDATE + INSERT leaves us with problems:
 * Which strategy to choose, INSERT + UPDATE or the reverse? No way to make this generic without hurting performance
 * No matter the strategy, we end up with two queries which leads to extra code that has to be maintained and kept in sync plus a penalty from preparing two statements
 * Existence of two statements leaves us vulnerable to race conditions, which adds two extra statements to BEGIN and COMMIT a transaction

Even if for some reason we dismiss all of the said above, UPSERT scenario is waaaay to common.
Society wise it's much more efficient to make a change in the library for a convenience of
thousands (or millions?) of library users rather than leaving it up for them to figure out.


Best regards,
Paul

16 March 2018, 12:25:06, by "Robert M. Münch" <[hidden email]>:

> Hi, is there an elegant way to simulate a dynamic UPSERT in SQLite since it doesn’t has an UPSERT?
>
> So, if I have a table with 30 columns and my code updates sub-sets out of these columns, I don’t want to write queries that manually retrieve the old values one by one.
>
> insert or replace into Book (ID, Name, TypeID, Level, Seen) values ( (select ID from Book where Name = "SearchName"),
>    "SearchName",
>     5,
>     6,
>     (select Seen from Book where Name = "SearchName"));
>
> So, for every column I don’t want to change I need to add a sub-select statement. If I need to build this statement dynamically, IMO it would be better to handle this code directly in code:
>
> if(record-exists?){ UPDATE …
> } else { INSERT …
> }
>
> Any suggestions / feedback?
>
> Viele Grüsse.
>
> --
>
> Robert M. Münch, CEO
> M: +41 79 65 11 49 6
>
> Saphirion AG
> smarter | better | faster
>
> http://www.saphirion.comhttp://www.nlpp.ch
> _______________________________________________
> sqlite-users mailing list
> [hidden email]://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: UPSERT

Rowan Worth-2
On 16 March 2018 at 21:44, Paul <[hidden email]> wrote:

> A few years back I've been asking the same question. To be honest, there's
> no more
> efficient alternative, than the one that can be implemented within library
> itself.
> Both performance-wise and productivity-wise.
>
> Doing hacks with INSERT + UPDATE or UPDATE + INSERT leaves us with
> problems:
>  * Which strategy to choose, INSERT + UPDATE or the reverse? No way to
> make this generic without hurting performance
>  * No matter the strategy, we end up with two queries which leads to extra
> code that has to be maintained and kept in sync plus a penalty from
> preparing two statements
>  * Existence of two statements leaves us vulnerable to race conditions,
> which adds two extra statements to BEGIN and COMMIT a transaction
>

I agree with your overall sentiment, but BEGIN/COMMIT actually eliminates
two statements because in the standard mode of operation (ie. autocommit)
you're essentially doing:

(implicit) BEGIN
INSERT ...
(implicit) COMMIT
(implicit) BEGIN
UPDATE ...
(implicit) COMMIT

By making the BEGIN/COMMIT explicit you reduce the overall work when two
statements are required.

It does seem like sqlite could avoid an extra btree lookup if it
implemented UPSERT itself, but since the required pages are practically
guaranteed to be in cache for the second query I wonder how many rows you'd
need in a table for it to make a significant difference. As you say the
main benefit would be to avoid synthesising two statements in user code.

-Rowan
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: UPSERT

Paul


19 March 2018, 09:26:15, by "Rowan Worth" <[hidden email]>:

> On 16 March 2018 at 21:44, Paul <> [hidden email]> wrote:
>
> > A few years back I've been asking the same question. To be honest, there's
> > no more
> > efficient alternative, than the one that can be implemented within library
> > itself.
> > Both performance-wise and productivity-wise.
> >
> > Doing hacks with INSERT + UPDATE or UPDATE + INSERT leaves us with
> > problems:
> >  * Which strategy to choose, INSERT + UPDATE or the reverse? No way to
> > make this generic without hurting performance
> >  * No matter the strategy, we end up with two queries which leads to extra
> > code that has to be maintained and kept in sync plus a penalty from
> > preparing two statements
> >  * Existence of two statements leaves us vulnerable to race conditions,
> > which adds two extra statements to BEGIN and COMMIT a transaction
> >
>
> I agree with your overall sentiment, but BEGIN/COMMIT actually eliminates
> two statements because in the standard mode of operation (ie. autocommit)
> you're essentially doing:
>
> (implicit) BEGIN
> INSERT ...
> (implicit) COMMIT
> (implicit) BEGIN
> UPDATE ...
> (implicit) COMMIT
>
> By making the BEGIN/COMMIT explicit you reduce the overall work when two
> statements are required.

I agree with you here. But then again you have to issue both BEGIN and COMMIT
through SQL statements, ie through the parser. Even if overhead is small, it's
still present. It all depends on the scenario that database is used in.
If you have a lot of scenarios when you INSERT or UPDATE some data quite frequently
then overhead is visible. By manually tweaking strategies in various places
(whether to use UPDATE first or INSERT) I've managed to improve performance
by tens of percents. Then again, it depends on scenario. Users that use this
model not very often will definitely not benefit that much from UPSERT.
But the ability to work around, and potentially small benefit to an average
user should not be the arguments to dismiss its implementation. After all,
half of the features in SQLite3 are not that useful to an average user.

>
> It does seem like sqlite could avoid an extra btree lookup if it
> implemented UPSERT itself, but since the required pages are practically
> guaranteed to be in cache for the second query I wonder how many rows you'd
> need in a table for it to make a significant difference. As you say the
> main benefit would be to avoid synthesising two statements in user code.

It's easy to calculate, exactly twice as much time as it takes to do a B-Tree
lookup. How big is the piece of the pie, again, depends on the scenario.

Fort me personally, the most sad thing is an annoyance. Because I have to
maintain two almost identical queries and manually tweak strategies.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: UPSERT

R Smith-2

On 2018/03/19 10:36 AM, Paul wrote:
> Fort me personally, the most sad thing is an annoyance. Because I have to
> maintain two almost identical queries and manually tweak strategies.

I think that there presents the entire UPSERT argument's only real
motivation, because it really is negligible from a Database performance
point of view.  Never the less, many a lazy programmer do yearn for it
(I don't specifically need it, but I am a lazy programmer too, and
probably would use it if it existed!), and it seems trivial to implement
db-side.

We keep going in circles between those of us explaining why it's not
really better and those of us who really feel warranted in asking for it.

I never see a dev speak up on this - Is there any dev thoughts on why
this isn't considered yet?

I mean, do we keep lobbying for it, or is there a reason we should all
just drop it (and that we might explain to the next person coming on
here and asking for it)?.


Cheers,
Ryan

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: UPSERT

Olivier Mascia
> Le 19 mars 2018 à 11:28, R Smith <[hidden email]> a écrit :
>
> On 2018/03/19 10:36 AM, Paul wrote:
>> Fort me personally, the most sad thing is an annoyance. Because I have to
>> maintain two almost identical queries and manually tweak strategies.
>
> I think that there presents the entire UPSERT argument's only real motivation, because it really is negligible from a Database performance point of view.  Never the less, many a lazy programmer do yearn for it (I don't specifically need it, but I am a lazy programmer too, and probably would use it if it existed!), and it seems trivial to implement db-side.
>
> We keep going in circles between those of us explaining why it's not really better and those of us who really feel warranted in asking for it.
>
> I never see a dev speak up on this - Is there any dev thoughts on why this isn't considered yet?
>
> I mean, do we keep lobbying for it, or is there a reason we should all just drop it (and that we might explain to the next person coming on here and asking for it)?.

I don't know what any 'standard' SQL defines about this.
I know that FirebirdSQL (where I came from, before meeting SQLite) did/does it this way:

UPDATE OR INSERT INTO
   {tablename | viewname} [(<columns>)]
   VALUES (<values>)
   [MATCHING (<columns>)]
   [RETURNING <values> [INTO <variables>]]

Setting aside the RETURNING option and also its INTO clause only meant for stored procedures, clearly the syntax UPDATE OR INSERT gets its feet on SQLite UPDATE OR ROLLBACK/ABORT/REPLACE/FAIL/IGNORE construction which have absolutely not the same kind of semantic attached to them (these are conflict resolutions on verb UPDATE).  This probably complicates things significantly, and tend to weight on the 'Lite' part of the name 'SQLite'.

I would have like to have it in SQLite at some point in past time (would have made the transition easier), though learned easily to live without it. :)

--
Best Regards, Meilleures salutations, Met vriendelijke groeten,
Olivier Mascia


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: UPSERT

Peter da Silva
In reply to this post by Paul
On 3/19/18, 3:37 AM, "sqlite-users on behalf of Paul" <[hidden email] on behalf of [hidden email]> wrote:
> Fort me personally, the most sad thing is an annoyance. Because I have to maintain two almost identical queries and manually tweak strategies.

I almost always generate queries dynamically if they're "almost identical". Only one-offs get handcrafted queries.

_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: UPSERT

R Smith-2
In reply to this post by Olivier Mascia
On 2018/03/19 1:50 PM, Olivier Mascia wrote:
>
> I don't know what any 'standard' SQL defines about this.
> I know that FirebirdSQL (where I came from, before meeting SQLite) did/does it this way:
>
> UPDATE OR INSERT INTO
>     {tablename | viewname} [(<columns>)]
>     VALUES (<values>)
>     [MATCHING (<columns>)]
>     [RETURNING <values> [INTO <variables>]]

Quite right, and the statement in MSSQL is even more convoluted, which,
if it was in SQLite like this, would require a dynamically created SQL
statement that is worse than simply computing an UPDATE and an INSERT -
which a previous poster already lamented.

My suggestion for UPSERT would be the very simple already SQLite-like
syntax of:

INSERT OR UPDATE INTO t (k1, k2, ... , kn,  f1, f2, ... , fn)
followed by the usual VALUES clause or SELECT query.

Any record found to exist with the exact same value in the Primary Key
field(s) [ k1 .. kn ] has all other fields (that are NOT Primary Key
fields) updated to the new values, and if no such record is found, the
row simply gets inserted.  If the inserted row OR updated values cause
any other constraint to break, then FAIL hard, the same way (and
possibly with the same ON CONFLICT options) as any other single INSERT
or UPDATE would be subjected to.

This is far better than INSERT OR REPLACE since there is no delete, and
no multiple-row delete on constraint violations.
It is simple in terms of converting any current INSERT OR REPLACE query
to an INSERT OR UPDATE query requires changing 1 word only.

Triggers should fire for ON INSERT and ON UPDATE according to whatever
actually is required during the operation.

Adding this has no backward compatibility to break, this did not exist
before and it is not schema-specific.


One possible added refinement might be an optional second field-group
that should be ignored over-and-above the PK fields during the UPDATE.
(During the INSERT of course all fields MUST be added).

2 ways this can be done easily:

  A - Use a separate 2nd prototype group for Non-Updating fields, Like
this perhaps:

INSERT OR UPDATE INTO t (pk1, pk2, f1, f2, f3, f4) NOT (f1, f2) VALUES
(...);  -- This example updates only f3 and f4 if the record already exists.

I just picked "NOT" as the separator, perhaps "KEEP" gives better
clarity (see next example), but any good word would do.
Primary key fields pk1 and pk2 along with specified non-updating fields
f1 and f2 are all ignored during an update, but still used during an
insert.
Adding a PK field to the second set is a no-op as some might like it for
legibility. i.e this next query is equivalent to the above:

INSERT OR UPDATE INTO t (pk1, pk2, f1, f2, f3, f4) KEEP (pk1, pk2, f1,
f2) VALUES (...);  -- This example updates only f3 and f4, same as above.


  B - Use a Marker of sorts for Non-Updating fields, Like this perhaps
using the Exclamation mark:

INSERT OR UPDATE INTO t (pk1, !pk2, !f1, !f2, f3, f4) VALUES (...);  --
Again update only f3 and f4 if the record already exists.

(Adding the marker to a PK field is a no-op).
Escaping is not needed since a fieldname starting with the same marker
will be in the list of field-names, no ambiguity, and in the case where
a set of fields contain fields starting with both one and two markers
(for which the programmer should be shot, but let's assume it possible)
then the field can simply be enclosed in quotes as is the norm for
disambiguation in SQLite. This next example has fields named !f and !!f:

INSERT OR UPDATE INTO t (pk1, pk2, !"!f", !!f) VALUES (...);  -- Here
updating only !!f if the record already exists.


Personally, I'm partial to option A.

I know it's a bit of work, but it seems less so than many of the other
additions - perhaps let's first have another show-of-hands to see if
this a real need, but it is asked for here more frequently than any
other feature (to my perception at least).


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: UPSERT

Paul
I would suggest using the PostgreSQL way:
  https://www.postgresql.org/docs/9.5/static/sql-insert.html

 INSERT INTO ...
   ON CONFLICT [(<column name>)] DO UPDATE
   SET foo = ... , bar = ... ;

This approach is really cool, because we can specify which key is more
important and discard other conflicts as an error. For example, given
the following table:

CREATE TABLE foo(
    id                INTEGER NOT NULL,
    foo_key           TEXT NOT NULL,
    some_data         TEXT,

    PRIMARY KEY(id),
    UNIQUE (foo_key)
);

INSERT INTO foo(id, foo_key, some_data) VALUES(1, "XXX", "...");
INSERT INTO foo(id, foo_key, some_data) VALUES(2, "YYY", "...");

If we are performing a query:


INSERT INTO foo(id, foo_key, some_data) VALUES(1, "YYY", "...")
 <OR UPDATE ...>

Which record should we update and what columns?

Having the ability to specify a specific column on which the conflict
is actually an acceptable event lets the developer to make a decision
how to resolve it:

INSERT INTO foo(id, foo_key, some_data) VALUES(1, "YYY", "...")
 <ON CONFLICT (foo_key) DO UPDATE SET some_data = "...">


19 March 2018, 18:41:34, by "R Smith" <[hidden email]>:

> On 2018/03/19 1:50 PM, Olivier Mascia wrote:
> >
> > I don't know what any 'standard' SQL defines about this.
> > I know that FirebirdSQL (where I came from, before meeting SQLite) did/does it this way:
> >
> > UPDATE OR INSERT INTO
> >     {tablename | viewname} [(<columns>)]
> >     VALUES (<values>)
> >     [MATCHING (<columns>)]
> >     [RETURNING <values> [INTO <variables>]]
>
> Quite right, and the statement in MSSQL is even more convoluted, which,
> if it was in SQLite like this, would require a dynamically created SQL
> statement that is worse than simply computing an UPDATE and an INSERT -
> which a previous poster already lamented.
>
> My suggestion for UPSERT would be the very simple already SQLite-like
> syntax of:
>
> INSERT OR UPDATE INTO t (k1, k2, ... , kn,  f1, f2, ... , fn)
> followed by the usual VALUES clause or SELECT query.
>
> Any record found to exist with the exact same value in the Primary Key
> field(s) [ k1 .. kn ] has all other fields (that are NOT Primary Key
> fields) updated to the new values, and if no such record is found, the
> row simply gets inserted.  If the inserted row OR updated values cause
> any other constraint to break, then FAIL hard, the same way (and
> possibly with the same ON CONFLICT options) as any other single INSERT
> or UPDATE would be subjected to.
>
> This is far better than INSERT OR REPLACE since there is no delete, and
> no multiple-row delete on constraint violations.
> It is simple in terms of converting any current INSERT OR REPLACE query
> to an INSERT OR UPDATE query requires changing 1 word only.
>
> Triggers should fire for ON INSERT and ON UPDATE according to whatever
> actually is required during the operation.
>
> Adding this has no backward compatibility to break, this did not exist
> before and it is not schema-specific.
>
>
> One possible added refinement might be an optional second field-group
> that should be ignored over-and-above the PK fields during the UPDATE.
> (During the INSERT of course all fields MUST be added).
>
> 2 ways this can be done easily:
>
>   A - Use a separate 2nd prototype group for Non-Updating fields, Like
> this perhaps:
>
> INSERT OR UPDATE INTO t (pk1, pk2, f1, f2, f3, f4) NOT (f1, f2) VALUES
> (...);  -- This example updates only f3 and f4 if the record already exists.
>
> I just picked "NOT" as the separator, perhaps "KEEP" gives better
> clarity (see next example), but any good word would do.
> Primary key fields pk1 and pk2 along with specified non-updating fields
> f1 and f2 are all ignored during an update, but still used during an
> insert.
> Adding a PK field to the second set is a no-op as some might like it for
> legibility. i.e this next query is equivalent to the above:
>
> INSERT OR UPDATE INTO t (pk1, pk2, f1, f2, f3, f4) KEEP (pk1, pk2, f1,
> f2) VALUES (...);  -- This example updates only f3 and f4, same as above.
>
>
>   B - Use a Marker of sorts for Non-Updating fields, Like this perhaps
> using the Exclamation mark:
>
> INSERT OR UPDATE INTO t (pk1, !pk2, !f1, !f2, f3, f4) VALUES (...);  --
> Again update only f3 and f4 if the record already exists.
>
> (Adding the marker to a PK field is a no-op).
> Escaping is not needed since a fieldname starting with the same marker
> will be in the list of field-names, no ambiguity, and in the case where
> a set of fields contain fields starting with both one and two markers
> (for which the programmer should be shot, but let's assume it possible)
> then the field can simply be enclosed in quotes as is the norm for
> disambiguation in SQLite. This next example has fields named !f and !!f:
>
> INSERT OR UPDATE INTO t (pk1, pk2, !"!f", !!f) VALUES (...);  -- Here
> updating only !!f if the record already exists.
>
>
> Personally, I'm partial to option A.
>
> I know it's a bit of work, but it seems less so than many of the other
> additions - perhaps let's first have another show-of-hands to see if
> this a real need, but it is asked for here more frequently than any
> other feature (to my perception at least).
>
>
> _______________________________________________
> sqlite-users mailing list
> [hidden email]://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: UPSERT

Keith Medcalf

If you have multiple candidate keys for a single row that match more than one row (or the alternate candidate keys match different rows), your application should explode immediately!  

There is no need to "decide" which row is the correct one to update, you are already in a fatal error situation and need to revisit your database design (it is probably insufficiently normalized) and already self-inconsistent and suffering update anomalies (or you are treating a pseudo-key as a candidate key, which for the purposes of UPDATE or INSERT it is not -- the rowid is a pseudo-key -- (one of/any of) the "other" candidate keys in the row is the true primary key).

CREATE TABLE foo
(
     id        integer primary key,
     foo_key   text not null unique,
     some_data blob
);


  SAVEPOINT UpdateFoo;
     UPDATE foo
        SET some_data = :some_data
      WHERE foo_key = :foo_key;
  INSERT OR
IGNORE INTO foo (foo_key, some_data)
     VALUES (:foo_key, :some_data);
     SELECT id
       FROM foo
      WHERE foo_key = :foo_key;
    RELEASE UpdateFoo;


---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.


>-----Original Message-----
>From: sqlite-users [mailto:sqlite-users-
>[hidden email]] On Behalf Of Paul
>Sent: Monday, 19 March, 2018 11:08
>To: SQLite mailing list
>Subject: Re: [sqlite] UPSERT
>
>I would suggest using the PostgreSQL way:
>  https://www.postgresql.org/docs/9.5/static/sql-insert.html
>
> INSERT INTO ...
>   ON CONFLICT [(<column name>)] DO UPDATE
>   SET foo = ... , bar = ... ;
>
>This approach is really cool, because we can specify which key is
>more
>important and discard other conflicts as an error. For example, given
>the following table:
>
>CREATE TABLE foo(
>    id                INTEGER NOT NULL,
>    foo_key           TEXT NOT NULL,
>    some_data         TEXT,
>
>    PRIMARY KEY(id),
>    UNIQUE (foo_key)
>);
>
>INSERT INTO foo(id, foo_key, some_data) VALUES(1, "XXX", "...");
>INSERT INTO foo(id, foo_key, some_data) VALUES(2, "YYY", "...");
>
>If we are performing a query:
>
>
>INSERT INTO foo(id, foo_key, some_data) VALUES(1, "YYY", "...")
> <OR UPDATE ...>
>
>Which record should we update and what columns?
>
>Having the ability to specify a specific column on which the conflict
>is actually an acceptable event lets the developer to make a decision
>how to resolve it:
>
>INSERT INTO foo(id, foo_key, some_data) VALUES(1, "YYY", "...")
> <ON CONFLICT (foo_key) DO UPDATE SET some_data = "...">
>
>
>19 March 2018, 18:41:34, by "R Smith" <[hidden email]>:
>
>> On 2018/03/19 1:50 PM, Olivier Mascia wrote:
>> >
>> > I don't know what any 'standard' SQL defines about this.
>> > I know that FirebirdSQL (where I came from, before meeting
>SQLite) did/does it this way:
>> >
>> > UPDATE OR INSERT INTO
>> >     {tablename | viewname} [(<columns>)]
>> >     VALUES (<values>)
>> >     [MATCHING (<columns>)]
>> >     [RETURNING <values> [INTO <variables>]]
>>
>> Quite right, and the statement in MSSQL is even more convoluted,
>which,
>> if it was in SQLite like this, would require a dynamically created
>SQL
>> statement that is worse than simply computing an UPDATE and an
>INSERT -
>> which a previous poster already lamented.
>>
>> My suggestion for UPSERT would be the very simple already SQLite-
>like
>> syntax of:
>>
>> INSERT OR UPDATE INTO t (k1, k2, ... , kn,  f1, f2, ... , fn)
>> followed by the usual VALUES clause or SELECT query.
>>
>> Any record found to exist with the exact same value in the Primary
>Key
>> field(s) [ k1 .. kn ] has all other fields (that are NOT Primary
>Key
>> fields) updated to the new values, and if no such record is found,
>the
>> row simply gets inserted.  If the inserted row OR updated values
>cause
>> any other constraint to break, then FAIL hard, the same way (and
>> possibly with the same ON CONFLICT options) as any other single
>INSERT
>> or UPDATE would be subjected to.
>>
>> This is far better than INSERT OR REPLACE since there is no delete,
>and
>> no multiple-row delete on constraint violations.
>> It is simple in terms of converting any current INSERT OR REPLACE
>query
>> to an INSERT OR UPDATE query requires changing 1 word only.
>>
>> Triggers should fire for ON INSERT and ON UPDATE according to
>whatever
>> actually is required during the operation.
>>
>> Adding this has no backward compatibility to break, this did not
>exist
>> before and it is not schema-specific.
>>
>>
>> One possible added refinement might be an optional second field-
>group
>> that should be ignored over-and-above the PK fields during the
>UPDATE.
>> (During the INSERT of course all fields MUST be added).
>>
>> 2 ways this can be done easily:
>>
>>   A - Use a separate 2nd prototype group for Non-Updating fields,
>Like
>> this perhaps:
>>
>> INSERT OR UPDATE INTO t (pk1, pk2, f1, f2, f3, f4) NOT (f1, f2)
>VALUES
>> (...);  -- This example updates only f3 and f4 if the record
>already exists.
>>
>> I just picked "NOT" as the separator, perhaps "KEEP" gives better
>> clarity (see next example), but any good word would do.
>> Primary key fields pk1 and pk2 along with specified non-updating
>fields
>> f1 and f2 are all ignored during an update, but still used during
>an
>> insert.
>> Adding a PK field to the second set is a no-op as some might like
>it for
>> legibility. i.e this next query is equivalent to the above:
>>
>> INSERT OR UPDATE INTO t (pk1, pk2, f1, f2, f3, f4) KEEP (pk1, pk2,
>f1,
>> f2) VALUES (...);  -- This example updates only f3 and f4, same as
>above.
>>
>>
>>   B - Use a Marker of sorts for Non-Updating fields, Like this
>perhaps
>> using the Exclamation mark:
>>
>> INSERT OR UPDATE INTO t (pk1, !pk2, !f1, !f2, f3, f4) VALUES (...);
>--
>> Again update only f3 and f4 if the record already exists.
>>
>> (Adding the marker to a PK field is a no-op).
>> Escaping is not needed since a fieldname starting with the same
>marker
>> will be in the list of field-names, no ambiguity, and in the case
>where
>> a set of fields contain fields starting with both one and two
>markers
>> (for which the programmer should be shot, but let's assume it
>possible)
>> then the field can simply be enclosed in quotes as is the norm for
>> disambiguation in SQLite. This next example has fields named !f and
>!!f:
>>
>> INSERT OR UPDATE INTO t (pk1, pk2, !"!f", !!f) VALUES (...);  --
>Here
>> updating only !!f if the record already exists.
>>
>>
>> Personally, I'm partial to option A.
>>
>> I know it's a bit of work, but it seems less so than many of the
>other
>> additions - perhaps let's first have another show-of-hands to see
>if
>> this a real need, but it is asked for here more frequently than any
>> other feature (to my perception at least).
>>
>>
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-
>[hidden email]://mailinglists.sqlite.org/cgi-
>bin/mailman/listinfo/sqlite-users
>_______________________________________________
>sqlite-users mailing list
>[hidden email]
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users