From the Archives: Update Statements and Lower Iso...

JasonHinsperger · ‎03-11-2015

In this post, originally written by Glenn Paulley and posted to sybase.com in July of 2011, Glenn talks about the trade-offs when using different isolation levels and updating data in the database.

Routinely, application developers trade off serializable transaction semantics in favour of better execution time performance by limiting the potential for lock contention. Few and far between are applications that execute at ISO/ANSI SQL isolation level 3, SERIALIZABLE. Indeed, the SQL Anywhere default isolation level is zero - READ UNCOMMITTED - except for JDBC applications, where the default is READ COMMITTED.

At the READ UNCOMMITTED isolation level with SQL Anywhere, only schema locks and write row locks are acquired by a transaction during its operation; read row locks are never acquired, and so at READ UNCOMMITTED write transaction do not block read transactions. On the flip side, however, SQL Anywhere does not guarantee semantics at the READ UNCOMMITTED isolation level. To use the common parlance, you get what you pay for. With many applications, the risk and/or impact of uncommitted rows is low; sometimes this can lead to complacency about what READ UNCOMMITTED really means. In this post, I want to illustrate an example where the impact is more obvious.

Set-level UPDATE operations

In the SQL Anywhere Version 5.5 release, circa 1997, we introduced full support for set-level UPDATE statements that could modify columns that were part of a table's PRIMARY KEY, UNIQUE constraint, or part of a unique index, in support of the ISO SQL/1992 standard which was the current SQL standard at that time. To illustrate, suppose we have the following table:



CREATE TABLE updkey ( a INTEGER PRIMARY KEY, b INTEGER UNIQUE, c VARCHAR(500) )

populated by the following INSERT statement:



INSERT INTO updkey(a,b,c) SELECT row_num, row_num, 'test string' FROM rowgenerator WHERE row_num

In this example, we desire to renumber the "b" values of all ten rows using a single atomic statement. We can do so as follows:



UPDATE updkey SET b = 11-b, c = 'New value'

Processing the UPDATE statement row-by-row clearly won't do, since the update of any single row in the updkey table will immediately violate the uniqueness constraint on column "b". (Aside: you may be thinking the WAIT_ON_COMMIT connection option might help here, but WAIT_ON_COMMIT only affects referential integrity constraints, not uniqueness constraints). Consequently, Version 5.5 of SQL Anywhere provided a different mechanism to perform the update, and it has implications for lower levels of concurrency control, as we shall see.

The HOLD temporary table

When the SQL Anywhere server processes an UPDATE or MERGE statement and encounters a uniqueness constraint violation on a primary key, unique index, or unique constraint, the server automatically creates an unnamed "hold" temporary table to temporarily store the problematic rows. The temporary table contains both the before and after values of a row, so that AFTER row and AFTER statement triggers can work correctly. Processing the rows is done row-by-row as follows:

If the row can be modified without a uniqueness constraint violation, the update proceeds normally.
If the modification causes a uniqueness constraint violation, then
1. the row's contents, along with its new values, are copied to the hold temporary table;
2. the row, along with its index entries, is - temporarily - deleted from the base table. No DELETE triggers are fired for this temporary deletion.
any appropriate AFTER row triggers are fired for this row.

Once all of the rows have been processed, any deleted rows that have been copied to the hold temporary table are then re-inserted into the base table, with the modified values from the UPDATE or MERGE statement. The order in which the rows from the hold temporary table are processed is not guaranteed. If the re-insertion of any of the saved rows still causes a uniqueness violation, then the entire UPDATE or MERGE statement is rolled back, and the uniqueness constraint violation is reported back to the application.

Only if all row modifications are successful are any AFTER statement triggers fired for the request.

Implications

The effect of deleting rows during the execution of an INSERT or MERGE statement can impact the results of

an SQL statement that queries the same table, issued within an AFTER row trigger that is fired for the UPDATE or MERGE statement that initiated the action; or
any other connection, including event handlers, that are not executing at the SERIALIZABLE or SNAPSHOT isolation levels.

The semantics of this processing of set-level update operations is somewhat counter-intuitive, since on the surface you might expect that another connection concurrently querying the table would either "see" the old row values, or the new row values. However, with set-level update operations on tables with uniqueness constraints, there is the possibility that other connections will not see a particular row at all, depending on the isolation level being used. If the other connection is executing at the SERIALIZABLE isolation level, it will block until the transaction doing the update issues a COMMIT or ROLLBACK. If the other connection is executing at SNAPSHOT isolation, that transaction will continue to see the original values of the modified rows for the duration of that transaction.

This detailed, complex behaviour has previously been undocumented. It will appear in the standard documentation in the next major release of SQL Anywhere.