The SQL standard defines four levels of transaction isolation. The most strict is Serializable, which is defined by the standard in a paragraph which says that any concurrent execution of a set of Serializable transactions is guaranteed to produce the same effect as running them one at a time in some order. The other three levels are defined in terms of phenomena, resulting from interaction between concurrent transactions, which must not occur at each level. The standard notes that due to the definition of Serializable, none of these phenomena are possible at that level. (This is hardly surprising -- if the effect of the transactions must be consistent with having been run one at a time, how could you see any phenomena caused by interactions?)
The phenomena which are prohibited at various levels are:
A transaction reads data written by a concurrent uncommitted transaction.
A transaction re-reads data it has previously read and finds that data has been modified by another transaction (that committed since the initial read).
A transaction re-executes a query returning a set of rows that satisfy a search condition and finds that the set of rows satisfying the condition has changed due to another recently-committed transaction.
The result of successfully committing a group of transactions is inconsistent with all possible orderings of running those transactions one at a time.
The SQL standard and LightDB-implemented transaction isolation levels are described in Table 14.1.
Table 14.1. Transaction Isolation Levels
Isolation Level | Dirty Read | Nonrepeatable Read | Phantom Read | Serialization Anomaly |
---|---|---|---|---|
Read uncommitted | Allowed, but not in LightDB | Possible | Possible | Possible |
Read committed | Not possible | Possible | Possible | Possible |
Repeatable read | Not possible | Not possible | Allowed, but not in LightDB | Possible |
Serializable | Not possible | Not possible | Not possible | Not possible |
In LightDB, you can request any of the four standard transaction isolation levels, but internally only three distinct isolation levels are implemented, i.e., LightDB's Read Uncommitted mode behaves like Read Committed. This is because it is the only sensible way to map the standard isolation levels to LightDB's multiversion concurrency control architecture.
The table also shows that LightDB's Repeatable Read implementation does not allow phantom reads. This is acceptable under the SQL standard because the standard specifies which anomalies must not occur at certain isolation levels; higher guarantees are acceptable. The behavior of the available isolation levels is detailed in the following subsections.
To set the transaction isolation level of a transaction, use the command SET TRANSACTION.
Some LightDB data types and functions have
special rules regarding transactional behavior. In particular, changes
made to a sequence (and therefore the counter of a
column declared using serial
) are immediately visible
to all other transactions and are not rolled back if the transaction
that made the changes aborts. See Section 10.16
and Section 9.1.4.
Read Committed is the default isolation
level in LightDB. When a transaction
uses this isolation level, a SELECT
query
(without a FOR UPDATE/SHARE
clause) sees only data
committed before the query began; it never sees either uncommitted
data or changes committed during query execution by concurrent
transactions. In effect, a SELECT
query sees
a snapshot of the database as of the instant the query begins to
run. However, SELECT
does see the effects
of previous updates executed within its own transaction, even
though they are not yet committed. Also note that two successive
SELECT
commands can see different data, even
though they are within a single transaction, if other transactions
commit changes after the first SELECT
starts and
before the second SELECT
starts.
UPDATE
, DELETE
, SELECT
FOR UPDATE
, and SELECT FOR SHARE
commands
behave the same as SELECT
in terms of searching for target rows: they will only find target rows
that were committed as of the command start time. However, such a target
row might have already been updated (or deleted or locked) by
another concurrent transaction by the time it is found. In this case, the
would-be updater will wait for the first updating transaction to commit or
roll back (if it is still in progress). If the first updater rolls back,
then its effects are negated and the second updater can proceed with
updating the originally found row. If the first updater commits, the
second updater will ignore the row if the first updater deleted it,
otherwise it will attempt to apply its operation to the updated version of
the row. The search condition of the command (the WHERE
clause) is
re-evaluated to see if the updated version of the row still matches the
search condition. If so, the second updater proceeds with its operation
using the updated version of the row. In the case of
SELECT FOR UPDATE
and SELECT FOR
SHARE
, this means it is the updated version of the row that is
locked and returned to the client.
INSERT
with an ON CONFLICT DO UPDATE
clause
behaves similarly. In Read Committed mode, each row proposed for insertion
will either insert or update. Unless there are unrelated errors, one of
those two outcomes is guaranteed. If a conflict originates in another
transaction whose effects are not yet visible to the INSERT
, the UPDATE
clause will affect that row,
even though possibly no version of that row is
conventionally visible to the command.
INSERT
with an ON CONFLICT DO
NOTHING
clause may have insertion not proceed for a row due to
the outcome of another transaction whose effects are not visible
to the INSERT
snapshot. Again, this is only
the case in Read Committed mode.
Because of the above rules, it is possible for an updating command to see an inconsistent snapshot: it can see the effects of concurrent updating commands on the same rows it is trying to update, but it does not see effects of those commands on other rows in the database. This behavior makes Read Committed mode unsuitable for commands that involve complex search conditions; however, it is just right for simpler cases. For example, consider updating bank balances with transactions like:
BEGIN; UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 12345; UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 7534; COMMIT;
If two such transactions concurrently try to change the balance of account 12345, we clearly want the second transaction to start with the updated version of the account's row. Because each command is affecting only a predetermined row, letting it see the updated version of the row does not create any troublesome inconsistency.
More complex usage can produce undesirable results in Read Committed
mode. For example, consider a DELETE
command
operating on data that is being both added and removed from its
restriction criteria by another command, e.g., assume
website
is a two-row table with
website.hits
equaling 9
and
10
:
BEGIN; UPDATE website SET hits = hits + 1; -- run from another session: DELETE FROM website WHERE hits = 10; COMMIT;
The DELETE
will have no effect even though
there is a website.hits = 10
row before and
after the UPDATE
. This occurs because the
pre-update row value 9
is skipped, and when the
UPDATE
completes and DELETE
obtains a lock, the new row value is no longer 10
but
11
, which no longer matches the criteria.
Because Read Committed mode starts each command with a new snapshot that includes all transactions committed up to that instant, subsequent commands in the same transaction will see the effects of the committed concurrent transaction in any case. The point at issue above is whether or not a single command sees an absolutely consistent view of the database.
The partial transaction isolation provided by Read Committed mode is adequate for many applications, and this mode is fast and simple to use; however, it is not sufficient for all cases. Applications that do complex queries and updates might require a more rigorously consistent view of the database than Read Committed mode provides.
The Repeatable Read isolation level only sees data committed before the transaction began; it never sees either uncommitted data or changes committed during transaction execution by concurrent transactions. (However, the query does see the effects of previous updates executed within its own transaction, even though they are not yet committed.) This is a stronger guarantee than is required by the SQL standard for this isolation level, and prevents all of the phenomena described in Table 14.1 except for serialization anomalies. As mentioned above, this is specifically allowed by the standard, which only describes the minimum protections each isolation level must provide.
This level is different from Read Committed in that a query in a
repeatable read transaction sees a snapshot as of the start of the
first non-transaction-control statement in the
transaction, not as of the start
of the current statement within the transaction. Thus, successive
SELECT
commands within a single
transaction see the same data, i.e., they do not see changes made by
other transactions that committed after their own transaction started.
Applications using this level must be prepared to retry transactions due to serialization failures.
UPDATE
, DELETE
, SELECT
FOR UPDATE
, and SELECT FOR SHARE
commands
behave the same as SELECT
in terms of searching for target rows: they will only find target rows
that were committed as of the transaction start time. However, such a
target row might have already been updated (or deleted or locked) by
another concurrent transaction by the time it is found. In this case, the
repeatable read transaction will wait for the first updating transaction to commit or
roll back (if it is still in progress). If the first updater rolls back,
then its effects are negated and the repeatable read transaction can proceed
with updating the originally found row. But if the first updater commits
(and actually updated or deleted the row, not just locked it)
then the repeatable read transaction will be rolled back with the message
ERROR: could not serialize access due to concurrent update
because a repeatable read transaction cannot modify or lock rows changed by other transactions after the repeatable read transaction began.
When an application receives this error message, it should abort the current transaction and retry the whole transaction from the beginning. The second time through, the transaction will see the previously-committed change as part of its initial view of the database, so there is no logical conflict in using the new version of the row as the starting point for the new transaction's update.
Note that only updating transactions might need to be retried; read-only transactions will never have serialization conflicts.
The Repeatable Read mode provides a rigorous guarantee that each transaction sees a completely stable view of the database. However, this view will not necessarily always be consistent with some serial (one at a time) execution of concurrent transactions of the same level. For example, even a read only transaction at this level may see a control record updated to show that a batch has been completed but not see one of the detail records which is logically part of the batch because it read an earlier revision of the control record. Attempts to enforce business rules by transactions running at this isolation level are not likely to work correctly without careful use of explicit locks to block conflicting transactions.
The Repeatable Read isolation level is implemented using a technique known in academic database literature and in some other database products as Snapshot Isolation. Differences in behavior and performance may be observed when compared with systems that use a traditional locking technique that reduces concurrency. Some other systems may even offer Repeatable Read and Snapshot Isolation as distinct isolation levels with different behavior. The permitted phenomena that distinguish the two techniques were not formalized by database researchers until after the SQL standard was developed, and are outside the scope of this manual. For a full treatment, please see [berenson95].
For now, request for the Serializable transaction isolation level provided exactly the same behavior described here.
The behavior of Serializable isolation level is exactly same as Repeatable Read isolation level in LightDB for now.