SQL Server Internals: Snapshot Isolation – Part 1

on May 21, 2018


https://logicalread.com/sql-server-internals-snapshot-isolation-part-2/

SQL Server has been around for over 30 years, and I’ve been working with it for almost as long. I’ve seen a lot of changes over the years (and decades!) and versions of this incredible product. In these posts, I’ll share with you how I look at some of the features or aspects of SQL Server®, sometimes along with a bit of historical perspective.

The ANSI SQL Standard defines four transaction isolation levels that you can read about here: https://en.wikipedia.org/wiki/Isolation_(database_systems). All versions of SQL Server starting with Version 7 support all four of the levels. However, SQL Server 2005 introduced a new level called SNAPSHOT isolation. The word ‘snapshot’ refers to the fact that the database system behaves ‘as if’ we have a snapshot (or picture) of the data as it was at some point in the past. This snapshot of previous data is implemented by keeping track of older versions of rows that have been updated or deleted. This implementation is sometimes called ‘row versioning,’ or multi-version concurrency control (MVCC).

Another term frequently used with snapshot isolation is ‘optimistic’ concurrency. Basically, optimistic concurrency just means that the data consistency in a multi-user system is managed without using locking and blocking. This is in contrast to pessimistic concurrency, which manages concurrent access to the same data using locking and blocking. Some people think MVCC is synonymous with optimistic concurrency, or that snapshot isolation is synonymous with one or both of these terms. Although these terms are usually related, they are not the same thing.

Snapshot isolation describes a behavior, and MVCC is one way that behavior can be implemented. Optimistic concurrency is also an implementation. You can think of MVCC as a way to implement data modification operations, and optimistic concurrency as a way to implement multi-user access to the same data. Since dealing with all three of these topics in detail would make for a book instead of a blog post, I’m only going to focus on snapshot isolation, but I may use the other terms to help me describe it. In this post, I’m going to tell you about the behaviors of snapshot isolation, and in part two, I’ll tell you about the implementation.

SQL Server 2005 gave us two flavors of snapshot isolation. There is a non-ANSI isolation level (enabled with SET TRANSACTION ISOLATION LEVEL), and a variation of the default isolation level called READ COMMITTED SNAPSHOT.

Remember when I said ‘snapshot’ referred to seeing data as it was at some point in the past? The difference between these two snapshot isolation possibilities is what point in the past is relevant.

I’ll start with READ COMMITTED SNAPSHOT. SQL Server default isolation level is READ COMMITTED, which basically means that you will only read data that has been committed, and you can not see any data that is in the process of being modified. By default, SQL Server locks the data while a transaction that updated it is in progress. No other process can see that locked data. They will be blocked. But, if you enable READ COMMITTED SNAPSHOT, you are guaranteeing that while no one else can see uncommitted values, other processes will not be blocked. Instead, another process or session will see a snapshot of the data. In this case, the session will see the most recent committed data as of the beginning of the statement the session is running. Let’s see an example.

I’ll use the AdventureWorks2016 database, but you could use AdventureWorks2014 if you like. First, I’ll enable the option for READ COMMITTED SNAPSHOT, which is a database-level option. You’ll need to make sure no other connections are using the database before you make this change.

ALTER DATABASE AdventureWorks2016 SET READ_COMMITTED_SNAPSHOT ON;

GO

USE AdventureWorks2016;

GO

Now start a transaction and don’t commit it. Inside this transaction, we can see the changed data, and all TaxAmt values should show as 0.

BEGIN TRAN

UPDATE Sales.SalesOrderHeader

SET TaxAmt = 0;


SELECT SalesOrderID, OrderDate, SubTotal, TaxAmt, Freight, TotalDue


FROM Sales.SalesOrderHeader;

GO

Start a second session in the same database and try to look at the data in the SalesOrderHeader table. Using SQL Server’s default behavior, this second session would block, but because the database is in READ COMMITTED SNAPSHOT isolation, there is no blocking.

USE AdventureWorks2016;

GO

SELECT SalesOrderID, OrderDate, SubTotal, TaxAmt, Freight, TotalDue

FROM Sales.SalesOrderHeader;


GO

In this second session, we’ll see the original (snapshot) data. It’s the most recent committed data in the table. READ COMMITTED SNAPSHOT is considered a statement-level snapshot, which means each statement will return the most recent committed values as of the start of that statement. This means we can get different values returned in the same transaction.

Let’s see a slightly more involved example. I’ll indicate two different sessions’ operations concurrently in a chart, so you can see how the operations are interleaved. I’m assuming both sessions are in the AdventureWorks database (either 2014 or 2016) and that the database has the READ COMMITTED SNAPSHOT option set to ON.

 

Step # Session 1 Session 2
1 BEGIN TRAN
2 UPDATE Sales.SalesOrderHeader

SET TaxAmt = 0;

3 BEGIN TRAN
4 SELECT SalesOrderID, OrderDate, SubTotal, TaxAmt, Freight, TotalDue

FROM Sales.SalesOrderHeader;

—  Session 2 will see the original values for TaxAmt.

5 COMMIT TRAN
6 SELECT SalesOrderID, OrderDate, SubTotal, TaxAmt, Freight, TotalDue

FROM Sales.SalesOrderHeader;

 

—  Session 2 will now see the new values for
—  TaxAmt because at the time this second
—  SELECT started, the new values were
—  already committed.

7 COMMIT TRAN

 

Note the comment in Step 6. The second session’s transaction will read different values than it read the first time it ran the same SELECT operation.

How is the isolation level called SNAPSHOT different from READ COMMITTED SNAPSHOT? First, enabling this option is different. At the database level, we just allow SNAPSHOT isolation to be used, but then the session must set the transaction isolation level. Second, the isolation level gives us a transaction-based snapshot, so a second session will see the most recent data as of the beginning of the transaction (not the statement). I’ll execute the exact same transactions as above, but this time, I’ll be in the full isolation level SNAPSHOT.

Before you start, allow SNAPSHOT isolation in this database and turn off READ COMMITTED SNAPSHOT. Make sure there are no open transactions in the database.

ALTER DATABASE AdventureWorks2016 SET READ_COMMITTED_SNAPSHOT OFF;

GO

ALTER DATABASE AdventureWorks2016 SET ALLOW_SNAPSHOT_ISOLATION ON;
GO

 

Step # Session 1 Session 2
1 BEGIN TRAN
2 UPDATE Sales.SalesOrderHeader

SET TaxAmt = 0;

3 ALTER DATABASE AdventureWorks2016

SET ALLOW_SNAPSHOT_ISOLATION ON;

4 BEGIN TRAN
5 SELECT SalesOrderID, OrderDate, SubTotal, TaxAmt, Freight, TotalDue

FROM Sales.SalesOrderHeader;

 

—  Session 2 will see the original values for TaxAmt

6 COMMIT TRAN
7 SELECT SalesOrderID, OrderDate, SubTotal, TaxAmt, Freight, TotalDue

FROM Sales.SalesOrderHeader;

—  Session 2 will still see the original values for
—    TaxAmt, since it is seeing the most recent values
—    as of the start of this transaction

8 COMMIT TRAN
9 SELECT SalesOrderID, OrderDate, SubTotal, TaxAmt, Freight, TotalDue

FROM Sales.SalesOrderHeader;

 

— Only now will Session 2 see the new values
— for TaxAmt, since at the time this last
— SELECT (an implied transaction) started, the
— new values have been committed

 

SQL Server’s SNAPSHOT isolation and READ COMMITTED SNAPSHOT use MVCC to manage concurrency, but they are not completely optimistic. No locks are taken by session 2’s transaction in the examples above because they are only performing SELECT operations. If session 2 was modifying data with INSERT, UPDATE, or DELETE, locks would be taken and blocking could occur. (One of these days, I may devote another post to looking at data modification under SNAPSHOT isolation.) A truly optimistic concurrency implementation would take no locks at all when performing INSERT, UPDATE, DELETE, or SELECT.

One confusing point about SNAPSHOT isolation has to do with where the transaction actually starts. I’ve said that SNAPSHOT isolation will return the most recently committed data as of the beginning of the transaction, but where does the transaction start? It does not start at the BEGIN TRAN! For the purpose of SNAPSHOT isolation, a transaction starts at the first statement that accesses any data AFTER the BEGIN TRAN. So in the examples above, if Session 1 executed COMMIT TRAN right after Step 4, and before Step 5, Session 2 would see the new data, not the previous values. I’ll leave it to you to test that out and prove to yourself that it’s true.

You can see what SNAPSHOT state each database is in with the following query.

SELECT name, is_read_committed_snapshot_on, snapshot_isolation_state_desc

FROM sys.databases;


GO

The column is_read_committed_snapshot_on has only two possible values. A value of 1 means ON or ‘yes,’ and a value of 0 means OFF or ‘no.’ However, the column snapshot_isolation_state_desc can have one of four values: ON, OFF, IN_TRANSITION_TO_ON, and IN_TRANSITION_TO_OFF. If there are open transactions in a database when you attempt to change the state, the change will be a deferred operation. If you try to set the value of allow_snapshot_isolation to ON when there are open transactions in the database, the state of snapshot_isolation_state_desc will be IN_TRANSITION_TO_ON until all of those active transactions are terminated. Conversely, if you try to set the value of allow_snapshot_isolation to OFF when there are open transactions in the database, the state of snapshot_isolation_state_desc will be IN_TRANSITION_TO_OFF until all of those active transactions are terminated. Again, I’ll leave it to you to test that out for yourself.

Next time, we’ll look at how SNAPSHOT isolation is implemented. How does SQL Server know what the previous committed values are, and how does it know when to use them?

Related Posts