SQL Server Internals: Snapshot Isolation – Part 2

on June 7, 2018


In Part 1 of this topic (“SQL Server Internals: Snapshot Isolation”) I told you about the two different ways to use snapshot isolation, either with an alternative implementation of the default READ COMMITTED Isolation Level (called READ COMMITTED SNAPSHOT, or RCSI) or by using a whole new Isolation Level called SNAPSHOT. RCSI is a statement level snapshot, meaning any read operation will see the most recent committed data as of the beginning of the statement, and SNAPSHOT Isolation is a transaction level snapshot, meaning any reads will see the most recent committed data as of the beginning of the transaction. In this Part 2, I’ll tell you how SQL Server® keeps track of what the most recent committed values are.

 

As soon as a database is enabled for either ALLOW_SNAPSHOT_ISOLATION or READ_COMMITTED_SNAPSHOT, all UPDATE and DELETE operations starts saving the previous ‘version’ of the row in a part of the tempdb database called the ‘version store.’ The versions look exactly like any other data row, except they have 14 extra bytes that keep track of two crucial pieces of information.

First, the row keeps track of an internal transaction ID number (XSN) of the transaction that performed the UPDATE or DELETE. Each transaction is assigned a monotonically increasing XSN value when it starts a snapshot read or when it writes data in a snapshot-enabled database. The XSN is reset to 0 when your SQL Server instance is restarted. Transactions that don’t generate version rows and don’t read snapshot data don’t receive an XSN.

Second, the row contains a pointer to the previous version of the same row. So, here’s an example. Let’s say a row with columns A and B is inserted by Transaction T1 and has values A=1 and B=5. Then the row is updated by Transaction T2 so that column B is now 9. Transaction T3 updates the row yet again and column B is now 11. The figure below illustrates the current row and the two previous versions.

 

The current row is in the data page where SQL Server will look for the row. But if the transaction or statement reading the data started before T3 started, it will not be able to read the current values. It will have to follow the pointers to the most recent version of the row that was created before it started. So let’s be a bit more specific.

Example 1: A statement running in a database with READ_COMMITTED_SNAPSHOT enabled starts a SELECT and gets an XSN value of 40. After the SELECT starts, at XSN 50, the row described above is updated to B=11. However, whenever this SELECT needs to read the row, it will see that the XSN of 50 is after it started, and will follow the chain back to the previous row that was modified at XSN 30. Since 30 is before this SELECT started, it will be readable by the SELECT. The SELECT will return B=9 for this row.

Example 2: A transaction running under SET TRANSACTION ISOLATION LEVEL SNAPSHOT begins and gets an XSN value of 20. While it is running, the row above is updated twice: once at XSN 30 and once at XSN 50. But the transaction still knows it started at XSN 20. So when this transaction reads the row above, it first goes to the data page and sees the current row has XSN 50, which is too big. It follows the chain back and sees a row with XSN 30, which is also too big. Only the third row in the chain has an XSN that is earlier than this running transaction’s XSN, so that is the row that is read, and will continue to be read until the transaction that started at XSN 20 is completed.

Note that if a transaction started at XSN 7, it would not be able to see this row at all, since the row wasn’t originally inserted until XSN 10. There is no version of the row with an XSN smaller than 7.

As mentioned, the two previous versions of the row are stored in the tempdb database. We call the storage of these rows the ‘version store,’ but there is no actual separate area. Previous versions of updated rows are just stored wherever there is room for them in tempdb. We cannot reserve space for the version store, but we can monitor how much space is being used. I talked about monitoring tempdb space usage, including the space used for the version store, using sys.dm_db_file_space_usage in a previous post. (https://logicalread.com/sql-server-internals-tempdb)

There are several very useful DMVs that let you see details about the version store. Let’s look at an example. The first DMV I’ll show you is sys.dm_tran_version_store, which contains information about the actual rows in the version store. Run the following script to make a copy of the Production.Product table, and then turn on ALLOW_SNAPSHOT_ISOLATION in either the AdventureWorks2014 or the AdventureWorks2016 database. Finally, verify that the ALLOW_SNAPSHOT_ISOLATION option is ON and that no rows are currently in the version store. You might need to close any active transactions currently using AdventureWorks2016.

USE AdventureWorks2016;

DROP TABLE IF EXISTS NewProduct;

SELECT * INTO NewProduct FROM Production.Product;

GO

ALTER DATABASE ADVENTUREWORKS2016

SET ALLOW_SNAPSHOT_ISOLATION ON;

GO

SELECT name, snapshot_isolation_state_desc, is_read_committed_snapshot_on

FROM sys.databases

WHERE name= 'AdventureWorks2016';

GO

SELECT COUNT(*) FROM sys.dm_tran_version_store;

GO

 

As soon as you see that the database option snapshot_isolation_state_desc is ON and no rows are in the version store, you can continue. Now, run this UPDATE statement on the NewProduct table and look at the version store again.

UPDATE NewProduct SET ListPrice = ListPrice * 1.1;

GO

SELECT COUNT(*) FROM sys.dm_tran_version_store;

GO

 

You should now see that 504 rows are in the version store because the NewProduct table has 504 rows. The previous version of each row, before the UPDATE, has been written to the version store in tempdb.

An important behavior to note here is that as soon as ALLOW_SNAPSHOT_ISOLATION is enabled, SQL Server starts storing row versions, even if no snapshot transactions need to read those versions.  In a heavily updated database, this can affect the behavior of other queries that use tempdb, as well as the server itself. To prevent your tempdb from filling up, you want to disable SNAPSHOT isolation.

 

ALTER DATABASE ADVENTUREWORKS2016

SET ALLOW_SNAPSHOT_ISOLATION OFF;

 

Row versioning gives SQL Server an optimistic concurrency model to work with when the default behavior results in too much locking and blocking. Before you switch to the row versioning-based isolation levels, you must carefully consider the tradeoffs of using this concurrency model. In addition to requiring extra management to monitor the increased use of tempdb for the version store, versioning slows the performance of update operations because of the extra work involved in maintaining old versions. Update operations bear this cost even if there are no current processes reading the data. Readers that use row versioning have the extra cost of traversing the linked row versions to find the appropriate version of the requested row.

Also, because the optimistic concurrency model of SNAPSHOT isolation assumes (optimistically) that not many update conflicts will occur, you shouldn’t choose the SNAPSHOT isolation level if you are expecting lots of contention for updating the same data concurrently. SNAPSHOT isolation works well to help ensure that readers are not blocked by writers, but simultaneous writers still aren’t allowed. In the default pessimistic model, the first writer blocks all subsequent writers, but when using Snapshot isolation, subsequent writers can receive error messages, and the application needs to resubmit the original request.

In addition to the DMV sys.dm_tran_version_store, a couple of other DMVs can also be quite useful. These include sys.dm_tran_transactions_snapshot, and sys.dm_tran_active_snapshot_database_transactions. These views contain the transaction_sequence_num column, which is the XSN mentioned earlier. You can look at the documentation to see what some of the other columns in these views mean, but I’ll tell you about two that I find very useful. The DMV sys.dm_tran_active_snapshot_database_transactions contains one row for each open transaction or currently running statement in a database enabled for one of the snapshot-based isolation levels. It has a column called elapsed_time_seconds that allows you to see if there are long running transactions that keep the version store from being cleaned up. In addition, the column called max_version_chain_traversed can give you an idea of how long your chains of versions are becoming. This indicates how much extra work your SELECT operations have to do to find the ‘most recent committed value’ to read.

The version store size is managed automatically, and SQL Server maintains a cleanup thread to ensure that versioned rows aren’t kept around longer than needed. For queries running under the full SNAPSHOT isolation, the row versions must be kept at least until the end of the transaction that generated them. In addition, if there are any other still running transactions that need to read a particular row, it can’t be removed. Finally, a row cannot be removed from a version chain if there are any older versions in the same chain that are still needed. For SELECT statements running under RCSI, a particular row version isn’t needed after the SELECT statement is executed and can be removed.

The regular cleanup function is performed every minute as a background process to reclaim all reusable space from the version store. If you check the version store on your SQL Server instance again, you should see that it probably now has 0 rows in it because the UPDATE that created the 504 rows we saw earlier has already committed (it was a single-statement transaction) and there are no other transactions that needed those rows. If, however, we ran that earlier UPDATE in an explicit transaction, as shown, the version store could not be cleared until we finally execute a COMMIT TRAN.

BEGIN TRAN

UPDATE NewProduct SET ListPrice = ListPrice * 1.1;

GO

SELECT COUNT(*) FROM sys.dm_tran_version_store;

GO

 

If tempdb actually runs out of free space, the cleanup function is called before SQL Server increases the size of the files. If the disk gets so full that the files can’t grow, SQL Server stops generating versions. If that happens, a snapshot query fails if it needs to read a version that wasn’t generated due to space constraints. Although a full discussion of troubleshooting and monitoring is beyond the scope of this post, note that SQL Server includes more than a dozen performance counters to monitor tempdb and the version store.

 

Being able to run your SELECT queries against a snapshot of your data can greatly reduce locking and blocking and improve overall throughput. However, keep in mind that it comes at a cost. You need to make sure that you have space in your tempdb for all the versions of all the rows that might be generated, and you need to make sure your tempdb disk system can handle the load. Also note that the snapshot-based isolation levels can actually slow down your SELECT operations if long version chains need to be searched. Only you can determine if the benefit of removing the blocking is worth it, by testing thoroughly with your own data.

 

Related Posts