SQL Server Internals: Peeking Inside the Transaction Log Structure, Part 2

March 14, 2018


In part one of this topic, “Peeking Inside the Transaction Log Structure,” I told you about the organization of a SQL Server® database’s transaction log. Understanding this structure will help you keep your log from growing unmanageably large, and help you shrink it if the need arises. Ideally, you’ll find a good size for your log that allows you to perform your normal activities without needing to allocate more log space. (Determining the perfect size for the log is another whole topic, for another time.) In part two, I’m going to tell you what happens if your log ends up growing unexpectedly large, and you need to shrink it. Shrinking a transaction log is one of the more problematic operations you might need to undertake, but it can be a lot less problematic once you understand Virtual Log Files and how they are used.

Keep in mind that shrinking your log should not be considered a normal maintenance operation. Your log should be big enough to handle whatever records are generated for your normal workloads. If a once-in-a-blue-moon workload is run that increases the log size, you might consider shrinking it afterwards. There are also problematic situations that can cause the log to grow inappropriately. For example:

  • Your log backups are failing (perhaps because their destination location is full), so the log is never truncated and your VLFs are not reusable
  • An application is buggy and does not properly terminate its transactions, causing large portions of the log to stay active and unaffected by truncation (the snippet below shows one way to spot this)
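If you suspect the second situation, one quick check is DBCC OPENTRAN, which reports the oldest active transaction in a database. Here's a minimal sketch; I'm using the newdb database from the demo later in this post, so substitute your own database name.

-- Report the oldest active transaction (if any) in newdb
DBCC OPENTRAN ('newdb');
GO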

You can check the documentation for the sys.databases catalog view and look at the column called log_reuse_wait (and its companion, log_reuse_wait_desc). You’ll see all the reasons why SQL Server might not be able to reuse the log, causing it to just continue growing. A value of 2 indicates that a log backup needs to be done, and a value of 4 indicates that there is an open transaction in the database. There are lots of reasons, as you can see in the documentation at https://docs.microsoft.com/en-us/sql/relational-databases/system-catalog-views/sys-databases-transact-sql.
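For example, here’s a quick way to check those values for a single database (I’m using the newdb database from the demo below; substitute any database name):

SELECT name, log_reuse_wait, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'newdb';
GO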

When something unexpected happens and the log grows too big, you may need to shrink it.  Shrinking is a physical operation that removes entire VLFs, starting from the physical end of the log. Only reusable or unused VLFs can be removed. The biggest problem with trying to shrink a transaction log occurs when the active log is near the end of the physical file. There is always at least one active VLF, where the current transactions are being written. What if you have 100 VLFs and they’re all reusable except for the active one, which is the 98th VLF in the file? SQL Server will only be able to remove two VLFs.
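One way to see where the active log sits in your own file is to query the sys.dm_db_log_info function I mentioned in part one: the vlf_active column shows which VLFs are part of the active log, and vlf_begin_offset shows where each one starts in the physical file.

-- List each VLF in file order, showing which ones are active
SELECT vlf_sequence_number, vlf_active, vlf_size_mb, vlf_begin_offset
FROM sys.dm_db_log_info(DEFAULT)
ORDER BY vlf_begin_offset;
GO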

When shrinking the log, the first step is to make sure as many VLFs as possible are reusable by performing a truncation. You can truncate the log either by backing it up or by switching the database to SIMPLE recovery model. SIMPLE recovery model is primarily intended for databases that will not be backed up regularly, because in this model, the database’s transaction log is automatically truncated every time a CHECKPOINT occurs, which usually happens quite frequently. In addition, the act of switching to SIMPLE recovery itself performs a log truncation. Since shrinking the log is not considered a regular maintenance activity, you might be able to find an off-hours time to make this change. If your database has no off-hours, you may not be able to switch to SIMPLE recovery, and there will be some additional considerations, which I’ll describe shortly.

The SHRINK command itself is very straightforward. You’ll need to verify the file ID for the log file (it’s usually 2, but check sys.database_files to be sure) and then issue the command: DBCC SHRINKFILE(2, N), where N is the target size, in megabytes, that you would like to shrink your log to.
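For example, this query returns the log file’s ID and current size for the current database (size is stored as a number of 8 KB pages, so I convert it to megabytes):

-- Find the file ID and current size of the log file
SELECT file_id, name, size * 8 / 1024 AS size_mb
FROM sys.database_files
WHERE type_desc = 'LOG';
GO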

If you are able to switch your database to SIMPLE recovery before shrinking, a side effect of the SHRINK command is that SQL Server will move the active log to earlier in the physical file when possible (if there are reusable VLFs earlier in the file), so that more of the reusable VLFs will be at the end and can be removed. Once the shrink has completed, you’ll need to switch the database back to the original recovery model, then perform a full database backup to start the backup chain again.
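Put together, the sequence looks something like this. This is just a sketch, assuming the log file ID is 2, a target size of 2 MB, a database originally in FULL recovery, and a backup location of c:\Backups, as in the demo below.

ALTER DATABASE newdb SET RECOVERY SIMPLE;
GO
DBCC SHRINKFILE (2, 2);
GO
-- Switch back to the original recovery model
ALTER DATABASE newdb SET RECOVERY FULL;
GO
-- Start the backup chain again
BACKUP DATABASE newdb TO DISK = 'c:\Backups\newdb.bak';
GO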

If you choose not to switch your database to SIMPLE recovery model, you’ll need to make sure the log is truncated before the shrink by performing a log backup. In this case, the active VLF is not automatically moved when you issue the SHRINK command. You may have to issue multiple log backup commands and multiple shrink commands to move the active log to an earlier point in the log. In SQL Server 2016 and later, you can examine where the active log is with the function sys.dm_db_log_info, which I mentioned in part one.

Here’s some code to watch the shrinking happen. I am running SQL Server 2017. My results will be shown using Microsoft’s SQL Server Management Studio 17.5. The amount of growth may be different if you have different default options set for your log file, or are using an earlier version. I am creating a new database, creating a table, inserting a few rows, and updating those rows many times to cause the log to grow without growing the data portion of the database.

USE master;
DROP DATABASE IF EXISTS newdb;
GO
CREATE DATABASE newdb;
GO
USE newdb;
ALTER DATABASE newdb SET TARGET_RECOVERY_TIME = 0 seconds;
CHECKPOINT;
-- First look at the VLFs for the newdb database
SELECT * FROM sys.dm_db_log_info(default)
-- Now verify that newdb is in auto truncate mode
SELECT last_log_backup_lsn
FROM master.sys.database_recovery_status
WHERE database_id = db_id('newdb');
GO

My newdb database shows four VLFs, and only the first one has been used. The view sys.database_recovery_status shows a NULL in the last_log_backup_lsn column for databases that are automatically truncating the log at every checkpoint.  Because I have never performed a full database backup on this new database, my log will be truncated as if it is in SIMPLE recovery model, no matter what sys.databases reports.

Now I’ll create a new table, add 3 rows, and update them 5000 times.

SET NOCOUNT ON;
CREATE TABLE newtable (a int);
GO
INSERT INTO newtable VALUES (10);
INSERT INTO newtable VALUES (20);
INSERT INTO newtable VALUES (30);
GO
SET NOCOUNT ON;
DECLARE @counter int;
SET @counter = 1;
WHILE @counter <= 5000
BEGIN
    UPDATE newtable SET a = a + 1;
    SET @counter = @counter + 1;
END;
GO
SELECT * FROM sys.dm_db_log_info(default);
GO

My newdb log shows the same four VLFs, but now the first three are active.

I now issue the SHRINK command:

DBCC SHRINKFILE (2, 2);
GO
SELECT * FROM sys.dm_db_log_info(default);
GO

I see that the log has been reduced in size, the active log has moved to the beginning of the file, and there is a new vlf_sequence_number for the active VLF.

If you want to see what happens when your database is not set to automatically truncate the log, you can add these commands before creating the new table. Change the path to any backup location.

ALTER DATABASE newdb SET RECOVERY FULL;
BACKUP DATABASE newdb TO DISK = 'c:\Backups\newdb.bak';
-- As soon as you make the full backup, you can verify that the database is not in
-- auto truncate mode by looking at the database_recovery_status view.
SELECT last_log_backup_lsn
FROM master.sys.database_recovery_status
WHERE database_id = db_id('newdb');

Right before trying to shrink, you’ll need to back up the log. You might need to execute these statements several times before the log is shrunk to the desired size.

BACKUP LOG newdb TO DISK = 'c:\Backups\newdb_log.bak';
GO
DBCC SHRINKFILE (2, 2);
GO
SELECT * FROM sys.dm_db_log_info(default);
GO

As mentioned at the beginning, there are many reasons the log might not be shrinkable. If there are no open or unreplicated transactions, the most common reason is that the active log is too close to the end of the file.

To finish off this post, I’ll give you a little code snippet that I wrote to take advantage of the new DMV sys.dm_db_log_info. It reports on how much of the log is potentially shrinkable, based on where the highest vlf_sequence_number is. I create a temp table first that contains the number of VLFs and the total size of the log, and use that temp table to help determine where the highest-numbered VLF is. The result shows the percentage of VLFs that follow the highest-numbered one, and the percentage of the total log size that is potentially shrinkable.

DROP TABLE IF EXISTS #log_facts;
GO
-- Capture the number of VLFs, the total log size, and the highest vlf_sequence_number
SELECT COUNT(*) AS num_vlfs,
       SUM(vlf_size_mb) AS total_size,
       MAX(vlf_sequence_number) AS max_vlf
INTO #log_facts
FROM sys.dm_db_log_info(null);
GO
-- Report what percentage of the VLFs (and of the total log size) lies beyond the highest-numbered VLF
SELECT COUNT(*) * 100. / (SELECT num_vlfs FROM #log_facts) AS percentage_of_vlfs_shrinkable,
       SUM(vlf_size_mb) * 100. / (SELECT total_size FROM #log_facts) AS percentage_of_size_shrinkable
FROM sys.dm_db_log_info(null)
WHERE vlf_begin_offset >
      (SELECT vlf_begin_offset FROM sys.dm_db_log_info(null)
       WHERE vlf_sequence_number = (SELECT max_vlf FROM #log_facts));
GO

Understanding a SQL Server database’s transaction log is a huge topic and I’ve barely skimmed the surface in these two posts. However, I do think that once you understand how VLFs are used and reused, you’ll be well on your way to being able to keep your log healthy and manageable.
