A major problem with updating statistics in large tables in SQL Server was that the entire table always had to be sampled, even if only recent data had changed. This was also true when partitioning was being used: even if only the newest partition had changed since the last time statistics were updated, updating statistics again required sampling the entire table, including all the partitions that haven’t changed. Incremental statistics, a new SQL Server 2014 feature, can help with this problem.
Using incremental statistics, you can update only the statistics on the partition or partitions that have been modified, and the information on these partitions will be merged with the existing information to create the final statistics object. Another advantage of incremental statistics is that the percentage of data changes required to trigger an automatic update of statistics now works at the partition level, which basically means that now only 20 percent of rows changed (changes on the leading statistics column) per partition is required. Unfortunately, the histogram is still limited to 200 steps for the entire statistics object in this version of SQL Server.
Let’s look at an example of how can you update statistics at a partition level to explore its behavior. First, we need to create a partitioned table using the AdventureWorks2012 database:
For details about partitioning and the CREATE PARTITION FUNCTION / SCHEME statements, refer to “Partitioned Tables and Indexes” in Books Online.
We currently have data to populate 12 partitions. Let’s start by first populating only 11:
If required, you can use the following statement to inspect the contents of the partitions:
Let’s create an incremental statistics object using the CREATE STATISTICS statement with the new INCREMENTAL clause set to ON (OFF is the default):
You can also create incremental statistics while creating an index using the new STATISTICS_INCREMENTAL clause of the CREATE INDEX statement. You can inspect the created statistics object using the following query:
DBCC SHOW_STATISTICS(’dbo.TransactionHistory’, incrstats)
Among other things, you will notice that the created histogram has 200 steps (only the last three shown here):
So we already have the maximum number of steps in a statistics object. What would happen if we add data to a new partition? Let’s add data to partition 12:
Now, we update the statistics object using the following statement:
Note the new syntax specifying the partition, where you can specify multiple partitions, separated by commas. The UPDATE STATISTICS statement reads the specified partitions and then merges their results with the existing statistics object to build the global statistics. Note the RESAMPLE clause; this is required because partition statistics objects need to have the same sample rates to be merged to build the global statistics objects. Although only the specified partition was scanned, you can see that SQL Server has rearranged the histogram. The last three steps now show data for the added partition. You can also compare the original with the new histogram for other minor differences:
If you want to disable the incremental statistics object for any reason, you can use the following statement to go back to the original behavior (or optionally just drop the statistics object and create a new one):
After disabling the incremental statistics, trying to update a partition, as shown previously, will return the following error message:
Finally, you can also enable incremental statistics for your automatic statistics at the database level, if needed. This requires the INCREMENTAL = ON clause in the ALTER DATABASE statement, and obviously also requires AUTO_CREATE_STATISTICS set to ON.
To clean up the objects created for this exercise, run the following statements: