Index Hints: Be Careful!
SQL Server 2005 gave us Dynamic Managed Views (DMVs) that finally allowed queries to look for missing indexes, indexes not used, and rarely used indexes. SQL Server 2008 now suggests indexes (when it would help the execution plan cost) when specifying an actual query plan while executing a query.
Consider the following query on the AdventureWorks2012 database:
SELECT sod.SalesOrderID ,SalesOrderDetailID ,CarrierTrackingNumber ,OrderQty ,ProductID ,SpecialOfferID ,sod.rowguid ,sod.ModifiedDate ,soh.OrderDate ,soh.ShipDate ,soh.TotalDue FROM Sales.SalesOrderDetail sod JOIN Sales.SalesOrderHeader soh ON soh.SalesOrderID = sod.SalesOrderID WHERE sod.ModifiedDate > '20080101'
The execution plan tab gives the following:
If you right click on the header you can select Missing Index Details which gives following in a new query window:
/* * Missing Index Details from SQLQuery1.sql - Server.AdventureWorks2012 (ServerUser (54)) * The Query Processor estimates that implementing the following index could improve the query cost by 34.1961%. */ USE AdventureWorks2012 GO CREATE NONCLUSTERED INDEX <Name of Missing Index, sysname,> ON Sales.SalesOrderDetail (ModifiedDate) INCLUDE (SalesOrderID,SalesOrderDetailID,CarrierTrackingNumber,OrderQty,ProductID,SpecialOfferID,UnitPrice,UnitPriceDiscount,LineTotal,rowguid) GO
Why this hint might cause problems
Aha! Just fill in the index name and all is well. Maybe, but probably not! The SQL Server suggestion is to essentially create the entire table (all columns) again (in index form) only ordered by ModifiedDate. While this may be great for this specific query, it just goes against many established guidelines for indexes and query optimization. First consider that it includes the Primary Key columns SalesOrderID and SalesOrderDetailID. I’ve always been told that the primary key columns do not belong in the INCLUDE block, but let’s prove it. Consider the following 2 indexes:
CREATE NONCLUSTERED INDEX IDX_SalesOrderDetail_ModifiedDate ON Sales.SalesOrderDetail (ModifiedDate) INCLUDE (SalesOrderID,SalesOrderDetailID,CarrierTrackingNumber,OrderQty,ProductID, SpecialOfferID,UnitPrice,UnitPriceDiscount,LineTotal,rowguid) GO CREATE NONCLUSTERED INDEX IDX_SalesOrderDetail_ModifiedDate2 ON Sales.SalesOrderDetail (ModifiedDate) INCLUDE (CarrierTrackingNumber,OrderQty,ProductID,SpecialOfferID,UnitPrice,UnitPriceDiscount, LineTotal,rowguid) GO
Running the following query:
SELECT id,name,dpages,reserved,used,rowcnt FROM sys.sysindexes WHERE [name] IN ('IDX_SalesOrderDetail_ModifiedDate','IDX_SalesOrderDetail_ModifiedDate2')
results in
Id | Name | DPages | Reserved | Used | Rowcnt |
1154103152 | IDX_SalesOrderDetail_ModifiedDate | 1456 | 1562 | 1474 | 121317 |
1154103152 | IDX_SalesOrderDetail_ModifiedDate2 | 1455 | 1554 | 1473 | 121317 |
where dpages is the count of index pages used. Although not a large increase in index size, the Primary Key columns are redundant for the first index causing increased disk space and additional IO. It’s interesting that SQL Server always includes the Primary Keys in either the index definition and/or the INCLUDE clause. These column(s) are my first ones to remove.
Interestingly even with both indexes as defined above, when the original query was run again the execution plan was the same – a clustered index scan on dbo.SalesOrderDetail and not using either index.
So adding an index hint using IDX_SalesOrderDetail_ModifiedDate2 resulted in an index seek on this index, but changed to a hash join with an overall increased query cost of 2%.
For this specific case, the suggested index actually hurt query performance.
Another factor with the suggested index was that it included all the columns of the table. Just due to size, this is unnecessary. For this query, had the WHERE clause been more selective (say for a specific day), an index based just on ModifiedDate with an associated key lookup most likely would be the best query performer.
Final thoughts
When given a suggested index from SQL Server, please take it with a grain of salt. Actually I take it with a several grains of salt, a lime wedge, followed by a shot of tequila. Just be careful, too many indexes and bad index formulation can cause worse performance as well as increased maintenance overhead. Every scenario is different. As always, test, test, and test again!
Leave a Reply