SQL Server Index Hints–Caution Advised!

on November 12, 2012


Index Hints: Be Careful!

SQL Server 2005 gave us Dynamic Managed Views (DMVs) that finally allowed queries to look for missing indexes, indexes not used, and rarely used indexes. SQL Server 2008 now suggests indexes (when it would help the execution plan cost) when specifying an actual query plan while executing a query.

Consider the following query on the AdventureWorks2012 database:

SELECT    sod.SalesOrderID
      ,SalesOrderDetailID
       ,CarrierTrackingNumber
        ,OrderQty
     ,ProductID
        ,SpecialOfferID
       ,sod.rowguid
      ,sod.ModifiedDate
     ,soh.OrderDate
        ,soh.ShipDate
     ,soh.TotalDue
 FROM Sales.SalesOrderDetail sod
   JOIN Sales.SalesOrderHeader soh
     ON soh.SalesOrderID = sod.SalesOrderID
  WHERE sod.ModifiedDate > '20080101'

The execution plan tab gives the following:

Missing Index Execution Plan Before

If you right click on the header you can select Missing Index Details which gives following in a new query window:

/*
 * Missing Index Details from SQLQuery1.sql - Server.AdventureWorks2012 (ServerUser (54))
 * The Query Processor estimates that implementing the following index could improve the query cost by 34.1961%.
 */
USE AdventureWorks2012
GO
CREATE NONCLUSTERED INDEX <Name of Missing Index, sysname,>
ON Sales.SalesOrderDetail (ModifiedDate)
INCLUDE (SalesOrderID,SalesOrderDetailID,CarrierTrackingNumber,OrderQty,ProductID,SpecialOfferID,UnitPrice,UnitPriceDiscount,LineTotal,rowguid)
GO

Why this hint might cause problems

Aha! Just fill in the index name and all is well. Maybe, but probably not! The SQL Server suggestion is to essentially create the entire table (all columns) again (in index form) only ordered by ModifiedDate. While this may be great for this specific query, it just goes against many established guidelines for indexes and query optimization. First consider that it includes the Primary Key columns SalesOrderID and SalesOrderDetailID. I’ve always been told that the primary key columns do not belong in the INCLUDE block, but let’s prove it. Consider the following 2 indexes:

CREATE NONCLUSTERED INDEX IDX_SalesOrderDetail_ModifiedDate ON Sales.SalesOrderDetail (ModifiedDate)
 INCLUDE (SalesOrderID,SalesOrderDetailID,CarrierTrackingNumber,OrderQty,ProductID,
 SpecialOfferID,UnitPrice,UnitPriceDiscount,LineTotal,rowguid)
GO
CREATE NONCLUSTERED INDEX IDX_SalesOrderDetail_ModifiedDate2 ON Sales.SalesOrderDetail (ModifiedDate)
 INCLUDE (CarrierTrackingNumber,OrderQty,ProductID,SpecialOfferID,UnitPrice,UnitPriceDiscount,
 LineTotal,rowguid)
GO

Running the following query:

SELECT id,name,dpages,reserved,used,rowcnt
  FROM sys.sysindexes
   WHERE [name] IN ('IDX_SalesOrderDetail_ModifiedDate','IDX_SalesOrderDetail_ModifiedDate2')

results in

Id Name DPages Reserved Used Rowcnt
1154103152 IDX_SalesOrderDetail_ModifiedDate 1456 1562 1474 121317
1154103152 IDX_SalesOrderDetail_ModifiedDate2 1455 1554 1473 121317

where dpages is the count of index pages used. Although not a large increase in index size, the Primary Key columns are redundant for the first index causing increased disk space and additional IO. It’s interesting that SQL Server always includes the Primary Keys in either the index definition and/or the INCLUDE clause. These column(s) are my first ones to remove.

Interestingly even with both indexes as defined above, when the original query was run again the execution plan was the same – a clustered index scan on dbo.SalesOrderDetail and not using either index.

So adding an index hint using IDX_SalesOrderDetail_ModifiedDate2 resulted in an index seek on this index, but changed to a hash join with an overall increased query cost of 2%.

Missing Index 2

For this specific case, the suggested index actually hurt query performance.

Another factor with the suggested index was that it included all the columns of the table. Just due to size, this is unnecessary. For this query, had the WHERE clause been more selective (say for a specific day), an index based just on ModifiedDate with an associated key lookup most likely would be the best query performer.

Final thoughts

When given a suggested index from SQL Server, please take it with a grain of salt. Actually I take it with a several grains of salt, a lime wedge, followed by a shot of tequila. Just be careful, too many indexes and bad index formulation can cause worse performance as well as increased maintenance overhead. Every scenario is different. As always, test, test, and test again!

Related Posts

Leave a Reply