Guidelines for Creating and Using SQL Server Indices

By: Dusan Petkovic


Although the SQL Server Database Engine does not have any practical limitations concerning the number of indices, it is advisable to limit them, for a couple of reasons. First, each index uses a certain amount of disk space, so it is possible that the total number of index pages could exceed the number of data pages within a database. Second, in contrast to the benefits of using an index for retrievals, inserts and updates have a direct impact on the maintenance of the index. The more indices on the tables, the more index reorganizations that are necessary. The rule of thumb is to choose indices wisely for frequent queries and evaluate index usage afterwards.

This article gives some recommendations for creating and using indices.

NOTE

The following recommendations are general rules of thumb. They ultimately depend on how your database will be used in production and which queries are used most frequently. An index on a column that is never used will be counterproductive.

Indices and Conditions in the WHERE Clause

If the WHERE clause in a SELECT statement contains a search condition with a single column, you should create an index on this column. The use of an index is especially recommended if the selectivity of the condition is high. The selectivity of a condition is defined as the ratio of the number of rows satisfying the condition to the total number of rows in the table. (High selectivity corresponds to a small ratio.) The most successful processing of a retrieval with the indexed column will be achieved if the selectivity of a condition is 5 percent or less.

The column should not be indexed if the selectivity of the condition is constantly 80 percent or more. In such a case, additional I/O operations will be needed for the existing index pages, which would eliminate any time savings gained by index access. In this particular case, a table scan would be faster, and the query optimizer would usually choose to use a table scan, rendering the index useless.

If a search condition in a frequently used query contains one or more AND operators, it is best to create a composite index that includes all the columns of the table specified in the WHERE clause of the SELECT statement. Example 1 shows the creation of a composite index that includes all the columns specified in the WHERE clause of the SELECT statement.

EXAMPLE 1

0323_001

0324_001

 

The AND operator in this query contains two conditions. As such, both of the columns appearing in each condition should be indexed using a composite nonclustered index.

Indices and the Join Operator

In the case of a join operation, it is recommended that you index each join column. Join columns often represent the primary key of one table and the corresponding foreign key of the other or the same table. If you specify the PRIMARY KEY and FOREIGN KEY integrity constraints for the corresponding join columns, only a nonclustered index for the column with the foreign key should be created, because the system will implicitly create the clustered index for the PRIMARY KEY column.

Example 2 shows the creation of indices, which should be used if you have a query with a join operation and an additional filter.

EXAMPLE 2

0324_002

 

For Example 2, the creation of two separate indices for the emp_no column in both the employee and works_on tables is recommended. Also, an additional index should be created for the enter_date column.

Covering Index

As you already know, significant performance gains can be achieved when all columns in the query are included in the index. Example 3 shows a covering index.

EXAMPLE 3

0324_003

0325_001

 

Example 3 first drops the IX_Address_StateProvinceID index of the Address table. In the second step, it creates the new index, which additionally includes two other columns, on the PostalCode column. Finally, the SELECT statement at the end of the example shows a query covered by the index. For this query, the system does not have to search for data in data pages, because the optimizer can find all the column values in the leaf pages of the nonclustered index.

NOTE

The use of covering indices is recommended because index pages generally contain many more entries than the corresponding data pages contain. Also, to use this method, the filtered columns must be the first key columns in the index.

Leave a Reply