Good index design starts with a good understanding of the benefits indexes provide. In books, table of contents help readers locate a section, chapter, or page of interest. SQL Server indexes serve the same function as a table of contents in a book. It enables SQL Server to locate and retrieve the data requested in a query as fast as possible.
Consider a 500-page book with dozens of sections and chapters and no table of contents. To locate a section of a book, readers would need to flip and read through every page until they locate the section of interest. Imagine if you have to do this for multiple sections of the book. It would be a time-consuming task.
This analogy also applies to SQL Server database tables. Without proper indexes, SQL Server has to scan through all the data pages that contain the data in a table. For tables with large amounts of data, this becomes time-consuming and resource-intensive. This is the reason why indexes are so important.
Indexes can be classified in several ways depending on the way they store data, their internal structure, their purpose, and the way they are defined. The following sections briefly describe these types of indexes.
A row-based index is a traditional index in which data is stored as rows in data pages. These indexes include the following:
Clustered indexes store and sort data based on the key column(s). There can only be one clustered index per table because data can be sorted in only one order. A clustered index is created by default when a table definition includes a primary key constraint.
Non-clustered indexes contain index key values and row locators that point to the actual data row. If there is no clustered index, the row locator is a pointer to the row. When there is a clustered index present, the row locator is the clustered index key for the row.
Non-clustered indexes can be optimized to satisfy more queries, improve query response times, and reduce index size. The two most important of these optimized non-clustered indexes are described in the next two sections.
Covering indexes are non-clustered indexes that include nonkey columns in the leaf level. These types of indexes improve query performance, cover more queries, and reduce IO operations as the columns necessary to satisfy a query are included in the index itself either as key or nonkey columns. Covering indexes can greatly reduce bookmark lookups.
The ability to include nonkey columns enables indexes to be more flexible by including columns with data types not supported as index key columns Nonkey columns also enable indexes to extend beyond the 16 key column limitation. Nonkey columns do not count towards the 900 byte index key size limit.
Filtered indexes can take a WHERE clause to indicate which rows are to be indexed. Since you index only a portion of rows in a table, you can create only a non-clustered filtered index. If you try to create a filtered clustered index, SQL Server returns a syntax error.
Why do you need a non-clustered index with a subset of data in a table? A well-designed filtered index can offer the following advantages over full-table indexes:
- Improved query performance and plan quality: If the index is deep (more pages because of more data), traversing an index takes more I/O and results in slow query performance. If you have a large table and you know that there are more user queries on a well-defined subset of data, creating a filtered index makes queries run faster because less I/O will be performed since the number of pages is less for the smaller amount of data in that filtered index. Moreover, stats on the full table may be less accurate compared to filtered stats with less data, which also helps improve query performance.
- Reduced index maintenance costs: Maintaining a filtered index is less costly than maintaining a full index because of smaller data size. Obviously, it also reduces the cost of updating statistics because of the smaller size of the filtered index. As mentioned earlier, you must know your user queries and what kind of data they query often to create a well-defined filtered index with a subset of that data. If the data outside of the filtered index is modified frequently, it won’t cost anything to maintain the filtered index. That enables you to create many filtered indexes when the data in those indexes is not modified frequently.
- Reduced index storage costs: Less data, less space. If you don’t need a full-table non-clustered index, creating a smaller dataset for a non-clustered filtered index takes less disk space.
Column-based indexes are a new type of index introduced in SQL Server 2012 in which only column data is stored in the data pages. These indexes are based on the Vertipaq engine implementation, which is capable of high compression ratios and handles large data sets in memory.