Why Use Both Partitioned Tables and Indexes?

on January 26, 2013


SQL Server partitioned tables are a way to spread a single table over multiple partitions, and while doing so each partition can be on a separate filegroup. Following are several reasons for doing this:

  • Faster and easier data loading: If your database has a large amount of data to load, you might want to consider using a partitioned table. “A large amount of data,” doesn’t mean a specific amount of data, but any case in which the load operation takes longer than is acceptable in the production cycle. A partitioned table enables you to load the data to an empty table that’s not in use by the “live” data, so it has less impact on concurrent live operations. Clearly, there will be an impact on the I/O subsystem, but if you also have separate filegroups on different physical disks, even this has a minimal impact on overall system performance. After the data is loaded to the new table, you can perform a switch to add the new table to the live data. This switch is a simple metadata change that quickly executes, which is why partitioned tables are a great way to load large amounts of data with limited impact to users who touch the rest of the data in the table.
  • Faster and easier data deletion or archival: For the same reasons, partitioned tables also help you to delete or archive data. If your data is partitioned on boundaries that are also the natural boundaries on which you add or remove data, the data is considered to be aligned. When your data is aligned, deleting or archiving data is as simple as switching a table out of the current partition, after which you can unload or archive it at your leisure. There is a bit of a catch to this part: With archiving, you often want to move the old data to slower or different storage. The switch operation is so fast because all it does is change metadata. It doesn’t move any data around, so to actually move the data from the filegroup where it lived to the old, slow disk archival filegroup, you need to move the data, but you move it when the partition isn’t attached to the existing partitioned table. Therefore, although this may take quite some time, it can have a minimal impact on any queries executing against the live data.
  • Faster queries: You are probably interested in an opportunity to get faster queries. When querying a partitioned table, the query optimizer can eliminate searching through partitions that it knows won’t hold any results. This is referred to as partition elimination. This works only if the data in the partitioned table or index is aligned with the query. That is, the data must be distributed among the partitions in a way that matches the search clause on the query. You learn more details about this as you consider how to create a partitioned table. SQL Server 2008 offers some improvements for parallel query processing enhancements on partitioned tables and indexes.
  • Sliding windows: A sliding window is basically what was referred to earlier in the discussion about adding new data and then deleting or archiving old data. What you did was fill a new table, switch it into the live table, and then switch an existing partition out of the live table for archival or deletion. It’s kind of like sliding a window of new data into the current partitioned table, and then sliding an old window of data out of the partitioned table.

Creating Partitioned Tables

Table partitioning requires SQL Server 2012 Enterprise Edition. There are also some expectations about the hardware in use, in particular the storage system; although these are implicit expectations, and you can store the data anywhere you want. You just won’t get the same performance benefits you would get if you had a larger enterprise storage system with multiple disk groups dedicated to different partitions.

SQL Server 2012 supports up to 15,000 partitions by default and is fully supported in 64-bit systems. In 32-bit systems, it is possible to create more than 1,000 table or index partitions, but it is not fully supported.

To create a partitioned table or index, perform the following steps:

1.   Specify how the table or index is partitioned by the partitioning column, and the range of values included for each partition. Only one partitioning column can be specified. For example, to create four partitions based on a DateKey column, you execute the following command:

CREATE PARTITION FUNCTION DateKeyRange_PF (int)
AS RANGE LEFT FOR VALUES (20021231, 20031231, 20041231);

Either LEFT or RIGHT boundaries can be specified in the partition function. If no partition boundary is specified LEFT is used as default. Table 14-3 describes the partitions created by the preceding partition function.

Table 14-3: Partition Results for DateKeyRange Function.

Partition No.

Description

1

All records with DateKey <= 20021231

2

Records between Datekey>20021231 and Datekey<=20031231

3

Records between Datekey>20031231 and Datekey<=20041231

4

All records with DateKey > 20041231

2.   To determine the partition number where a record will be placed based on the DateKey column value, use the $PARTITION function as follows:

SELECT ‘20010601’ DateKey, $PARTITION.DateKeyRange_PF(20010601) PartitionNumber
UNION
SELECT ‘20030601’ DateKey, $PARTITION.DateKeyeRange_PF(20030601) PartitionNumber
UNION
SELECT ‘20040601’ DateKey, $PARTITION.DateKeyRange_PF(20040601) PartitionNumber
UNION
SELECT ‘20050601’ DateKey, $PARTITION.DateKeyeRange_PF(20050601)

 

3.   Now create a partition scheme. For example, create a partition scheme with four filegroups that can be used to hold the four partitions defined in the DateKeyRange_PF partition function as follows:

CREATE PARTITION SCHEME DateKeyRange_ps
AS PARTITION DateKeyRange_PF
TO (FileGroup1, FileGroup2, FileGroup3, FileGroup4, FileGroup5)

 

 

 

Related Posts

Comments

Leave a Reply