Why SQL Server Update Operations Must Also Be Optimized

By:


Update operations are an intrinsic part of database operations, and they also need to be optimized so that they can be performed as quickly as possible. Keep in mind that “updates” in this section refer to any operation performed by the INSERT, DELETE, and UPDATE statements, as well as the MERGE statement, which was introduced in SQL Server 2008. In this section we will go over the basics of update operations and how they can quickly become complicated, as they need to update existing indexes, access multiple tables, and enforce existing constraints. We’ll see how the query optimizer can select per-row and per-index plans to optimize UPDATE statements, and go over the Halloween protection problem as well as how SQL Server avoids it.

Even when performing an update involves some other areas of SQL Server, such as transactions, concurrency control, or locking, update processing is still totally integrated within the SQL Server query processor framework. Update operations are also optimized so they can be performed as quickly as possible. So, in this section, we discuss updates from the query-processing point of view.

Update plans can be complicated because they need to update existing indexes alongside data. Also, because of objects such as check constraints, referential integrity constraints, and triggers, those plans may also have to access multiple tables and enforce existing constraints. Updates may also require the updating of multiple tables when cascading referential integrity constraints or triggers are defined. Some of these operations, such as updating indexes, can have a big impact on the performance of the entire update operation.

Update operations are performed in two steps, which can be summarized as a read section followed by the update section. The first step provides the details of the changes to apply and which records will be updated. For INSERT operations, this includes the values to be inserted, and for DELETE operations, it includes obtaining the keys of the records to be deleted, which could be the clustering keys for clustered indexes or the RIDs for heaps. Just to keep you on your toes, for update operations, a combination of both the keys of the records to be updated and the data to be inserted is needed. In this first step, SQL Server may read the table to be updated just like in any other SELECT statement. In the second step, the update operations are performed, including updating indexes, validating constraints, and executing triggers. The update operation will fail and roll back if it violates a constraint.

Let’s start with an example of a very simple update operation. Inserting a new record on the Person.CountryRegion table using the following query creates a very simple plan, as shown in Figure 1:

f0159-01

Figure 1 An insert example

p0159-01

However, the operation gets complicated very quickly when you try to delete the same record by running the next statement, as shown on the plan in Figure 2:

f0160-01

Figure 2 A delete example

As you can see in this plan, in addition to CountryRegion, three additional tables (StateProvince, CountryRegionCurrency, and SalesTerritory) are accessed. The reason behind this is that these three tables have foreign keys referencing CountryRegion, so SQL Server needs to validate that no records exist on these tables for this specific value of CountryRegionCode. Therefore, the tables are accessed and an Assert operator is included at the end of the plan to perform this validation. If a record with the CountryRegionCode to be deleted exists in any of these tables, the Assert operator will throw an exception and SQL Server will roll back the transaction, returning the following error message:

p0160-01

As you can see, the previous example shows how update operations can access some other tables not included in the original query—in this case, because of the definition of referential integrity constraints.

Leave a Reply