With the release of SQL Server 2008, we finally now have a single statement that allows an INSERT, UPDATE, and even DELETE within one TSQL statement. However, it seems to be rarely used. Let’s look at it once again and see if there are any cons to using it.
MERGE Syntax
The MERGE syntax is discussed in BOL and many other articles, but let’s look at how it was used in this test scenario. See Script 3 at the bottom of this article. Note that I used source and target as the aliases for the query. This helps me keep track of the structure of the query.
The source can be a table or a query (in this case a query). The ON clause is what “joins” the source and target and is actually the definition of the MATCH condition. When “Matched” and other conditions may apply (CUD in this specific instance) you normally perform an update. When “NOT MATCHED” (By TARGET is default), you normally would specify an INSERT statement as shown. The “NOT MATCHED BY SOURCE” is the normal condition when you want to do a hard delete on the SOURCE table.
One of the nice attributes of the MERGE statement is the $action parameter when combined with the inserted and deleted actions allows easy tracking of what rows were actually DELETED, INSERTED, or UPDATED.
Test Setup
The Sales.SalesOrderDetail table in the AdventureWorks2012 OLAP database was selected for its universal availability and for its size. To reduce the complexity of the test, all Foreign Keys, non-clustered indexes and the applicable trigger were removed to provide a relatively simple query plan.
Script 1 at the bottom of this paper shows the details for the setup. Since the table has roughly 120,000 rows I decided to look at the middle 40,000 rows dividing out 16,000 for update, 16,000 for insert and 8,000 for the delete. Script 2 shows the applicable update, insert, and delete statements used to modify the table. Script 3 shows the same logic only using the MERGE statement to achieve the same results.
Test Conduct
Before each test run, the database was restored back to original condition and Script 1 run to initialize the data needed for the test. Profiler was used to capture Reads, Writes, CPU, RowCounts, and Duration for each test. Scripts 2 and 3 were run (after DB restore and Script 1 execution) to gain comparison data. Tests were repeated several times and the values were consistent each time.
Test Results
The test results were not as expected. Many earlier articles on the MERGE statement have touted its reduced IO performance, but this was not the result as shown below:
Reads | Writes | CPU | Row Count | Duration | |
UPDATE | 51915 | 24 | 109 | 16175 | 118 |
INSERT | 55798 | 299 | 109 | 16176 | 263 |
DELETE | 27995 | 80 | 62 | 8089 | 102 |
135708 | 403 | 280 | 40440 | 483 | |
MERGE | 181367 | 442 | 1484 | 40440 | 2471 |
For all four performance measures, the combined individual UPDATE, INSERT, and DELETE statements outperformed the MERGE statement. I studied the individual query plans for each test case and saw no surprises – in fact, the query plans for the individual statements called for suggested indexes for all three statements while the MERGE query plan didn’t specify any suggested indexes.
I’m not sure what to make of these results. While logically it would seem that the MERGE statement would have less IO; for this specific instance (I think this test case is a common scenario), the MERGE statement performed worse than the 3 individual statements.
I commented out the OUTPUT part of the MERGE statement and that resulted in minimal change in the performance data. I think the only logical conclusion is that the query optimizer is more refined for the three individual operations than for the more recent MERGE operation.
While I like the concept of the MERGE statement I have to conclude to be very careful in its application. Every scenario is different. As always, test, test, and test again!
Script 1
/* Script to setup tables before running merge vs update/insert/delete comparison Restore AdventureWorks2012 back to original state before running this script */ USE AdventureWorks2012 GO IF OBJECT_ID (N'tempdb..#Temp') IS NOT NULL DROP TABLE #Temp GO --get data for comparison (middle of adventureworks, 40,000+ rows; 16175 to update, 16176 to insert, 8089 to delete select *, '' CUD, Row_Number() OVER(ORDER BY SalesOrderDetailID) RowNumber INTO #Temp FROM Sales.SalesOrderDetail WHERE SalesOrderDetailID BETWEEN 40439 AND 80878 --delete last 20% rows in temp to force delete in Merge DELETE #Temp WHERE SalesOrderDetailID BETWEEN 72790 AND 80878 --alternate rows for insert/update UPDATE #Temp SET CUD = CASE WHEN RowNumber % 2 = 1 THEN 'I' ELSE 'U' END GO --Drop existing nonclustered indexes, Foreign Keys, and trigger to simplify example DROP INDEX [AK_SalesOrderDetail_rowguid] ON [Sales].[SalesOrderDetail] WITH ( ONLINE = OFF ) DROP INDEX [IX_SalesOrderDetail_ProductID] ON [Sales].[SalesOrderDetail] WITH ( ONLINE = OFF ) ALTER TABLE [Sales].[SalesOrderDetail] DROP CONSTRAINT [FK_SalesOrderDetail_SalesOrderHeader_SalesOrderID] ALTER TABLE [Sales].[SalesOrderDetail] DROP CONSTRAINT [FK_SalesOrderDetail_SpecialOfferProduct_SpecialOfferIDProductID] DROP TRIGGER [Sales].[iduSalesOrderDetail] GO --DELETE rows out of Sales.SalesOrderDetail needed for insert DELETE sod FROM Sales.SalesOrderDetail sod JOIN #Temp t ON t.SalesOrderDetailID = sod.SalesOrderDetailID WHERE t.CUD = 'I' GO --modify data for update/insert UPDATE #Temp SET UnitPrice = UnitPrice * 1.05, ModifiedDate = GETDATE() GO
Script 2
/* Script to test Update/Insert/Delete TSQL statements for comparison against Merge statement */ SET STATISTICS IO ON GO UPDATE sod SET UnitPrice = t.UnitPrice, ModifiedDate = GETDATE() FROM Sales.SalesOrderDetail sod JOIN #Temp t on t.SalesOrderDetailID = sod.SalesOrderDetailID WHERE t.CUD = 'U' SET IDENTITY_INSERT Sales.SalesOrderDetail ON GO INSERT Sales.SalesOrderDetail (SalesOrderID,SalesOrderDetailID,CarrierTrackingNumber,OrderQty,ProductID,SpecialOfferID,UnitPrice,UnitPriceDiscount,rowguid,ModifiedDate) SELECT SalesOrderID,SalesOrderDetailID,CarrierTrackingNumber,OrderQty,ProductID,SpecialOfferID,UnitPrice,UnitPriceDiscount,rowguid,ModifiedDate FROM #Temp t WHERE t.CUD = 'I' GO SET IDENTITY_INSERT Sales.SalesOrderDetail OFF GO DELETE sod FROM Sales.SalesOrderDetail sod WHERE NOT EXISTS (SELECT 1 FROM #Temp t WHERE t.SalesOrderDetailID = sod.SalesOrderDetailID) AND sod.SalesOrderDetailID BETWEEN 40439 AND 80878 GO SET STATISTICS IO OFF GO
Script 3
/* Script to test Update/Insert/Delete TSQL statements for comparison against Merge statement */ SET STATISTICS IO ON; SET IDENTITY_INSERT Sales.SalesOrderDetail ON; MERGE Sales.SalesOrderDetail as target USING (SELECT SalesOrderID,SalesOrderDetailID,CarrierTrackingNumber, OrderQty,ProductID,SpecialOfferID,UnitPrice, UnitPriceDiscount,rowguid,ModifiedDate,CUD FROM #Temp) as source ON source.SalesOrderDetailID = target.SalesOrderDetailID WHEN MATCHED AND source.CUD = 'U' THEN UPDATE SET target.UnitPrice = source.UnitPrice, target.ModifiedDate = GETDATE() WHEN NOT MATCHED THEN INSERT (SalesOrderID,SalesOrderDetailID,CarrierTrackingNumber,OrderQty,ProductID,SpecialOfferID,UnitPrice,UnitPriceDiscount,rowguid,ModifiedDate) VALUES (SalesOrderID,SalesOrderDetailID,CarrierTrackingNumber,OrderQty,ProductID,SpecialOfferID,UnitPrice,UnitPriceDiscount,rowguid,ModifiedDate) WHEN NOT MATCHED BY SOURCE AND target.SalesOrderDetailID BETWEEN 40439 AND 80878 THEN DELETE OUTPUT $action, DELETED.SalesOrderDetailID AS TargetSalesOrderDetailID, INSERTED.SalesOrderDetailID AS SourceSalesOrderDetailID INTO #Output; SELECT @@ROWCOUNT; SELECT [Action], Count(*) FROM #Output GROUP By [Action] GO SET IDENTITY_INSERT Sales.SalesOrderDetail OFF GO SET STATISTICS IO OFF GO
Leave a Reply