Dangers of Oracle Logical Primary Keys

By Michael Rosenblum, Paul Dorsey on May 28, 2015


Sometimes performance problems are initiated by requirements beyond the control of an average database developer. For example, in a number of front-end environments, you cannot use DMLs against a table or view if it does not have a primary key. With tables, that kind of problem is taken seriously and constraints are not added indiscriminately. Unfortunately, views are usually considered less important, and this may have some significant repercussions.

To illustrate the typical logic, use the tables TEST_TAB_MAIN and TEST_TAB_OTHER. Based upon the way in which they were loaded, values in the column OBJECT_ID are unique across both tables. For this exercise, make them unique only inside of a single table:

p0146-01

After these two updates, you cannot declare OBJECT_ID as a primary key for the view that adds those tables together by UNION ALL, since the unique logical identifier of the row in the view is a combination of two columns {Table Name; Object ID}.

The safe way out of this problem is to do multi-column lookups for all DMLs. However, it is common to see the view being altered in the following way:

p0146-02

A new synthetic column, PK_TX, is introduced to serve as a unique identifier of rows in the view. From a logical standpoint, there is nothing wrong with this approach. However, from the implementation side, it is a catastrophe. The front-end environment is requesting a PK column to use it in DML operations. This means that the following statements are generated under the hood:

p0146-03

It is obvious that Execution Plans for those statements would be suboptimal:

p0147-01

Even if you know that PK_TX is unique, Oracle has no way of getting that information. As a result, it has to duplicate the same activities for both TEST_TAB_MAIN and TEST_TAB_OTHER. First, it needs to build an expression for every row in the view (text string concatenated with OBJECT_ID). There are no exceptions! Second, the received values have to be filtered by a passed constant. This definitely sounds like a waste of resources when you needed to update just one value.

The resulting full table scan is unavoidable because expression-based primary keys of views do not have any physical meaning at the storage level. Having that kind of data access causes a lot of confusion among developers, but that is the price you pay for “point-and-click” GUI tools—or, to be precise, for not understanding what happens after you point and click.

Related Posts

Leave a Reply