Accelerating Large Queries with Storage Offload in Oracle Exadata

on May 30, 2015


Storage offload is a key performance optimization found only in Exadata. Because the Exadata storage server software understands Oracle database file layouts, it can use smart scans to perform part of the work that database servers would normally do themselves. By processing data closer to the storage, Exadata storage servers reduce both the amount of data transferred to the database servers and the processing power required on them. Storage offload operations include the following (an illustrative query follows the list):

  • Predicate filtering, acting on the WHERE clause of SQL statements to exclude rows that do not satisfy the query’s predicates
  • Column projection, returning only the columns required by the query
  • Bloom filter offloading, where the storage software assists in performing Bloom filter-based table joins
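
As an illustration, a single statement can benefit from all three operations at once. The tables and columns below (SALES and PRODUCTS) are hypothetical, used only to show where each operation applies:

    -- Hypothetical tables: SALES(prod_id, amount, sale_date), PRODUCTS(prod_id, category)
    SELECT s.prod_id, SUM(s.amount)                 -- column projection: only these columns come back
      FROM sales s
      JOIN products p ON p.prod_id = s.prod_id      -- join can be helped by an offloaded Bloom filter
     WHERE s.sale_date >= DATE '2015-01-01'         -- predicate filtering runs in the storage cells
       AND p.category  = 'BOOKS'
     GROUP BY s.prod_id;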

Smart Scan in Action

To get the most benefit from smart scan, you need a large table. In this example, we’ll use a large join on Oracle’s built-in DBA_SOURCE view to give us plenty of rows to select from. From this row source, we’ll create a table of 1 million rows, each occupying almost 8KB, for a table of nearly 8GB in total. We will disable compression and the flash cache, as these will be covered in future examples.

We will also disable in-memory parallel query by setting PARALLEL_DEGREE_POLICY to MANUAL because it can cause Oracle to attempt to cache the table in the buffer cache, affecting physical I/O statistics. These examples are designed to be run from a user other than the built-in SYS user. Feel free to create a new user just for these examples. (See Chapter 5 for more information about creating users.)

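A minimal sketch of this setup, assuming a hypothetical table name LARGETAB and two RPAD()-built filler columns (the original listing may differ in names and sizes):

    -- Avoid in-memory parallel query caching the table in the buffer cache
    ALTER SESSION SET parallel_degree_policy = MANUAL;

    -- NOCOMPRESS and CELL_FLASH_CACHE NONE keep compression and the flash cache
    -- from influencing the I/O numbers
    CREATE TABLE largetab
      NOCOMPRESS
      STORAGE (CELL_FLASH_CACHE NONE)
    AS
    SELECT ROWNUM               AS num,
           RPAD('x', 3500, 'x') AS filler1,
           RPAD('x', 3500, 'x') AS filler2
      FROM dba_source a, dba_source b
     WHERE ROWNUM <= 1000000;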

And now we count the rows:

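A sketch of the count, again using the hypothetical LARGETAB:

    -- A full scan of a large, uncached table is a candidate for smart scan
    SELECT COUNT(*) FROM largetab;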

At this point, we will retrieve two of the statistical counters that the Oracle database maintains for our session: the total bytes read from storage, and the total returned to the database servers via smart scan:

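One way to read those counters is to join V$MYSTAT to V$STATNAME. The statistic names below are the usual Exadata offload counters; the original listing may have queried them slightly differently:

    SELECT sn.name, ms.value
      FROM v$mystat   ms
      JOIN v$statname sn ON sn.statistic# = ms.statistic#
     WHERE sn.name IN (
             'cell physical IO bytes eligible for predicate offload',
             'cell physical IO interconnect bytes returned by smart scan');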

This table has three columns: NUM, which is a simple number a few bytes in size, and two filler columns that bring each row to nearly 7 kilobytes. The 7KB row size ensures that each 8KB data block (the default size) holds a single row. To count the rows, we only need the first column, and Exadata’s column projection indeed returns us that first column only. This means that of the 8192 million bytes of data read from storage, the database servers had to process only 129 million, a 98.5 percent reduction.

To look at predicate filtering, we’ll build on the last example. But instead of counting all the rows in the table, we’re going to count only those rows whose row number ends in the digit 8. This can be accomplished with the SQL MOD() function.

Reconnect to SQL*Plus to reset the V$MYSTAT counters:

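For example, with hypothetical credentials:

    -- Reconnecting starts a fresh session, so its V$MYSTAT counters begin at zero
    CONNECT exadata_user/your_password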

Confirm that the MOD() function is offloadable:

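The V$SQLFN_METADATA view lists which SQL functions the storage cells can evaluate; a sketch of the check:

    SELECT name, offloadable
      FROM v$sqlfn_metadata
     WHERE name = 'MOD';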

We now run the same COUNT(*) query as before, but add a SQL WHERE clause with the MOD() function:

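A sketch, assuming the same hypothetical LARGETAB and its NUM column:

    SELECT COUNT(*)
      FROM largetab
     WHERE MOD(num, 10) = 8;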

The count should return 100000, as we’re selecting 10 percent of the table:

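Rerunning the earlier counter query, this time scaled to megabytes to match the figures quoted below:

    SELECT sn.name, ROUND(ms.value / 1024 / 1024) AS mb
      FROM v$mystat   ms
      JOIN v$statname sn ON sn.statistic# = ms.statistic#
     WHERE sn.name IN (
             'cell physical IO bytes eligible for predicate offload',
             'cell physical IO interconnect bytes returned by smart scan');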

So instead of returning 129MB of data from the storage servers to the database servers, the transfer volume has dropped to 14MB through predicate filtering.

Why Smart Scans May Not Happen

A commonly asked question is: “Why doesn’t my query use smart scan?” Under the hood, the Oracle software makes a number of checks designed to make sure that a smart scan can return correct results and provide a performance advantage. Common reasons why storage operations aren’t offloaded include the following (a quick way to check whether a particular statement was offloaded is sketched after this list):

  • The operation wasn’t a full table scan or an index fast full scan.
  • The table is too small to offload efficiently.
  • There are no WHERE clause predicates to filter on.
  • Direct path reads, which smart scans require, are not being used, often because the query is not running in parallel.
  • Consistent read is required due to in-flight object modifications.
  • The operation is against a LOB or hash cluster object.
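
One quick way to see whether a particular statement was offloaded is to check the offload-related columns in V$SQL; a sketch, filtering on the hypothetical LARGETAB query from earlier:

    -- IO_CELL_OFFLOAD_ELIGIBLE_BYTES = 0 means the statement was not offloaded at all
    SELECT sql_id,
           io_cell_offload_eligible_bytes,
           io_interconnect_bytes
      FROM v$sql
     WHERE sql_text LIKE 'SELECT COUNT(*)%largetab%';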
