When it is not possible to eliminate an SQL statement to improve performance, it might be possible to simplify the statement. Consider the following questions:
- Are all columns required?
- Can a table join be removed?
- Is a join or WHERE restriction necessary for additional SQL statements in a given function?
Column Improvement
An important requirement of simplification is to capture all SQL statements in order for a given executed function. Using a sampling process will not identify all possible improvements. Here is an example of a query simplification:
mysql> SELECT fid, val, val -> FROM table1 -> WHERE fid = X;
This query returned 350,000 rows of data that was cached by the application server during system startup. For this query, two-thirds of the result set that was passed over the network was redundant. The fid column was passed as a restriction to the SQL statement, and therefore the value was identical for all rows. The val column was also unnecessarily duplicated. The optimal SQL statement is shown here:
mysql> SELECT val -> FROM table1 -> WHERE fid = X;
Join Improvement
The following is an example of a table join simplification that requires knowledge of all SQL statements for a given function. During the process, the following SQL statement is executed to retrieve a number of rows:
mysql> SELECT /* Query 1 */ id FROM table1 -> WHERE col1 = X -> AND col2 = Y;
At a later time in the execution of this function, the following statement was executed using the id value from the previous SQL statement:
mysql> SELECT /* Query 2 */ table2.val1, table2.val2, table2.val3 -> FROM table2 INNER JOIN table1 USING (id) -> WHERE table2.id = 9 -> AND table1.col1 = X -> AND table1.col2 = Y -> AND table2.col1 = Z;
This second SQL statement could be simplified to this:
mysql> SELECT /* Query 2 */ val1, val2, val3 -> FROM table2 -> WHERE table2.id = 9 -> AND table2.col1 = Z
As the first query performs the necessary restriction for the id column (that is, col1 = X and col2 = Y), the join and restriction clauses for table1 are redundant. Removing this join condition simplifies the SQL statement and removes a potential problem if only one statement is altered in the future.
Rewriting Subqueries
The MySQL database supports subqueries. The performance of subqueries may be significantly slower in certain circumstances than using a normal table join. Here is an example:
SELECT id, label FROM code_opts WHERE code_id = (SELECT id FROM codes WHERE typ='CATEGORIES') ORDER BY seq
This SQL statement can simply be rewritten as follows:
SELECT o.id, o.label FROM code_opts o INNER JOIN codes c ON o.code_id = c.id WHERE c.typ='CATEGORIES' ORDER BY o.seq
The change might appear to be subtle; however, this approach for more complex queries can result in improved query performance.
Understanding the Impact of Views
Developers should know the true class of the table that is used for SQL statements. If the object is actually a view, the impact of SQL optimizations can be masked by the complexity of join conditions for the view definition. A common problem with data warehouse (DW) and business intelligence (BI) tools is the creation of views on top of views. For any view definition, additional information may be retrieved that is not ultimately required for an SQL statement. In MySQL, complex queries involving views can be easily improved by using the necessary underlying tables.
Leave a Reply