Optimizing a MySQL Query

By: Ronald Bradford


Identifying a slow-running SQL query is a necessary prerequisite for any type of optimization. Here, we detail some of the tools and principles needed to weigh the various options and arrive at an ideal solution.

What You Should Not Do

If you lived in the wild west, where no rules applied, you might consider adding an index to the inventory table based on the column in the WHERE clause. Here is an example:

mysql> ALTER TABLE inventory ADD INDEX (item_id);
Query OK, 734787 rows affected (54.22 sec)
Records: 734787 Duplicates: 0 Warnings: 0

CAUTION

Do not try this in a production environment without additional verification!

Many factors go into the decision to add a new index and to deploy it. This statement highlights just one potential impact on a production environment. This Data Definition Language (DDL) statement took about 54 seconds to complete. During that time, any additional queries that add or modify data for this table are blocked, because the ALTER statement is a blocking operation. Depending on the order in which other Data Manipulation Language (DML) statements are executed, SELECT statements can also be blocked from completing during this time. For larger tables, an ALTER statement can take hours, or even days, to complete! A second impact to consider is the performance overhead that each additional index adds to DML statements on the table.
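As a hedged aside on the blocking behavior described above: MySQL 5.6 and later accept ALGORITHM and LOCK clauses on ALTER TABLE. These ask the server to build a secondary index without copying the table and while still permitting concurrent reads and writes. The sketch below assumes such a version and an InnoDB table. It does not remove the I/O cost of building the index, nor the ongoing DML overhead of maintaining it.

```sql
-- Assumes MySQL 5.6 or later and an InnoDB table; older versions reject these clauses.
-- ALGORITHM=INPLACE asks for the index to be built without copying the table.
-- LOCK=NONE asks for concurrent reads and writes to remain possible; if the
-- server cannot honor the request, the statement fails with an error instead
-- of silently falling back to a blocking operation.
ALTER TABLE inventory
  ADD INDEX item_id (item_id),
  ALGORITHM=INPLACE,
  LOCK=NONE;
```

Stating the clauses explicitly turns an unexpected blocking ALTER into an immediate error, which is usually the safer failure mode. The statement should still be verified on a non-production copy first.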

Confirm Your Optimization

By re-running the SQL query, you can see an immediate improvement with the query now taking less than 10 milliseconds.

mysql> SELECT * FROM inventory WHERE item_id = 16102176;
Empty set (0.00 sec)

You can also confirm the effectiveness of the new index by looking at the revised Query Execution Plan (QEP):

mysql> EXPLAIN SELECT * FROM inventory WHERE item_id = 16102176\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: inventory
         type: ref
possible_keys: item_id
          key: item_id
      key_len: 4
          ref: const
         rows: 1
        Extra:

The MySQL optimizer has now selected an index as indicated by the value in the key column, and the number of rows estimated to be examined during the execution of the SQL statement was 1, compared with the original value of 787,338.

The Correct Approach

Adding an index to a table can improve query performance; however, an additional index always has other implications. Chapters 4 and 5 will discuss the pros and cons of adding indexes on table columns. Before you choose to add an index, you should always perform at least two checks: the first to verify the existing structure of the table, and the second to confirm the size of the table. You can obtain this information using the following SQL commands:

mysql> SHOW CREATE TABLE inventory\G
*************************** 1. row ***************************
Create Table: CREATE TABLE `inventory` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `supp_id` int(10) unsigned NOT NULL DEFAULT '0',
  `item_id` int(10) unsigned NOT NULL DEFAULT '0',
  `qty` int(11) NOT NULL,
  `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `supp_id` (`supp_id`,`item_id`),
  KEY `created` (`created`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

mysql> SHOW TABLE STATUS LIKE 'inventory'\G
*************************** 1. row ***************************
           Name: inventory
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 679890
 Avg_row_length: 371
    Data_length: 252395520
Max_data_length: 0
   Index_length: 40861696
      Data_free: 0
 Auto_increment: 1612406
    Create_time: 2010-08-17 20:16:13
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment: InnoDB free: 644096 Kb

From these commands, you can determine that the current table structure includes a number of indexes, including one that uses the item_id column: the unique key named supp_id, defined on (supp_id, item_id). This index was not used, however, because the query does not filter on supp_id, the leftmost column of the index. You also get an approximate size of the table from the Data_length and Rows values in the SHOW TABLE STATUS output. Chapters 4 and 5 will further discuss the importance of this information in determining the time impact of adding an index and the impact of having multiple indexes on the same column.
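The same two checks can also be made with queries that are easier to read at a glance. The following is a minimal sketch rather than a prescribed method. It assumes the inventory table lives in the currently selected schema, and the data_mb and index_mb aliases are illustrative names. With the figures shown above, a Data_length of 252395520 bytes works out to about 240.7 MB and an Index_length of 40861696 bytes to about 39.0 MB.

```sql
-- List every index on the table, one row per indexed column.
-- Seq_in_index = 1 marks the leftmost column of each index.
SHOW INDEX FROM inventory;

-- Approximate row count and sizes in MB for the table in the current schema.
-- For InnoDB, table_rows is an estimate, not an exact count.
SELECT table_name,
       table_rows,
       ROUND(data_length  / 1024 / 1024, 2) AS data_mb,
       ROUND(index_length / 1024 / 1024, 2) AS index_mb
FROM   information_schema.tables
WHERE  table_schema = DATABASE()
  AND  table_name   = 'inventory';
```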

An Alternative Solution

Looking at this SQL statement in isolation, a DBA or architect might elect to create an index, as described. The correct approach to optimizing SQL, however, includes understanding and verifying the purpose of the SQL statement and of related SQL statements for this table. Such an analysis would reveal that the application code executing this SQL statement already holds additional information that can improve the query: the value for supp_id was known at the time the statement was executed. By altering the SQL statement to include this column in the WHERE clause, the existing index would be used, and no schema changes would be necessary.
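A sketch of the altered statement might look like the following. The supp_id value of 1234 is hypothetical, since the real value comes from the application. Because both columns of the existing unique key are now matched, one would expect the key column of the QEP to show supp_id, though this should be verified with EXPLAIN against real data.

```sql
-- 1234 is a hypothetical supp_id; the application supplies the real value.
SELECT * FROM inventory WHERE supp_id = 1234 AND item_id = 16102176;

-- Check the plan before relying on it: the key column should name the
-- existing supp_id index rather than being NULL.
EXPLAIN SELECT * FROM inventory WHERE supp_id = 1234 AND item_id = 16102176;
```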

In this example, adding an index was not the ideal approach to addressing the observed slow query; without further analysis, the table would have the overhead of an additional unnecessary index.
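If the single-column index had already been added before this analysis, it could be removed again. The sketch below assumes that every query against the table can supply supp_id and that nothing else relies on the item_id index; both assumptions would need to be checked first.

```sql
-- Confirm which indexes exist before removing anything.
SHOW INDEX FROM inventory;

-- Drop the single-column index added earlier (named item_id in the EXPLAIN output).
-- This is also a DDL statement, so the same production cautions apply.
ALTER TABLE inventory DROP INDEX item_id;
```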
