Finding Out How a Query Will Be Processed in MySQL

on September 28, 2015

Attaching the EXPLAIN keyword to the beginning of a SELECT query tells MySQL to return a chart describing how this query will be processed. Included within this chart is information on which tables the query will access and the number of rows the query is expected to return. This information comes in handy to see which tables should be indexed to speed up performance and to analyze where the bottlenecks are.

CAUTION

Only queries that are textually exact will match what’s in the query cache; any difference will be treated as a new query. For example, SELECT * FROM airport won’t return the result from select * FROM airport in the cache.

As an example, consider the following query:

SELECT p.PaxName, f.FlightID
  FROM pax AS p,
  flight AS f, route AS r
  WHERE p.FlightID = f.FlightID
  AND p.ClassID = 2 AND r.Duration = 85;

Now, by adding the EXPLAIN keyword to the beginning of the query, one can obtain some information on how MySQL processes it:

mysql> EXPLAIN SELECT p.PaxName, f.FlightID
    -> FROM pax AS p,
    -> flight AS f, route AS r
    -> WHERE p.FlightID = f.FlightID
    -> AND p.ClassID = 2 AND r.Duration = 85\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: p
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 30
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: r
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 290
        Extra: Using where; Using join buffer
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: f
         type: eq_ref 
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 2
          ref: db1.p.FlightID
         rows: 1
        Extra: Using where
3 rows in set (0.00 sec)

This might all seem a little intimidating, so an explanation is in order. The result of EXPLAIN SELECT is a table listing all the SELECTs in the query, together with how MySQL plans to process them.

The id field indicates the position of the SELECT within the complete query, while the table field holds the name of the table being queried.
The select_type field indicates the type of query: a simple query without subqueries, a UNION, a subquery, an outer query, a subquery within an outer query, or a subquery in a FROM clause.
The type field indicates how the join will be performed. A number of values are possible here, ranging from const (the best kind of join, since it means the table contains a single matching record only) to all (the worst kind, because it means that MySQL has to scan every single record to find a match to records in the other joined tables).
The possible_keys field indicates the indexes available for MySQL to use in order to speed up the search.
The key field indicates the key it will actually use, with the key length displayed in the key_len field.
The rows field indicates the number of rows MySQL needs to examine in the corresponding table to successfully execute the query. To obtain the total number of rows MySQL must scan to process the complete query, multiply the rows value for each table together.
The Extra field contains additional information on how MySQL will process the query—say, by using the WHERE clause, by using an index, with a temporary table, and so on.

Now, from the previous output, it’s clear that in order to execute the query, MySQL will need to examine all the rows in two of the named tables. The total number of rows MySQL needs to scan, then, is approximately 290 × 30 = 8,700 rows—an extremely large number!

However, by reviewing the output of the EXPLAIN SELECT command output, it’s clear that there is room for improvement. For example, the possible_keys field for some of the tables is NULL, indicating that MySQL couldn’t find any indexes to use. This can quickly be rectified by reviewing the tables and adding indexes wherever possible:

mysql> ALTER TABLE pax ADD INDEX (ClassID);
Query OK, 30 rows affected (0.06 sec)
Records: 30  Duplicates: 0  Warnings: 0
mysql> ALTER TABLE route ADD INDEX (Duration);
Query OK, 290 rows affected (0.06 sec)
Records: 290  Duplicates: 0  Warnings: 0

Now, try running the query again with EXPLAIN:

mysql> EXPLAIN SELECT p.PaxName, f.FlightID
    -> FROM pax AS p,
    -> flight AS f, route AS r
    -> WHERE p.FlightID = f.FlightID
    -> AND p.ClassID = 2 AND r.Duration = 85\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: p
         type: ref
possible_keys: ClassID
          key: ClassID
      key_len: 4
          ref: const
         rows: 1
        Extra:
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: r
         type: ref
possible_keys: Duration
          key: Duration
      key_len: 2
          ref: const
         rows: 1
        Extra: Using index
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: f
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 2
          ref: db1.p.FlightID
         rows: 1
        Extra: Using where; Using index
3 rows in set (0.00 sec)

As you can see, MySQL is now using the newly added indexes to cut down on the number of rows that need to be examined. Looking at the rows field for each table, we now see that MySQL only needs to scan one row in each table to process the query—a significant improvement over the earlier, nonindexed approach.

Finding Out How a Query Will Be Processed in MySQL

on September 28, 2015

Related Posts

Leave a Reply Cancel reply