A surprising number of people in online forums request information about slow queries without having tried to add an index to a frequently accessed field. As you know from Chapter 3, tables with fields that are accessed frequently can be ordered by creating an index. An index points to the place on a database where specific data is located, and creating an index on a field sorts the information in that field. When the server needs to access that information to execute a query, it knows where to look because the index points to the relevant location.
Indexing is even more important on multitable queries. If it takes a while to do a full table scan on one table, imagine how much longer it would take if you have several tables to check. If optimization of your queries is a goal, the first thing to do is to try implementing an index.
Deciding which fields should be indexed involves several considerations. If you have a field involved in searching, grouping, or sorting, indexing it will likely result in a performance gain. These include fields that are part of join operations or fields that appear with clauses such asWHERE, GROUP BY, or ORDER BY.
Consider the following example:
SELECT a.AircraftID, at.AircraftName FROM aircraft AS a JOIN aircrafttype AS at ON a.AircraftTypeID = at.AircraftTypeID;
The fields that should be indexed here are aircraft.AircraftTypeID and aircrafttype AircraftTypeID because they’re part of a join. If this query is commonly repeated with the same WHERE or HAVING clause, then the fields used in those clauses would also be a good choice for indexing.
Another factor to consider here is that indexes on fields with many duplicate values won’t produce good results. A table column that contains only “yes” or “no” values won’t be improved by indexing. On the other hand, a field where the values are unique (for example, employee Social Security numbers) can benefit greatly from indexing.
You can associate multiple nonunique indexes with a table to improve performance. No limit exists to the number of nonunique indexes that can be created.
Taking this to its logical extreme, then, you might think the more indexes, the merrier. This is a fallacy: Adding an index doesn’t necessarily improve performance. Small tables, for example, don’t need indexing. In addition, every index takes up additional space on the disk—each indexed field requires MySQL to store information for every record in that field and its location within the database. As your indexes build, these tables begin to take up more room. Furthermore, indexing speeds up searches, but slows down write operations, such as INSERT, DELETE, orUPDATE. Until you work with indexing on your database, your first few attempts might not achieve much performance gain.
Certain administrative counters can help you monitor your indexes or come up with candidates for adding an index. Both the SHOW STATUS or mysqladmin extended-status commands display values to consider in terms of indexes.
- If your indexes are working, the value of Handler_read_key should be high. This value represents the number of times a record was read by an index value. A low value indicates that not much performance improvement has been achieved by the added indexing because the index isn’t being used frequently.
- A high value for Handler_read_rnd_next means your queries are running inefficiently and indexing should be considered as a remedy. This value indicates the number of requests to read the next row in sequence. This occurs when a table is scanned sequentially from the first record to the last to execute the query. For frequent queries, this is a wasteful use of resources. An associated index points directly to the record(s), so this full table scan doesn’t need to occur. Poorly functioning indexes could also result in a high number here.
To view these counters, run a command like the one shown here:
mysql> SHOW STATUS LIKE 'handler_read%';
6 rows in set (0.01 sec)
If your SELECT statements frequently end up sorting results by a particular field, use the ALTER TABLE statement with an ORDER BY clause to re-sort the contents of the table by that field. Your SELECT statements will then no longer need an ORDER BY clause, resulting in faster and more efficient reads.
Once you’ve got your tables loaded with data and indexed the way you want them, you should run the ANALYZE TABLE command on them. This command analyzes the data in the table and creates table statistics on the average number of rows that share the same value. This information is used by the MySQL optimizer when deciding which index to use in table joins.
mysql> ANALYZE TABLE airport, aircraft, flight;
3 rows in set (0.00 sec)
It’s a good idea to run the ANALYZE TABLE command frequently, especially after you’ve added a significant amount of data to your table, to ensure that the optimizer is always using the most efficient index.