Optimizing MySQL Stored Routines

on September 28, 2015

Stored routines are one of the newer additions to MySQL and, as such, are significantly less optimized than other components of the server. As a result, it’s generally not advisable for your application to rely entirely on stored routines, as doing so will likely degrade performance significantly. That said, stored routines do have a role to play in application development and cannot be ignored completely. The following section provides some tips that should help in developing stored routines that don’t add undue overhead to an application.

Follow the KISS Principle

MySQL compiles and stores execution plans for stored routines per connection thread. As a result, as the number of clients accessing a stored routine increases, so does the CPU and memory usage required by the stored routine parser. In addition, because there currently exists no mechanism to track cache hits per client or per routine, the server cannot dynamically optimize its caching plan for stored routines. This too results in memory being consumed without necessarily being reclaimed in an optimal fashion.

Keeping these limitations in mind, it’s important that stored procedures and functions in MySQL be as simple as possible. Not only do complex stored routines consume more memory, but they take longer to process, straining both the CPU and your application user’s patience. In a similar vein, recursive or deeply nested stored routines and stored routines that perform large transactions or use long prepared statements can quickly gobble up server memory and CPU cycles, slowing down other threads accessing the server and affecting overall performance.

Optimize SQL Statements Within Routines

In the final analysis, stored routines are simply containers for blocks of SQL statements. Therefore, it follows that optimizing those SQL statements is the best way to improve stored routine performance. Stored routines that contain loops are the ideal targets for this kind of optimization; keeping in mind that statements within a loop are repeatedly executed, it’s often possible to improve performance by “cutting loop flab” and moving out unnecessary statements from the loop. Consider the following simple example, which illustrates:

seg: LOOP
  SET @total = @total + 1;
  IF dt BETWEEN '00:00:00' AND '05:59:59' THEN
      UPDATE summary SET night = night + 1;
  ELSEIF dt BETWEEN '06:00:00' AND '11:59:59' THEN
      UPDATE summary SET morning = morning + 1;
  ELSEIF dt BETWEEN '12:00:00' AND '17:59:59' THEN
      UPDATE summary SET afternoon = afternoon + 1;
  ELSEIF dt BETWEEN '18:00:00' AND '23:59:59' THEN
      UPDATE summary SET evening = evening + 1;
  END IF;
END LOOP seg;

A better way to do this might be to store the separate counts in individual variables and write them to the summary table after the loop has completed. Similarly, it makes more sense to calculate the total after the loop has completed executing, instead of on each iteration. Here’s the revised loop:

seg: LOOP
  IF dt BETWEEN '00:00:00' AND '05:59:59' THEN
      SET n = n + 1;
  ELSEIF dt BETWEEN '06:00:00' AND '11:59:59' THEN
      SET m = m + 1;
  ELSEIF dt BETWEEN '12:00:00' AND '17:59:59' THEN
      SET a = a + 1;
  ELSEIF dt BETWEEN '18:00:00' AND '23:59:59' THEN
      SET e = e + 1;
  END IF;
END LOOP seg;
INSERT INTO summary (morning, afternoon, evening, night)
  VALUES (m, a, e, n);
SET @total = morning + afternoon + evening + night;

TIP

For an example of improving performance by rewriting MySQL cursors as joins, read Roland Bouman’s blog post on the topic, at http://rpbouman.blogspot.com/2006/09/refactoring-mysql-cursors.html.

Don’t Mix Stored Procedures and Triggers

Triggers with the FOR EACH ROW clause are executed for each record in a table. If these triggers are, in turn, linked to a stored procedure, MySQL will execute the stored procedure as many times as there are records in the table. Needless to say, this is expensive and usually unnecessary. A better approach is to make the trigger as specific as possible, making use of session variables or temporary tables for interim data storage or conditional testing, such that the stored procedure is called only when needed.

Optimizing Table Design

In the context of keeping your queries lean and mean, you need to consider several things in terms of table design. First, if a frequently queried table also gets a lot of changes, the way to improve performance is to stay with fixed-length fields, rather than variable-length ones. The trade-off is this: By definition, fixed-length fields take up a certain amount of space, regardless of the content, whereas variable-length fields adjust themselves depending on the data entered. Thus, you’re bound to waste more disk space using fixed-length fields. If it’s speed you’re after, however, MySQL will perform better with fixed- rather than variable-length fields.

If one field in a frequently changed table cannot be formatted to a fixed length, consider moving that field to a separate table and converting the rest of the fields in the original table to fixed-length fields. Although this might not be workable in all circumstances, it is a viable way to achieve the performance gain of using fixed-length fields rather than variable-length fields.

NOTE:

InnoDB tables handle row storage differently from MyISAM or ISAM tables. Using fixed-length fields instead of variable-length ones won’t result in a performance boost with these table types.

This being said, if you are going to use fixed-length fields, make sure the field size is kept to a minimum. For example, when designing a table, creating a CHAR(255) field is often easier than worrying about exactly how big you need to make it. In practice, you might find that a field half that size can adequately take care of your needs. Paring the field size not only takes up less disk space, it also means less I/O when processing—and deleting unnecessary fields entirely can also increase performance.

Another technique to improve performance is to use the OPTIMIZE TABLE command frequently on tables that are modified often (discussed more fully in Chapter 12). Frequent modification results in fragmentation, which, in turn, leads to extra time spent reading unused blocks of space to get at the desired data.

When considering ways of improving performance, check to see if you need all the tables you have set up. Again, when originally designing a table, dividing your data might have seemed like a good idea, but extra tables mean your performance will suffer. Look at the tables you join frequently. Is it possible to combine the data into one table instead? If you find you cannot, for whatever reason, try to match the fields you join. Queries will run more efficiently if joined fields are of the same data type and length

Optimizing Server Settings

If you want your server to perform optimally the best solution is to get tons of memory and big, fast drives. However, in most situations, these brute-force techniques won’t be an option. Given that we operate under less-than-ideal conditions, getting a handle on some subtler techniques for optimizing server performance makes more sense. Accordingly, this section gives a brief overview of some of the major things you can do to fine-tune your server.

As Chapter 10 discusses in detail, MySQL exposes a large number of variables whose values can be modified to meet custom requirements. Some of these variables can be set at the time of starting the MySQL server, and others can be set while the server is running.

When it comes to tuning server variables for maximum performance, the MySQL manual recommends that you first look at the key_buffer_size and table_cache variables.

• The key_buffer_size variable controls the amount of memory available for the MySQL index buffer. The higher this value, the more memory available for indexes and the better the performance. Typically, you would want to keep this value near 25 to 30 percent of the total available memory on the server.

• The table_cache variable controls the amount of memory available for the table cache, and thus the total number of tables MySQL can hold open at any given time. For busy servers with many databases and tables, this value should be increased so that MySQL can serve all requests reliably. Also relevant here is the max connections variable, because the MySQL manual recommends setting the table_cache value using the formula (table_cache = max_connections x N), where N is the number of tables in a typical join.

These values can be changed using the SET command, as in the example shown:

mysql> SET GLOBAL table_cache=200;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT @@table_cache;
+---------------+
| @@table_cache |
+---------------+
|           200 |
+---------------+
1 row in set (0.01 sec)

Note that once you change a global server variable, it remains in effect until the server is shut down. This means if you find a beneficial setting, you need to reset it on startup every time. Because this is cumbersome, it’s useful to know a way of making your changes permanent. This can be accomplished by setting a variable in an option file.

Once you’ve got your table cache and index buffer set up the way you want them, you can turn your attention to the various other memory buffers MySQL uses.

You can speed up queries that use the ORDER BY or GROUP BY clause to sort the result set by increasing the value of MySQL’s sort buffer, controlled via the sort_buffer variable. Also consider increasing the read_rnd_buffer_size variable to speed up reading of the sorted rows.
You can speed up SELECT queries that scan the table sequentially by increasing the size of MySQL’s read buffer via the read_buffer_size variable.
When performing a transaction, MySQL stores the statements that make up the transaction in a cache until it receives instructions to write them to the binary log and commit them to the database. For long or complex transactions, the size of this cache should be increased to obtain better performance via the binlog_cache_size variable.
If you’re planning on so-called “bulk inserts” (that is, inserting multiple records using a single INSERT command), you can speed things up by increasing the value of the bulk_insert_buffer_size variable. However, this only works with the MyISAM table type.
If you’re anticipating a lot of new connections to the server, it’s a good idea to increase the value of the thread_cache_size variable. This variable controls the size of the cache where server threads go when the client they’re servicing disconnects. Threads from this cache are then reused to service new connections. The higher this value, the more threads can be cached, and the better the response time to new connection requests.