Removing Duplicate, Repeating or Unnecessary SQL Statements in MySQL Improves Performance

By Ronald Bradford on September 28, 2015

Eliminating overhead that adds unnecessary load to database servers when SQL statements are unnecessary can improve MySQL performance, including removing duplicate, repeating or unnecessary statements.

Removing Duplicate SQL Statements

Capture of all SQL statements for a given function or process will highlight any duplicate SQL statements that are executed to complete a specific request. The best practice is to enable the general query log in development environments. Analysis of all SQL statements should be the responsibility of the developer to ensure that only necessary SQL statements are executed. Adding instrumentation to your application to report the number of SQL statements and provide debugging for dynamic viewing of all SQL statements easily enables more information to identify duplicate statements. The use of application frameworks can be a primary cause of unnecessary duplicate SQL statements.

Removing Repeating SQL Statements

Many applications suffer from Row At a Time (RAT) processing. Also known as the N+1 problem, the cause is an outer loop that generates an SQL statement per row. This can result in hundreds to thousands of repeating SQL statements. In many situations, applying a single SQL statement to achieve Chunk At a Time (CAT) processing can eliminate repeating SQL statements. Using the capability of set processing, which is a strength of SQL, can greatly improve performance. Here is an example:

SELECT name FROM firms WHERE id=727;
SELECT name FROM firms WHERE id=758;
SELECT name FROM firms WHERE id=857;
SELECT name FROM firms WHERE id=740;
SELECT name FROM firms WHERE id=849;
SELECT name FROM firms WHERE id=839;
SELECT name FROM firms WHERE id=847;
SELECT name FROM firms WHERE id=867;
SELECT name FROM firms WHERE id=829;
SELECT name FROM firms WHERE id=812;
SELECT name FROM firms WHERE id=868;
SELECT name FROM firms WHERE id=723;

This sequential row processing can be rewritten as follows:

SELECT id, name
FROM firms
WHERE id IN (723, 727, 740, 758, 812, 829, 839, 847, 849, 857, 867, 868);

The profiling functionality can demonstrate the full overhead of this type of SQL execution:

SET PROFILING=1;
SELECT . . .
SHOW PROFILES;

SELECT 'Sum Individual Queries' AS txt,SUM(DURATION) AS total_time FROM
INFORMATION_SCHEMA.PROFILING WHERE QUERY_ID BETWEEN 1 AND 12
UNION
SELECT 'Combined Query',SUM(DURATION) FROM INFORMATION_SCHEMA.PROFILING WHERE
QUERY_ID = 13;

The use of application frameworks can be a primary cause of unnecessary repeating SQL statements.

This example of repeating SQL statements is part of a more common N+1 scenario, where the SQL developer does not understand how to use table joins correctly. This practice involves a loop on an outer set of records, and then querying a subsequent table for all records in the loop. The result is the combined SELECT statement can be eliminated completely. Here is an example:

SELECT a.id, a.firm_id, a.title
FROM   article a
WHERE  a.type=2
AND    a.created > '2011-06-01';

// For loop for all records
SELECT id, name
FROM   firm
WHERE  id = :firm_id;

This loop of SQL statements can be replaced with the following single SQL statement:

SELECT a.id, a.firm_id, f.name, a.title
FROM article a
INNER JOIN firm f ON a.firm_id = f.id
WHERE a.type=2
AND a.created > ‘2011-06-01’;

Removing Unnecessary SQL Statements

Applications that are modified and enhanced over time, such as the following examples, can introduce unnecessary SQL statements:

Information selected that is no longer required
Information selected that is used only for certain paths of a given function
Information that can be selected with a preceding SQL statement

Functionality that has grown over time can easily suffer from this. Capturing and reviewing all SQL statements might not indicate occurrences of unnecessary SQL. Understanding the application needs and what is presented to the end user or used in processing might be necessary to identify improvements. The classic SELECT * is an example for which there is no understanding of what data is actually required without full analysis of the code executing the SQL statement.