Selecting Table Storage Engines in MySQL

By: Vikram Vaswami


MySQL supports many different storage engines for its tables, each with its own advantages and disadvantages. While all of MySQL’s storage engines are reasonably efficient, choosing the wrong one can keep your application from reaching its maximum possible performance. For example, using the ARCHIVE engine for a table that sees frequent reads and writes will produce significantly slower performance than using the MyISAM engine for the same table.

Selecting the Right Table Storage Engine in MySQL

To help you choose the most appropriate engine for your table, the following sections discuss each of these engines in greater detail.
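
Before comparing engines, it can help to see which ones a given server actually offers. A minimal check, assuming you are connected through any MySQL client:

```sql
-- List every storage engine this server knows about.
-- The Support column shows whether each engine is available,
-- and which one is the server default.
SHOW ENGINES;
```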

The MyISAM Storage Engine

The MyISAM storage engine extends the base ISAM type with a number of additional optimizations and enhancements, and was MySQL’s default table type until MySQL 5.5, when InnoDB became the default. MyISAM tables are optimized for compression and speed, and are immediately portable between different OSs and platforms (for example, the same MyISAM table can be used on both Windows and UNIX OSs). The MyISAM format supports large table files (up to 256TB in size) and allows indexing of BLOB and TEXT columns. Tables and table indexes can be compressed to save space, a feature that comes in handy when storing large BLOB or TEXT fields. VARCHAR fields can either be constrained to a specific length or adjusted dynamically as per the data within them, and the format supports searching for records using any key prefix, as well as using the entire key.

Because MyISAM tables are optimized for MySQL, it’s no surprise that the developers added a fair amount of intelligence to them. MyISAM tables can be either fixed-length or dynamic-length. MySQL automatically checks MyISAM tables for corruption on startup and can even repair them in case of errors. Table data and table index files can be stored in different locations, or even on different file systems. And intelligent defragmentation logic keeps performance high, even for tables with a large number of inserts, updates, and deletions. Large MyISAM tables can also be compressed, or “packed,” into smaller read-only tables that take up less disk space, with MySQL’s myisampack utility.
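
As a rough sketch of these features, the statements below create a MyISAM table with its data and index files in separate locations and then run the manual check, repair, and defragmentation commands. The table name, columns, and directory paths are hypothetical, and the DATA DIRECTORY and INDEX DIRECTORY options only take effect on platforms where MySQL has symbolic-link support.

```sql
-- Hypothetical table; both directory paths are placeholders.
CREATE TABLE articles (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
    title VARCHAR(200) NOT NULL,
    body  TEXT,
    PRIMARY KEY (id),
    KEY idx_body (body(100))      -- prefix index on a TEXT column
) ENGINE=MyISAM
  DATA DIRECTORY='/data/mysql/articles'
  INDEX DIRECTORY='/index/mysql/articles';

CHECK TABLE articles;     -- look for corruption
REPAIR TABLE articles;    -- attempt a repair if CHECK reports errors
OPTIMIZE TABLE articles;  -- defragment after many updates and deletes
```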

The InnoDB Storage Engine

The InnoDB storage engine has been a part of MySQL since MySQL 4.0. InnoDB is a fully ACID-compliant and efficient table format that provides full support for transactions in MySQL without compromising speed or performance. Fine-grained (row- and table-level) locks improve the fidelity of MySQL transactions, and InnoDB also supports nonlocking reads and multiversioning (features previously only available in the Oracle RDBMS). InnoDB tables can grow up to 64TB in size.

Asynchronous I/O and a sequential read-ahead buffer improve data retrieval speed, and a “buddy algorithm” and Oracle-type tablespaces result in optimized file and memory management. InnoDB also supports automatic creation of hash indexes in memory on an as-needed basis to improve performance, and it uses buffering to improve the reliability and speed of database operations. As a result, InnoDB tables match (and, sometimes, exceed) the performance of MyISAM tables.

InnoDB tables are fully portable between different OSs and architectures, and, because of their transactional nature, they’re always in a consistent state (MySQL makes them even more robust by checking them for corruption and repairing them on startup). Support for foreign keys and commit, rollback, and roll-forward operations complete the picture, making this one of the most full-featured table formats available in MySQL.
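
A minimal sketch of what transactions and foreign keys look like in practice follows. The customers and accounts tables, and the rows they are assumed to contain, are made up for illustration.

```sql
-- Hypothetical parent and child tables linked by a foreign key.
CREATE TABLE customers (
    id   INT UNSIGNED NOT NULL PRIMARY KEY,
    name VARCHAR(100) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE accounts (
    id          INT UNSIGNED  NOT NULL PRIMARY KEY,
    customer_id INT UNSIGNED  NOT NULL,
    balance     DECIMAL(10,2) NOT NULL,
    FOREIGN KEY (customer_id) REFERENCES customers (id)
) ENGINE=InnoDB;

-- Move funds between two accounts as one atomic unit.
START TRANSACTION;
UPDATE accounts SET balance = balance - 100.00 WHERE id = 1;
UPDATE accounts SET balance = balance + 100.00 WHERE id = 2;
COMMIT;   -- issue ROLLBACK instead to discard both updates
```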

The Archive Storage Engine

The Archive storage engine provides a way to store large recordsets that see infrequent reads in a smaller, compressed format. The key feature of this storage engine is its ability to compress records as they are inserted and decompress them as they are retrieved using the zlib library. These tables are ideally suited for storage of historical data, typically to meet auditing or compliance norms.

Given that this storage engine is not designed for frequent reads, it lacks many of the bells and whistles of the InnoDB and MyISAM engines: Archive tables only support INSERT and SELECT operations, do not allow indexes (and, therefore, perform full table scans during reads), ignore BLOB fields in read operations, and, by virtue of their on-the-fly compression system, necessarily display lower performance. That said, Archive tables are still superior to packed MyISAM tables because they support both read and write operations and produce a smaller disk footprint.
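
The following sketch shows the insert-and-select pattern described above. The audit_log table and its columns are hypothetical.

```sql
-- Hypothetical audit trail; note that no indexes are defined.
CREATE TABLE audit_log (
    logged_at DATETIME     NOT NULL,
    username  VARCHAR(50)  NOT NULL,
    action    VARCHAR(255) NOT NULL
) ENGINE=ARCHIVE;

-- Rows are compressed as they are written...
INSERT INTO audit_log VALUES (NOW(), 'jdoe', 'login');

-- ...and decompressed on read, which scans the whole table.
SELECT * FROM audit_log WHERE username = 'jdoe';

-- UPDATE and DELETE statements against this table are rejected.
```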

The Federated Storage Engine

The Federated storage engine implements a “stub” table that merely contains a table definition; this table definition is mirrored on a remote MySQL server, which also holds the table data. A Federated table itself contains no data; rather, it is accompanied by connection parameters that tell MySQL where to look for the actual table records. Federated tables thus make it possible to access MySQL tables on a remote server from a local server without the need for replication or clustering.

Federated “stub” tables can point to source tables that use any of MySQL’s standard storage engines, including InnoDB and MyISAM. However, in and of themselves, they are fairly limited; they lack transactional support and indexes, cannot use MySQL’s query cache, and are less than impressive performance-wise.
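
A stub table might be declared along the following lines. The host name, credentials, database, and table names in the connection string are all placeholders, the column list is assumed to match the remote table, and the FEDERATED engine may need to be enabled on the local server before this works.

```sql
-- Local stub; the rows live in sales.orders on the remote server.
CREATE TABLE orders_remote (
    id     INT UNSIGNED  NOT NULL,
    amount DECIMAL(10,2) NOT NULL
) ENGINE=FEDERATED
  CONNECTION='mysql://app_user:app_pass@remote.example.com:3306/sales/orders';

-- Queries against the stub are forwarded to the remote server.
SELECT COUNT(*) FROM orders_remote;
```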

The Memory Storage Engine

The Memory storage engine, as the name suggests, implements in-memory tables that use hash indexes, making them at least 30 percent faster than regular MyISAM tables. They are accessed and used in exactly the same manner as regular MyISAM or ISAM tables. However, the data stored within them is available only for the lifetime of the MySQL server and is erased if the MySQL server crashes or shuts down. Although these tables can offer a performance benefit, their temporary nature makes them unsuitable for uses more sophisticated than temporary data storage and management.
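
As an illustration, a Memory table with an explicit hash index could be created as shown below. The session_cache table is a hypothetical example.

```sql
-- Hypothetical lookup table held entirely in RAM.
CREATE TABLE session_cache (
    session_id CHAR(32)     NOT NULL,
    payload    VARCHAR(255) NOT NULL,
    PRIMARY KEY USING HASH (session_id)
) ENGINE=MEMORY;

INSERT INTO session_cache VALUES ('0123456789abcdef0123456789abcdef', 'cart=3');

-- Equality lookups can use the hash index.
SELECT payload
FROM session_cache
WHERE session_id = '0123456789abcdef0123456789abcdef';
```

The table definition survives a server restart, but the rows do not.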

The CSV Storage Engine

The CSV storage engine provides a convenient way to merge the portability of text files with the power of SQL queries. CSV tables are essentially plain ASCII files, with commas separating each field of a record. This format is easily understood by non-SQL applications, such as Microsoft Excel, and thus allows data to be easily transferred between SQL and non-SQL environments. A fairly obvious limitation, however, is that CSV tables don’t support indexing and SELECT operations must, therefore, perform a full table scan, with the attendant impact on performance. CSV tables also don’t support NULL values, so every column must be declared NOT NULL.
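
A CSV table can be sketched as follows. The export_contacts table is hypothetical; the point to notice is that every column carries NOT NULL and no index is defined.

```sql
-- Hypothetical export table backed by a comma-separated text file.
CREATE TABLE export_contacts (
    id    INT          NOT NULL,
    name  VARCHAR(100) NOT NULL,
    email VARCHAR(100) NOT NULL
) ENGINE=CSV;

INSERT INTO export_contacts VALUES (1, 'Jane Doe', 'jane@example.com');

-- Reads always scan the whole file.
SELECT * FROM export_contacts;
```

The rows end up in a plain text file in the database directory, which other applications can open directly.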

The MERGE Storage Engine

A MERGE table is a virtual table created by combining multiple MyISAM tables into a single table. Such a combination of tables is only possible if the tables involved have completely identical table structures. Any difference in field types or indexes won’t permit a successful union. A MERGE table uses the indexes of its component tables and doesn’t maintain any indexes of its own, which can improve its speed in certain situations. MERGE tables permit SELECT, DELETE, and UPDATE operations, and can come in handy when you need to pull together data from different tables or to speed up performance in joins or searches between a series of tables.
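
To make the requirement for identical structures concrete, here is a small sketch that combines two yearly log tables. All table and column names are hypothetical, and the INSERT_METHOD option is an extra detail not discussed above: it tells MySQL which underlying table should receive any rows inserted through the MERGE table.

```sql
-- Two identically structured MyISAM tables.
CREATE TABLE log_2023 (
    id      INT NOT NULL,
    message VARCHAR(100),
    INDEX (id)
) ENGINE=MyISAM;

CREATE TABLE log_2024 LIKE log_2023;   -- copies columns, indexes, and engine

-- The MERGE table repeats the same definition and names its members.
CREATE TABLE log_all (
    id      INT NOT NULL,
    message VARCHAR(100),
    INDEX (id)
) ENGINE=MERGE UNION=(log_2023, log_2024) INSERT_METHOD=LAST;

-- One query now spans both underlying tables.
SELECT COUNT(*) FROM log_all;
```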

The ISAM Storage Engine

ISAM tables are similar to MyISAM tables, although they lack many of the performance enhancements of the MyISAM format and, therefore, don’t offer the optimization and performance efficiency of that type. Because ISAM indexes cannot be compressed, they use fewer system resources than their MyISAM counterparts. ISAM indexes also require more disk space, however, which can be a problem in small-footprint environments.

Like MyISAM, ISAM tables can be either fixed-length or dynamic-length, though maximum key lengths are smaller with the ISAM format. The format cannot handle tables greater than 4GB, and the tables aren’t immediately portable across different platforms. In addition, the ISAM table format is more prone to fragmentation, which can reduce query speed, and has limited support for data/index compression.

NOTE

MySQL versions prior to MySQL 5.1 included the ISAM storage engine primarily for compatibility with legacy tables. This storage engine is no longer supported as of MySQL 5.1.

The NDB Storage Engine

The NDB storage engine implements a high-availability in-memory table type designed only for use in clustered MySQL server environments. The NDB format supports large table files (up to 384EB in size), variable-length fields, and replication. However, NDB tables don’t support foreign keys, savepoints, or statement-based replication, and limit the number of fields and indexes per table to 128.

NOTE

A new addition to MySQL is the Blackhole storage engine. As you might guess from the name, this is MySQL’s equivalent of a bit bucket: Any data entered into a Blackhole table immediately disappears, never to be seen again. This storage engine isn’t just the MySQL development team’s idea of a joke, however—it does have some utility as a “cheap” SQL syntax verification tool, a statement logger, or a replication filter.
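
The bit-bucket behavior is easy to see with a throwaway table such as the hypothetical sink below.

```sql
-- Hypothetical table; anything written to it is discarded.
CREATE TABLE sink (
    id   INT NOT NULL,
    note VARCHAR(50)
) ENGINE=BLACKHOLE;

INSERT INTO sink VALUES (1, 'discarded');   -- accepted, but not stored

SELECT * FROM sink;                         -- returns an empty set
```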

Can I Define How Much Memory a Memory Table Can Use?

Yes, the size of Memory tables can be limited by setting a value for the ‘max_heap_table_size’ server variable.
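
For example, the limit could be raised to 64MB for the current session as sketched below. The figure is arbitrary, and the new limit applies to Memory tables created after the change rather than to ones that already exist.

```sql
-- Allow Memory tables created in this session to grow to 64MB.
SET SESSION max_heap_table_size = 64 * 1024 * 1024;

-- Confirm the value now in effect.
SHOW VARIABLES LIKE 'max_heap_table_size';
```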

What Is a Temporary Table? Is It the Same as a Table Created with the Memory Storage Engine?

No. Memory tables, which are created by adding the ENGINE=MEMORY modifier to a CREATE TABLE statement, remain extant during the lifetime of the server. They are destroyed once the server process is terminated; however, while extant, they are visible to all connecting clients.

Temporary tables, which are initialized with the CREATE TEMPORARY TABLE statement, are a different kettle of fish. These tables are client-specific and remain in existence only for the duration of a single client session. They can use any of MySQL’s supported storage engines, but they are automatically deleted when the client that created them closes its connection with the MySQL server. As such, they come in handy for transient, session-based data storage or calculations. And, because they’re session-dependent, two different client sessions can use the same table name without conflicting.
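
A short sketch of a session-scoped table follows. The recent_orders name and its columns are hypothetical, and the ENGINE clause is optional; it is included here only to show that a temporary table can also be a Memory table.

```sql
-- Visible only to the session that creates it.
CREATE TEMPORARY TABLE recent_orders (
    order_id INT           NOT NULL,
    total    DECIMAL(10,2) NOT NULL
) ENGINE=MEMORY;

INSERT INTO recent_orders VALUES (1001, 49.95), (1002, 120.00);

SELECT SUM(total) FROM recent_orders;

-- Dropped automatically when the connection closes,
-- or explicitly with:
DROP TEMPORARY TABLE recent_orders;
```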

Storage Engine Selection Checklist

To decide the most appropriate storage engine for a table, take into account the following factors:

  • Frequency of reads versus writes
  • Whether transactional support is needed
  • Whether foreign key support is needed
  • Indexing requirements
  • Table size and speed at which it will grow
  • OS/architecture portability
  • Future extendibility requirements and adaptability to changing data requirements

It’s worth noting, also, that MySQL lets you mix and match storage engines within a database. So you could use the MyISAM engine for tables that see frequent SELECTs and use InnoDB tables for tables that see frequent INSERTs or transactions. This ability to select storage engines on a per-table basis is unique to MySQL and plays a key role in helping it achieve its blazing performance.
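
To review which engine each table in the current database uses, and to switch one of them, statements like the following can be used. The page_hits table is a hypothetical example, and converting a large table rebuilds it, which can take a while.

```sql
-- Show the engine behind every table in the current database.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE();

-- Convert one table to a different engine.
ALTER TABLE page_hits ENGINE=InnoDB;
```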
