Tips & Tidbits

RDS MySQL Performance Tuning - Speeding up slow inserts

If you're running a write heavy application with a MySQL database you might notice a fairly significant slowdown as the write load on the server increases. In our case, we're running a Laravel based event sourced service which is writing incoming events to read tables. Under heavy write load, simple inserts and updates with small payloads were pushing the 250ms mark. Here's how you can speed up those slow writes.

Use a server backed by performant SSD storage

MySQL (with default settings) writes the log buffer to disk after every transaction. Under high load, this causes a ton of small writes which are particularly slow on traditional platter based hard drives.

In our case, we had undersized the Amazon RDS instance disk size, which has a direct impact on write performance (higher capacity means more provisioned IOPS/write bandwidth). Increasing the disk size dropped out write average down about 50ms.

TL;DR: Make sure the disk for your server is as fast as possible

Increase the log file size limit

The default innodb_log_file_size limit is set to just 128M, which isn't great for insert heavy environments. Increasing this to something larger, like 500M will reduce log flushes (which are slow, as you're writing to the disk). This is particularly important if you're inserting large payloads.

In our case, bumping the log file size didn't actually help us much, since we had many smaller reads (not enough to overflow the log file frequently).

Defer disk writes by flushing the log less frequently

By default MySQL will write the log to disk after every single transaction. In our case, it wasn't possible to wrap batches of inserts in transactions, so every single insert query was causing a disk write.

You can instead write the log to disk at an interval by setting innodb_flush_log_at_trx_commit to 2. The main issue with setting this away from the default of 1 is possible data loss. While you're still protected from MySQL crashes causing data loss, the entire server loosing power would mean potentially losing data. For example, if you have the flush interval set to the default of 1, you could lose a seconds worth of writes upon failure of the server.

By setting innodb_flush_log_at_trx_commit to 2, we were able to drop the insert time average from over 200ms to under 50ms. While this is a pretty massive improvement, it's possible to squeeze even more speed if you are willing to risk a few more seconds of potential data loss.

You can increase the flush interval by setting innodb_flush_log_at_timeout to your desired interval (in seconds). In our case, we went with 5 seconds, which resulted in the insert time average dropping under 5ms! A massive difference from the original 250ms, and in our case worth the risk of 5 seconds of potential data loss.