One of the things we have been working on for MySQL 8, is speeding up scans that read multiple records from a table or an index. Such scans benefit from reading the records in batches, so that they don’t get the overhead of latching data pages and navigating in the B-tree for every single record that is read. When the MySQL server asks the storage engine for a record, the storage engine may therefore choose to prefetch a batch of records into a buffer, so that it can quickly return a record from the buffer when the server asks for the next record.
In previous versions of MySQL, the storage engine has no knowledge about how likely it is that the server asks for more rows in the scan, so it had to guess if it is worthwhile to prefetch a batch of records.
In the case of InnoDB, scans started by reading and returning a single record at a time for the first three records. When the MySQL server requested the fourth record, InnoDB assumed that the scan would read even more records, and thus it would allocate and fill a small prefetch buffer to speed up subsequent requests.
Our testing showed that while this prefetch buffer does indeed speed up scans, it could be improved:
- The size of the prefetch buffer was limited to 8 records. Large scans often benefit from a larger prefetch buffer.
- If the storage engine knew up front that multiple records would be fetched, it could have allocated the buffer at the beginning of the scan. This eliminates the need to read the first four records one by one.
Improvements in MySQL 8.0
In MySQL 8, we have improved the prefetching by taking the query optimizer’s estimates into account when allocating the buffer (WL#7093 – Optimizer provides InnoDB with a bigger buffer). Before starting a multi-record scan, the query executor will ask the storage engine whether it can make use of a buffer for fetching batches of records. If the storage engine says it can, the query executor allocates a buffer and gives it to the storage engine. The query executor has access to the optimizer’s estimates, and it uses the estimates to determine what is a reasonable buffer size. If the optimizer has estimated that a large number of records will be retrieved in the scan, a larger buffer can be allocated, which could greatly reduce the number of times latches need to be acquired on the B-tree pages. At the moment, InnoDB is the only storage engine that takes advantage of this buffer, but it is a generic optimization that could be used by other storage engines.
So, what kind of improvements can we expect to see in MySQL 8? To find out, I used SysBench to compare the performance of MySQL Server 5.7.14 and MySQL Server 8.0.0 DMR. I chose to run the OLTP simple ranges query, as it seemed like a good candidate to benefit from these changes. The query selects multiple records whose primary keys are in a certain range (
SELECT c FROM t WHERE id BETWEEN <start> AND <end>). I built a test table with ten million records in it, and ran the test on a 16-core Intel Xeon 2.90GHz with 128GB RAM running Oracle Linux 6.5. The MySQL server instances were configured with innodb_buffer_pool_size=16G, so that the entire database resided in main memory.
The charts below summarize the the results for various combinations of range sizes and number of concurrent threads that access the database. The first chart shows the single-threaded performance, and the second one shows the multi-threaded performance.
For small ranges the performance is essentially the same on 5.7.14 and 8.0.0. When selecting larger ranges, however, there seems to be a good performance improvement with MySQL 8.0.0 DMR compared to MySQL 5.7.14. In this test, when selecting ranges of 1000 rows or 10 000 rows, MySQL 8.0.0 DMR performed between 7% and 10% better than MySQL 5.7.14.
We also see good performance improvements with MySQL 8 in other benchmarks. For example, the chart below shows results from the DBT-3 benchmark. Many of the DBT-3 queries get a nice speedup. For some of them, the execution time is reduced by more than 15%.
This optimization applies on InnoDB tables by default starting from MySQL 8.0. You do not need to enable any settings or run any commands to turn it on, and it makes scans faster (index scan, range scan, table scan). It also applies to both single threaded, and multi-threaded query execution.
Please try it out, and let me know your results!