Although MySQL Query Cache was meant to improve performance, it has serious scalability issues and it can easily become a severe bottleneck.
This is indeed something we have observed in the MySQL team for a while. Before we get to the subject of today’s post, let me start with an introduction.
Introduction to Query Cache
The MySQL query cache is a query results cache. It compares incoming queries that start with SEL to a hash table, and if there is a match returns the results from the previous execution of the query. There are some restrictions:
- The query must match byte-for-byte (the query cache avoids parsing)
- Use of non-deterministic features will result in the query not being cached (including temporary tables, user variables, RAND(), NOW() and UDFs.)
- The query cache was designed to not serve stale results. Any modification to the underlying table(s) results in all cache being invalidated for those tables.
- There are some restrictions on if the cache can be used for InnoDB (to respect MVCC; as you have a transaction open, the ‘cache’ might not represent the data in your expected view.)
The Best Case Scenario
As I wrote on my personal blog some years ago:
The ideal scenario for the query cache tends to be largely read-only, where there are a number of very expensive queries which examine millions of rows only to return a few. A hypothetical example might be a complex query to build a list of values for a drop-down list that always appears on a webpage form. In a situation like this, the query cache can mask performance problems caused by missing indexes, which makes it helpful for novice users.
This comment still yields true today, but I think it is important to also point out that DBA tools for poor-application-intervention have also improved:
- In the MySQL server, we now have the ability to rewrite queries to insert hints (or other modifications to improve performance)
- We have third-party tools like ProxySQL, which can act as a man-in-the-middle query cache. ProxySQL also supports a TTL for cache, which works fine in the example I provided earlier (build a list of values for a drop-down list).
Limitations of the Query Cache
The query cache has been disabled-by-default since MySQL 5.6 (2013) as it is known to not scale with high-throughput workloads on multi-core machines. Rene confirmed this in his post yesterday, but it has also previously been mentioned by Stewart Smith, Domas Mituzas (update: and Kristian Koehntopp).
Assuming that scalability could be improved, the limiting factor of the query cache is that since only queries that hit the cache will see improvement; it is unlikely to improve predictability of performance. For user facing systems, reducing the variability of performance is often more important than improving peak throughput:
Decision to Remove Support for the Query Cache
We concur with the research performed by Jiamin Huang, Barzan Mozafari, Grant Schoenebeck, Thomas F. Wenisch at the University of Michigan, Ann Arbor. We considered what improvements we could make to query cache versus optimizations that we could make which provide improvements to all workloads.
While these choices themselves are orthogonal, engineering resources are finite. That is to say that we are shifting strategy to invest in improvements that are more generally applicable to all workloads.
We also agree with Rene’s conclusion, that caching provides the greatest benefit when it is moved closer to the client:
Upgrade Path for Existing Users
With the current limitations noted, the query cache will continue to be supported for the life time of MySQL 5.7. MySQL 8.0 will not support query cache, and users upgrading will be encouraged to use either Server-side Query Rewrite or ProxySQL as a man-in-the-middle cache.
We expect this change to only affect a small number of users, but if this concerns you, please reach out and get in touch!
Thank you for using MySQL!
Update: This post is being discussed on HackerNews