InnoDB Transparent PageIO Compression

We have released some code in a labs release that does compression at the InnoDB IO layer. Let me answer the most frequently asked question. It will work on any OS/File system that supports sparse files and has “punch hole” support. It is not specific to FusionIO. However, I’ve been told by the FusionIO developers that you will get two benefits from FusionIO + NVMFS, no fragmenation issues and more space savings because of a smaller file system block size. Why the block size matters I will attempt to explain next.

The high level idea is rather simple. Given a 16K page we compress it using your favorite compression algorithm and write out the only the compressed data. After writing out the data we “punch a hole” to release the unused part of the original 16K block back to the file system. Let me illustrate with an example:


D – Data
F – Free
C – Compressed

DCF are in 1K units

Write out [CCCC] and then free [FFFFFFFFFFFF] using punch hole. The obvious next questions are:

  1. What if the compression results in data that is > 16K? Yes that can happen.
  2. What is the saving if the data is compressed to say 45 bytes from 16K?

For the first case we end up wasting CPU cycles because we write out the uncompressed (or original) contents as is. Given the above example that would be:


For the second question there is some extra detail. The writes are always aligned on the file system block size. So, if the original 16K was compressed to 45 bytes and you are running on XFS with a block size of 4K the compressed data will be rounded to 4K and the punch hole will be on the remaining 12K.

The labs release works at a system wide level, this means that it will compress all InnoDB tablespaces, including the UNDO logs and the system tablespace. Currently the REDO log is not compressed, it is written out as is.

To get the most out of this feature you need to use the multiple page cleaner functionality that is part of MySQL 5.7.4. There is a new configuration variable:

–innodb-page-cleaners := 1..64

A good starting point is 2 x number of buffer pools. So, why is this parameter important? The compression is done in the page cleaner threads, the more the better.

For decompression the important parameter is –innodb-read-io-threads. When AIO is used, the labs release only works on Linux, the decompression is done in the AIO threads, again the more the better. Since this code is experimental I can’t really recommend anything specific. I’ve been testing with 32 threads, but I also use a very beefy server.

The official release has support for two compression algorithms:

  1.  LZ4
  2. ZLib

There is example code that is commented out to demonstrate how you can use LZO and LZMA. To enable these algorithms you will need to build from source and uncomment some lines from storage/innodb/cmake.innodb.

To select the algorithm for compression you can use –innodb-compression-algorithm := 0,1,2 Where:

0 – None
1 – ZLib
2 – LZ4

You can select the compression level by changing:

–innodb-compression-level := 1..9

For the labs release I’ve also introduced some new configuration variables, mainly for experimenting.

–innodb-read-async := boolean
–innodb-read-block := boolean

The above configuration parameters are dynamic and you can change the algorithm on the fly. Because it works at the page level, you can have a tablespace that has a mix of uncompressed, ZLib and LZ4 pages at the same time and it will all work effortlessly.

We like this idea because it will help us simplify the buffer pool code and that will allow us to do more optimizations in the buffer pool code more easily.

We think that there is a lot of room for improving on this idea. For example writing custom compression that exploits the page format.

3 thoughts on “InnoDB Transparent PageIO Compression

  1. How can I see if this is actually working? and if not, why it isn’t working?

    I’ve set innodb_compression_algorithm=2 (and i_c_punch_hole=ON) and I don’t notice any difference in the size of the ibd file on disk. (I did do an optimize table)

Leave a Reply to Daniël van Eeden Cancel reply

Your email address will not be published. Required fields are marked *

Please enter * Time limit is exhausted. Please reload CAPTCHA.