In this article I will explain the partial update optimizations for smaller (LOBs) in InnoDB. Small here qualifies the size of the modification and not the size of the LOB. For some background information about the partial update feature, kindly go through our previous posts on this (here, here and here).
Opportunity for Optimization
Within InnoDB, the LOB is stored in linked list of LOB data pages. Each of these LOB data pages contains one entry in the LOB index. When a partial update of a LOB happens in a transaction, at a minimum one LOB index entry will be modified. This means that at a minimum of one LOB data page needs to be copied. Suppose that only a few bytes are modified in an LOB, then an entire LOB page needs to be copied and modified because of this. We identified this as an opportunity to optimize.
Undo Logging Modifications to LOB
Normally the modifications to a LOB are not recorded in the undo log because of its size. In the case of full update, both the old and the new value of the LOB are stored as two separate LOBs in the user tablespace itself. In the case of partial update, the old and the new value of the LOB are stored together as a single LOB in the user tablespace. Because of this the update vector for LOBs are not stored in the undo log.
We decided to focus our optimization efforts specifically on the small modifications. Hence it made sense to consider undo logging of modifications to a LOB.
By small modification we mean 100 bytes per LOB. Refer to the constant lob::ref_t::LOB_SMALL_CHANGE_THRESHOLD for more information.
/** If the total number of bytes modified in an LOB, in an update
operation, is less than or equal to this threshold LOB_SMALL_CHANGE_THRESHOLD,
then it is considered as a small change. For small changes to LOB,
the changes are undo logged like any other update operation. */
static const ulint LOB_SMALL_CHANGE_THRESHOLD = 100;
Note that this threshold is for one LOB. One record can have multiple LOBs and many of them can be partially updated at the same time. This threshold was a design decision that I had to make. It might not be an ideal value, but hopefully we can fine tune it later on, if such a need arises. If you have any suggestions from the field on how to optimally tune this value then I’m very interested to in your feedback.
Format Change in Modify Undo Log
The structure of the undo log for update is documented by this diagram. In our optimization we now also include the update vector of LOBs in this undo log record. The following diagram marks (in red) the changes done to the undo log record format.
Note: Refer to the function trx_undo_page_report_modify() to study the way undo log for update is written.
The changes introduced are as follows:
- There was only one bit available in the type_cmpl flag. This bit is now named as TRX_UNDO_MODIFY_BLOB. If this bit is set, then it means that the undo log record format is now capable of representing small updates to LOBs.
- When the TRX_UNDO_MODIFY_BLOB bit is set, then a new 1-byte flag has been introduced for future extension. This new flag is currently unused and all the 8 bits are unset.
- When the old values in the update vector are logged in the undo log, it is checked if the field data is an LOB. If yes, then extra information is stored. We will see the kind of information stored in the next section.
Undo Information for Large Objects (LOB)
In the undo log record, additional information about changes to LOBs are added. The following diagram provides details about what kind of information is being stored.
In this diagram, only the first three fields are already existing. Everything else is additional information that is being stored. While updating a single LOB field, there can be an array of LOB deltas that needs to be applied. The undo log record stores such an array for a single LOB field.
The additional information stored in the undo log record will be used during MVCC, transaction rollback as well as during purge operation. With this in mind, both the changes to data and the meta data (the LOB index entry modificaitons) are being stored.
Contributes to Performance Improvement
The fundamental idea of this optimization was to make use of existing infrastructure for undo logging small changes to LOB. This optimization provides improvements to both space and time performance. The results are presented here.
The black line was the baseline and the red line is the performance gain from this feature. Both the baseline and red line involves partial update, but the red line does this optimization.
|Threads||Average TPS with the baseline code||Average TPS with Partial Update of BLOBs feature||Percentage Improvement|
The gain is seen from 88.28% to 287.45%. The sysbench was modified to use a bigger JSON document and the UPDATE statements were made to use JSON_SET() and JSON_REPLACE() whenever possible. The update involved changing only an integer within the JSON document.
The format of the undo log record for update operation has been changed to include the update vector of LOBs. The update vector of LOBs are included in the undo log record only for small changes to LOBs. A threshold has been defined to determine whether an update operation is small or not.
The additional information stored in the undo log record is used for three purposes – transaction rollback, MVCC and LOB purge operation. All necessary information to carry out these operations are stored in the undo log record. To keep this article short, I haven’t explained how this additional information is used in MVCC or rollback or purge. Probably they deserve a separate article. The main take away is to highlight that the undo log format is changed for performance improvement.
Thank you for using MySQL !