A Multi-Table Trick to Speed up Single-Table UPDATE/DELETE Statements

In MySQL, query optimization of single-table UPDATE/DELETE statements is more limited than for SELECT statements. I guess the main reason for this is to limit the optimizer overhead for very simple statements. However, this also means that optimization opportunities are sometimes missed for more complex UPDATE/DELETE statements.

Example

Using the DBT-3 database, the following SQL statement will increase prices by 10% on parts from suppliers in the specified country:

Visual EXPLAIN in MySQL Workbench shows that the optimizer will choose the following execution plan for this UPDATE statement:

update-subquery

That is, for every row in the part table, MySQL will check if this part is supplied by a supplier of the requested nationality.

Consider the following similar SELECT statement:

In MySQL 5.6, the query optimizer will apply semi-join transformation to this query. Hence, the execution plan is quite different from the similar UPDATE statement:

select-semijoin

As you can see, there is no sub-query in this plan. The query has been transformed into a three-way join. The great advantage of this semi-join transformation is that the optimizer is now free to re-arrange the order of the tables to be joined. Instead of having to go through all 179,000 parts, it will now start with the estimated 414 suppliers from the given country and find all parts supplied by them. This is obviously more efficient, and it would be good if MySQL would use the same approach for the UPDATE statement.

The Multi-Table Trick

Unlike single-table UPDATE statements, the MySQL Optimizer will use all available optimizations for multi-table UPDATE statements. This means that by rewriting the query as follows, the semi-join optimizations will apply:

Notice the extra dummy table in the first line. Here is what happens when I execute the single-table and multi-table variants on a DBT-3 database (scale factor 1):

As you can see, execution time is reduced from 2.63 seconds to 0.40 seconds by using this trick. (I had executed both statements several times before, so the reported execution times are for a steady state with all accessed data in memory.)

Multi-Table DELETE

The same trick can be used for DELETE statements. Instead of the single-table variant,

you can use the equivalent multi-table variant:

This rewrite gives a similar performance improvement as reported for the above UPDATE statement.

6 thoughts on “A Multi-Table Trick to Speed up Single-Table UPDATE/DELETE Statements

    1. Hi Rick,

      AFAIU, single-table UPDATE/DELETE has always had a different code path for optimizations compared to SELECT. The main reason for this was probably that many of the optimizations for SELECT are not relevant for single-table UPDATE/DELETE (e.g., join order, aggregation). When multi-table UPDATE/DELETE was later introduced, more of the SELECT optimizations were relevant (e.g., join order), and the implementation was built on top of the SELECT optimization.

      Sinced the code for single-table and multi-table is different, there is always a risk that the implementations diverge and show different behavior. For example, I have observed that index-merge will be used for multi-table DELETE in cases where it is not used for single-table DELETE.

      You are right that the benefit of semi-join optimization as described in this blog post is a new feature in MySQL 5.6. If single-table statements were to benefit from this optimization, it would require converting the single-table statement into a multi-table statement. That is not easy to achieve in the current MySQL code. Hopefully, we will be able to refactor this code some time in the future so that all UPDATE/DELETE statements will share the same implementation.

  1. Hi Øystein,

    I still don’t understand how the dummy table is equivalent to the three way to join.

    I also don’t understand how you did the multi-table delete. Can you really issue this mysql statement: DELETE tablename FROM tablename?

    Thanks!

    1. Hi Fadi,

      The role of the dummy table is to change the statement so that the SQL parser recognize it as a multi-table update. That way, we get a different code path with opportunities for subquery optimizations. It is the semi-join optimization (new in MySQL 5.6) that converts the subquery into a three-way join.

      The syntax for multi-table delete (see link in blog post) is either “DELETE tables to delete rows from FROM tables to join …” or “DELETE FROM tables to delete rows from USING tables to join …” Since “DELETE tablename FROM tablename” matches the multi-table syntax and not the single-table syntax, this statement is handle as a multi-table delete.

Leave a Reply

Your email address will not be published. Required fields are marked *


nine − 4 =

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">