A new, simple way to figure out why your recursive CTE is running away

In MySQL 8.0.1, we introduced support for recursive common table expressions (CTE). There are quite a few blog entries showcasing the feature, starting from this one, and there is also a complete documentation. Today, I would like to present a solution to a problem which nearly everybody meets when writing queries with recursive CTE’s: when infinite recursion happens, how to debug ?

Consider this example query, which generates integers from 1 to 5:

This query works perfectly, here is its result:

Now, imagine that I made a little mistake when typing it (it was just a keyboard accident, typed “*” for “<“):

The now-wrong WHERE condition is translated by MySQL to “n*5 <> 0”, which is true for all rows. So the recursive algorithm generates more and more rows, does more and more iterations, until the default maximum on the number of iterations is reached, causing this error:
“ERROR 3636 (HY000): Recursive query aborted after 1001 iterations. Try increasing @@cte_max_recursion_depth to a larger value.”

In our case, increasing this variable wouldn’t improve anything. The solution is rather to review the query, carefully, until we spot the typing mistake.

While this was a toy example, there are other reasons why a CTE could recurse forever: the query could be very complex, and we made a logical error; or the data set could be a malformed hierarchy and contain an unexpected circular loop. Or, everything was correct at development time, but, after some time, the data model evolved, and circular loops, which were originally impossible, are now perfectly possible, and the query needs some fixing to not choke on them.

To give an example of this, I will reuse the scenario of rockets. I copy it again here.

In year 2300, the Earth is overcrowded and people are encouraged to emigrate to nearby planets by boarding on these space rockets:

Note that the rulers of the Earth have intentionally not set up any way to return from those planets back to the Earth. Let’s list all destinations which can be reached from the Earth: we first find the planets which can be directly reached, then we find what can be directly reached from these first planets, and so on:

Now it is year 2400 and the population of the Earth has decreased so much that the rulers decide to bring some emigrates back, so they set up a new rocket from Saturn to Earth:

Let’s repeat our query to list all destinations which can be reached from the Earth:

Data has evolved, long after the development of the query, which now needs some fixing. And, of course, the developer who has to fix it does not know that the new Saturn-to-Earth rocket is the cause. He’s looking for ways to discover it.

So, whatever the reason, all you have is error 3636 above, and what do you do now?
If the query has tens of lines, how do you get an idea of where the logical error is?
If the data can legitimately present a cycle, in the end you will likely modify your query, so that it does cycle detection and avoidance, as explained here. But before doing such modification, you would likely prefer to first understand how the cycle forms, which table columns are involved… Call that “curiosity” or “paranoia” 🙂

To this aim, what can MySQL do to help us debug the problem?

Initially, it couldn’t do much, at least in an easy-to-use way. But, starting from version 8.0.19, I made it allow any recursive CTE to contain a LIMIT clause. So the recursive algorithm will start its work, run iterations as usual, accumulate rows, and stop when the count of these rows exceeds the LIMIT; then the CTE will be considered complete, and no error will be emitted.
So, the first thing we now have to do when we meet error 3636 , is to add some LIMIT (10 first rows? 100 first rows? we can just try). Then get the query’s output; and by looking at these rows, we can start understanding how they are generated and why too many of them are : debugging becomes possible.

In our case:

First Mars, then Jupiter, Saturn, Earth, Mars! ok we see our cycle seems to form after 5 rows; it becomes clearer by adding, near each reached planet, the previous planet on the path:

That’s it! There’s a Saturn-to-Earth rocket, which closes the cycle – crystal clear! The debugging part of the work is done (later we can fix the query, by making it avoid cycles, as explained here).

In conclusion of this post, while LIMIT-in-CTE is perhaps not a feature that will change the face of SQL1, I believe it can save time for nearly everybody who manipulates recursive CTEs in MySQL, which is already a nice thing 🙂

And, as always, thank you for choosing MySQL.

 

1. Which is perfectly fine! I can wait one more year before becoming famous.

Featured image by bruce mars from Pexels

Leave a Reply