Proposal to Change the Behavior of JSON_MERGE

In MySQL 5.7, one of the most popular new features is the introduction of JSON:

  1. The JSON data type
  2. A set of 20 functions to search, manipulate and create JSON documents
  3. Virtual Columns for JSON indexing

In MySQL 8.0 we plan to extend this functionality (for example: with the addition of JSON aggregate functions), but we are also considering changing the behaviour of one of the existing functions.

The JSON_MERGE Function

In MySQL 5.7 the JSON merge function has the following semantics when two objects are merged with overlapping values:

That is to say that 'a' was converted to an array with both values present.

We have received feedback from a number of users that a merge function would be more useful if it were to instead use precedence of last value wins:

This is consistent with several scripting languages. For example in PHP there is an array_merge function where the last value takes precedence. The JSON_MERGE function in MySQL is more similar to PHP’s array_merge_recursive:

Duplicate (key) Names

On a related point, the JavaScript Object Notation (JSON) Data Interchange Format (RFC7159) says that:

The names within an object SHOULD be unique.

That is to say that duplicates are not supposed to happen (but not that duplicates must not be present). A lot is left up to the implementation on how to handle the non-unique names. In MySQL 5.7 the current behavior is to use the first key:

To use PHP as an example again, it will use the last key:

Proposal to Change Behavior

We are considering redefining the behavior of JSON_MERGE to be more consistent with the expected behaviour, while retaining the current functionality under a different function name (i.e. JSON_MERGE_ARRAY or JSON_MERGE_RECURSIVE).  In addition, we are also considering the behavior of duplicate key names.

Redefining functionality makes upgrades harder, so it is a decision that we would like external feedback on before moving forward on.

Please leave a comment, or get in touch!


6 thoughts on “Proposal to Change the Behavior of JSON_MERGE

  1. Yes, fix it now, before another major release. That way the number of people affected will be minimized.

    I want to ask if there’s a way to preserve the old behavior for anyone who has started to depend on it. But I suspect that the number of people who are using json_merge() is pretty small as yet, and most of those people who have started to use it would actually prefer the new proposed behavior.

  2. I’m strongly in favor of changing MySQL’s JSON semantics so that the last value wins, both in the case of JSON_MERGE and in other cases where duplicate keys are encountered. Not only does this more closely align with PHP’s behavior as you pointed out, it also aligns with JavaScript’s behavior, which I think is particularly important given that JSON’s semantics are derived from JavaScript’s.

    Incidentally, the reason the JSON spec says key names “SHOULD” be unique (and not “MUST” be unique) is because the spec language was written after there were already many JSON parser implementations that departed from the JavaScript behavior from which JSON was derived, so the spec had to tread carefully to avoid declaring those existing parsers invalid. In reality, I think the best for parsers follow JavaScript semantics and allow the last-defined key to take precedence when there are duplicate keys.

  3. As a JS developer the existing behaviour came as a surprise, so I’d welcome this proposal. For the moment I just have to work with what is there. Does anyone know of workarounds?

Leave a Reply