JSON Labs Release: JSON Functions, Part 2 — Querying JSON Data

The MySQL 5.7.7 JSON Lab release introduces a native JSON datatype. In part 1 of this blog post series, Rick Hillegas introduced the new functions for creating and manipulating JSON documents using the new native JSON data type. In this blog post we will be using some of the same sample tables and JSON documents as in part 1, so it will be helpful to read that blog post now, if you haven’t already.

Here we look closer at the functions provided to search through and peer into JSON documents to find values inside of them, and we’ll also cover a few utility functions as we go along. The functions described here are also summarized at the end of this blog for ease of reference.

Please note that the exact set of functions and their semantics may evolve before the final 5.7 GA release, so your feedback will be very valuable as we settle upon the best set of functions!

At the end of part 1, we have a set of thermostat readings inside our JSON column within the ‘THERMOSTAT_READING’ table. Here are the contents of that table where we left off in part 1:

We can inquire what the length of the documents are, that is, how many elements they have, using the JSN_LENGTH() function:

There is also a companion function called JSN_DEPTH() that can be used to inquire about the maximum depth in a document:

In this case it reports 2 since the JSON document consists of a JSON object with only scalar values, that is has no nested objects. The depth value of 1 is reserved for a JSON scalar value.

Now, in part 1 we modified the set of reading data, by adding and removing keys. We might also want to verify or check the set of keys in the readings we have by using the the JSN_KEYS() function. This will return rows of JSON arrays containing the keys of our thermostat reading objects:

Now let’s look at another interesting function, JSN_CONTAINS_PATH(). This can be used to determine if a specific element is present in a document, for example the ‘spread’ element:

So what is the meaning of the second parameter used there: ‘all’? While not useful in this particular case, the JSN_CONTAINS_PATH() function can inquire about the presence of several elements at once, e.g. both the ‘spread’ and the the ‘on’ elements:

This time it returns 0 since we had previously removed the ‘on’ element in part 1. Passing in a value of ‘one’ instead of ‘all’ would be asking if just one of the two (OR instead of AND) elements were present. Thus would give the result of 1 in our case since they have the ‘spread’ element:

Now, we might want to inquire whether a specific element and value are present instead. For this we can use the JSN_CONTAINS() function. Let’s first construct a JSON object containing the element we are interested in using the JSN_OBJECT() constructor function in that was introduced in part 1:

In this case, we could have achieved this by simply using JSN_EXTRACT() and selecting the field value from our table instead:

ButJSN_CONTAINS() can also handle arrays and nested objects! So let’s say that we want to check if the model has the two capabilities: “fan” and “ac”. We can check that this way:

Note that the order of the capabilities is not significant. If one of the thermostat model’s features has a complex description, e.g. say the fan capability was described by a tuple:

We could then query for models that had features of “furnace” and “voltage”=110 in this way:

As you can see the query checks for the presence of a “slice” or fully contained part of the document. The contained object we searched for is a subset of both the outer array and the nested object value.

Another interesting function is the JSN_SEARCH() function. This can be used to locate scalar values inside of a document. Currently we can search for JSON string values. If found, the function will return a JSON path to the found occurrence. We can get one or all occurrences of the string depending on what we ask for (‘one’ below).

For example, let’s say we want to know where, if present, the string “furnace” can be found in the thermostat_model table, we could issue this query:

The returned values here are MySQL JSON paths into a top level array at index 1 and 3. The location of that capability differs between the two “Acme” models. The search argument can also contain SQL wild cards (default ‘%’) similar to the LIKE operator.

JSON values are typed in JSON, and MySQL has a JSN_TYPE() function to inquire about the type of a JSON value. This function returns a MySQL string. Let’s check some types of the thermostat_model data:

This makes it possible to perform an action on JSON values depending on their type, using for example a CASE test:

The current set of JSON types in MySQL is a super set of the types in the ECMA standard for JSON. We currently support NULL, DECIMAL, INTEGER, UNSIGNED INTEGER, DOUBLE, STRING, OBJECT, ARRAY, BOOLEAN, DATE, TIME, DATETIME, TIMESTAMP and OPAQUE (raw bits).

When encountering some data, presumably a JSON document, we can check if the document is a valid JSON document by using the provided JSN_VALID() function. For example, the following check yields 0 because the text uses an unquoted string value of “fan” as a value of the ‘capability’ element:

That’s also something that would be evident if we tried to parse that is a JSON document:

If we add text quotes around “fan”, however, then it will work as expected:

Finally, to wrap things up, let us look at two small but useful functions: JSN_QUOTE() and JSN_UNQUOTE(). These can provide the “glue” needed to handle the impedance mismatch between MySQL and JSON types.

When inserting data into a JSON column, we have seen that we can just embed the JSON text inside a MySQL string. As we just saw above, we can also explicitly CAST text to a JSON type. Sometimes, however, we have a MySQL string that we do not want to parse as a JSON document, but we instead want to embed it as a JSON string “as is”. Let’s compare the difference between the two JSON values inserted below to demonstrate the point:

Notice the difference here: the first value is parsed as a JSON document with structure (an array), the second is inserted as a simple JSON string scalar. Notice that the function does more than wrap text quotes around the argument: it will ensure that the resulting MySQL string is acceptable to a JSON parser by performing any needed internal quoting. Also, an embedded text quote is not an issue in an SQL string since single quotes are used as a string delimiter in SQL.

Conversely, JSN_UNQUOTE() will strip away surrounding quotes. This function can be used to turn a valid JSON string literal into an SQL string literal.

This is usually not necessary though, since MySQL tries to be helpful when doing comparisons:

Here the comparison happens in the JSON context, and the MySQL string “fred” is automatically converted to a JSON string before the comparison. We could though instead use the JSN_UNQUOTE() function to perform the comparison in the MySQL string domain in this way:

But the former is obviously easier and the latter should usually be unnecessary.

In summary, here were the functions covered:

  • JSN_TYPE()
  • JSN_KEYS()

This concludes this brief introduction to the query related JSON functions available in this lab release! More details can be found in the worklog entry: WL#7909.

We expect to further refine and polish the JSON functions before 5.7 GA, so please keep those comments coming, and please start playing with them! Your input and contributions will no doubt improve the final features. In the meantime, have fun, and please let us know what you think of these new JSON features! We’d love to hear your feedback on what else you’d like to see related to our wider JSON support. If you encounter any problems with these new features, please let us know here in the comments, open a bug report at bugs.mysql.com, or open a support ticket.

Thank you for using MySQL!

About Dag Wanvik

Dag Wanvik is a M.Sc. in computer science, NTNU, 1979. He spent half his career working on compilers, debuggers and language tools, then moved to databases and is now, with MySQL, into his third after work on Clustra and Apache Derby (committer). He has been with Sun, Oracle since 2002.

6 thoughts on “JSON Labs Release: JSON Functions, Part 2 — Querying JSON Data

  1. I jsn_array_sum() will be very helpfull – in my first use I’m trying to sumarize a customer order using $.unitPrice * $.qty, but my json have an array of itens:

    “items”: [
    “unitPrice: 1,
    “qty” 2
    }, {
    “unitPrice: 0.88,
    “qty” 1

    Than I need to creat my custom stored procedure/function to do iteration and solve this task.

    1. Yes. For now, a stored procedure is needed to do that job in the general case.
      When MySQL gets table functions, we open the way for functions to return result sets, and we can add functions like json_array_sum() as suggested.

      1. Hi,
        i Have a Json:
        {“Sexo”: “M”, “Nombre”: “Ramiro Pereyra”, “Pregunta Dos”: “RespuestaDos”, “Pregunta Uno”: “RespuestaUno”, “Pregunta Tres”: “RespuestaTres”, “Pregunta Cinco”: “RespuestaCinco”, “FechaNacimiento”: “1996-01–9”, “Pregunta Cuatro”: “RespuestaCuatro”}
        how can i know the names of the keys in separated columns?
        Sexo Nombre Pregunta Dos
        M Ramiro Pereyra Respuesta Dos

Leave a Reply

Your email address will not be published. Required fields are marked *

Please enter * Time limit is exhausted. Please reload CAPTCHA.