Taking the new MySQL 5.7 JSON features for a test drive

MySQL 5.7 introduces both a new native JSON datatype, and a set of SQL functions to be able to manipulate and search data in a very natural way on the server-side. Today I wanted to show a simple of example of these features in action using sample data from SF OpenData.

Importing Sample Data

Having good sample data is useful, because it helps you self-validate that results are accurate. It also helps provide good data distribution, which is important when adding indexes.

My chosen data set from SF OpenData is the most popular item under “Geographic Locations and Boundaries” and contains approximately 200K city lots. The first step is to download and import it into MySQL:

Here is an example what each one of the features (parcel of land) looks like:

In this case all 200K documents do follow a common format, but I should point out that this is not a requirement. JSON is schema-less 🙂

Example Queries

Query #1: Find a parcel of land on Market street, one of the main streets in San Francisco:

Using the short hand JSON_EXTRACT operator (->) I can query into a JSON column in a very natural way. The syntax "$.properties.STREET" is what we call a JSON path, and for those familiar with javascript I like to compare this to a CSS selector similar to what you would use with JQuery.

To learn more about the JSON path syntax, I recommend checking out our manual page, or this blog post by Roland Bouman.

Query #2: Find any parcels of land that do not specify a street:

With JSON being schemaless, this finds the documents which do not have the expected structure. In this example we can see that all documents have a $.properties.STREET specified, and thus the query returns zero results.

Comparing the JSON type to TEXT

In this example I am running a query which deliberately needs to access all 200K JSON documents. This could be considered a micro-benchmark, as it does not quite reflect what you would experience in production, where you will often have indexes:

To explain what is happening here in more detail:

  • For simplicity, I’ve ensured that in both examples the dataset fits in memory.
  • The JSON functions, including the short-hand json_extract() operator (->) will work on both a native JSON data type, as well TEXT/BLOB/VARCHAR data types. This is very useful because it provides a nice upgrade for users prior to MySQL 5.7 who frequently already store JSON.
  • We can see that the native JSON datatype is indeed about 10x faster than TEXT – 1.25 seconds versus 12.85 seconds. This can be explained because the native type does not have to do any parsing or validation of the data, and it can retrieve elements of a JSON document very efficiently.


Hopefully this serves as a useful example of importing sample JSON data, and running a few sample queries. In my next post I take this a step further by showing how you can index JSON data by using virtual columns.

13 thoughts on “Taking the new MySQL 5.7 JSON features for a test drive

  1. This looks absolutely amazing! Thanks a lot for the great post!
    But I have a few questions is this really production ready? Has people started using it for real yet? Or is it still in the testing phase?

  2. mysqldump and mysqlimport simply doesn’t work with JSON columns. I need to replicate data from my production server to development but the mysql import features keep raising erros when dealing with JSON columns.

    Very frustrating…

    1. Hi Hélio, mysqldump does support JSON columns. It might possibly be that you are using a slightly older client version?

      Mysqlimport is used to load text files into MySQL. It is not the corresponding command to mysqldump (use mysql < input-file.sql instead). - Morgan

  3. Hi,
    I am trying to upload a CSV file to a table that has JSON data type.
    My table structure and CSV file look like the following:

    id bigint(20) NOT NULL AUTO_INCREMENT,
    date_created datetime NOT NULL,
    last_updated datetime NOT NULL,
    device_id bigint(20) NOT NULL,
    metadata_json json DEFAULT NULL,
    PRIMARY KEY (id)

    “2016-10-26 10:00:49″,”2016-10-26 10:00:49”,1,”{“loginTime”:1477544449257,”validUser”:true,”foo”:”bar”}”

    When I try to import the CSV, I get this error.
    The JSON column in CSV is valid.

    [ERROR in row 1] Invalid JSON text: “Missing a name for object member.” at position 1 in value for column ‘DEVICE_METADATA_JSON.metadata_json’.

    Not sure what I’m doing wrong here as to me the CSV file looks valid.

Leave a Reply