MySQL 5.7 and GIS, an Example

This post will offer a very simple demonstration of how you can use MySQL 5.7 for Spatial features within your applications. In order to demonstrate this, I’ll walk through how we can determine the 10 closest Thai restaurants to a particular location. For this example, we’ll be using the apartment that I lived in when I first started working at MySQL, back in 2003.

For more details on all of the new GIS related work that we’ve done in MySQL 5.7, please read through these blog posts from the developers:


Installing MySQL
Let’s first install the MySQL 5.7 labs release that contains all of the refactored GIS functions, as well as the R-tree based InnoDB Spatial Indexes. (UPDATE: these features were included in the 5.7.5 DMR, so best to simply install the latest 5.7 release). For the first time in MySQL’s history, you can now use an ACID/MVCC engine for geospatial data! It’s a bit hard for me to contain my excitement on that particular point. 🙂

Importing Open Street Map Data
We can now use a customized version of the Open Street Map (OSM) MySQL data import scripts, which will additionally generate a geometry column from the “longitude,latitude” coordinate pairs, and create a new InnoDB spatial index on it.

You can download a copy of my customized scripts here.

We can then use these scripts to load a NYC area OSM data extract, using these steps:

The resulting nodes table looks like this after the initial data load:

You can read more about playing with the resulting OSM data here.

Now we can also de-normalize the data a bit in order to avoid unnecessary and heavy subquery usage, and instead make use of the new (in MySQL 5.6) fulltext indexing feature in InnoDB. To do that, we can load all of the nodetags into a single column for each node this way:

You can download a dump of the final schema that I ended up with here.

Creating Distance Calculation Functions
Next we can create a new function to calculate the geographical distance between two points (“longitude,latitude” coordinate pairs, or nodes) using the spherical law of cosines (SLC) formula for determining the “Great-circle distance”:

This is necessary because with Geometry types, the ST_distance and related functions only return degrees—which isn’t really helpful for us here—as they’re dealing with points in a 2 dimensional cartesian plane (euclidian geometry) and we have to manually calculate these distances on an earthlike spheriod. When MySQL has Geography type support, then the need for haversine or SLC functions will go away and we can simply use the ST_distance and similar functions to calculate the actual distance on the surface of the earth, between two geographic points.

Creating the Bounding Rectangle for Our Spatial Index
Again, as a starting point in our search for Thai restaurants, I’m going to be using the location of my apartment (40.716743,-73.951368) at the time that I first started working at MySQL.

For creating the envelope or bounding box around that location—so that we can take advantage of the spatial index on it—we can use the average distance between longitude and latitude degrees of 111km. It’s pretty close for latitude, but much more variant for longitude. If you want the bounding box or envelope to be more accurate, you can use abs(cos(radians(${latitude}))*111) for the longitude calculations. In our case though, we’re only using it for an envelope to push down to the spatial index (R-tree) for the Minimum Bounding Rectangle (MBR), so we could instead just increase the size of the envelope—since we’re still calculating the actual distance later (using the SLC function), which is what we care about in the end here—and keep things simple. Thus the calculation would be something like:

Or, if you want to have a more precise envelope/MBR:

Our Final Spatial Query
So now let’s use that (simpler) MBR formula along with our SLC function to try and find out what the 10 closest Thai restaurants are to my old place, using the spatial index to weed out anything that’s not within an envelope that covers approximately a 20km radius around it:

OK, we got some values, and thanks to a combination of the spatial index and the full text index, we got them very quickly!

Examining the Results
OK, now on to the data. Let’s assume that we simply want the closest one, and examine that record:

So it’s about 600 meters from my old place, its name is Tai Thai, and its phone number is 718-599-5556. Let’s do a quick sanity check using google maps: directions, and location info. As we can see, our data matches up pretty well with Google maps (which I consider to be the gold standard today).

Now I know that there are many other Thai restaurants in the area, but they simply don’t have any records in OSM currently (at least in the extracted snapshot for the NYC area that I’m working with). As always, your data is key. 🙂

Presenting the Data
Now we know what our data is telling us are the closest 10 Thai restaurants. We also have relevent information to present to the application user about each place: name, address, phone number, website, etc. So let’s look at a very simple example of how we can push these new locations out to google maps so that we can see each one on a map of the city, to go along with all of the metadata that we have for them. The simplest way is to feed the “latitude, longitude” coordinate pairs into google maps. Let’s try generating a URL to map my old place to the next closest Thai place. Again, the “latitude, longitude” coordinates for my old apartment are “40.716743, -73.951368” and the coordinates for the next closest Thai place, One More Thai, are “40.7210541, -73.983871”. We can plug these into a simple GET call like this:,+-73.951368/40.7210541,+-73.983871. Or we can use the embedded API like this:

As you can see it’s a pretty easy GET call either way, passing in the coordinate pairs in the given format. You can read more about the google maps APIs here.

Hopefully this has opened your eyes a bit to what’s possible with MySQL 5.7 in the GIS world, and whet your appetite for jumping in and playing with it!

Lastly, a big thank you to all of the developers who worked so hard on these new features! Please stay tuned in here for additional features that truly make MySQL a leading player in the geospatial market.

UPDATE: We now have support for ST_Distance_Sphere() and ST_MakeEnvelope(), which makes things much simpler! The final query would then instead look like this in 5.7.6+:

10 thoughts on “MySQL 5.7 and GIS, an Example

  1. Yeah, it’s great!
    Since it use some Mathematical funcation, I wanna to know if these new features could work well even though there are 200 million data approximately. In other words, do you have some good suggestions for processing huge data of GIS on Mysql?

  2. Just for clarity, “20/111” expression is not true for longitude, because longitude’s degree “length” in kilometers depends on latitude. 110-111 kilometers per degree would be on equator only.

    1. Correct. I already noted this in the relevant section and the linked information (
      “… we can use the average distance between longitude and latitude degrees of 111km. It’s pretty close for latitude, but much more variant for longitude….”. You don’t necessarily need a very high degree of accuracy for the search bounding box–a bigger bounding box can make the query slower but won’t produce incorrect results. And as generally noted, it’s a quick and easy approximation for use in the demonstration. I also showed how you can make a more accurate bounding box if you want greater accuracy. See the “Or, if you want to have a more precise envelope/MBR…” section.

  3. Latitude is a decimal number between -90.0 and 90.0.
    Longitude is a decimal number between -180.0 and 180.0.
    What happens when your bounding box values wrap around because you substract from -90/-180 or add to 90/180 ?

    1. Before MySQL 8, this would have to be handled at the application layer. In MySQL 8, our GIS supports Geography and Spatial Reference Systems with ellipsoidal calculations (for ellipsoidal SRIDs). So in MySQL 8 things will become much simpler for the user, and the results much more accurate–you can simply use ST_Distance(geom1, geom2) if the geometries are using an ellipsoidal SRID (e.g. 4326 for WGS84). If you’re interested in MySQL and GIS, MySQL 8 is a monumental leap forward! 🙂

Leave a Reply