Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Slideshows
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MySQL

Posted June 8, 2015

MySQL: Calculating Distance Based on Latitude and Longitude

By Rob Gravelle

With the proliferation of mobile devices being at an all-time high, there has never been more demand for “Find near me” type apps.  As a result, just when you thought you were finally free of trigonometric functions like sine and cosine, you find out that they were useful after all!  In fact, it turns out that they can be instrumental in calculating the distance between two points.  We’ll be using them today to fetch a list of nearby restaurants sorted by distance using nothing but the power of MySQL!

The Haversine Formula

The Greek mathematician Pythagoras developed his theorem for calculating the shortest distance between points way back in 5th-century B.C. 

distance = sqrt((X2 - X1)^2 + (Y2 - Y1)^2)

That formula would work perfectly for distance calculations as well, if only the Earth were flat (some say it is)!  Due to the curvature of the Earth, Pythagoras’s Theorem does not lend itself so well to geolocation calculations.

Image source: Wikipedia
Image source: Wikipedia

A more precise result may be obtained using the Haversine Formula.  It takes the radius of the Earth into account.  It’s the R variable below:

dlon = lon2 - lon1 
 dlat = lat2 - lat1 
 a = (sin(dlat/2))^2 + cos(lat1) * cos(lat2) * (sin(dlon/2))^2 
 c = 2 * atan2( sqrt(a), sqrt(1-a) ) 
 distance = R * c (where R is the radius of the Earth)
R = 6367 km OR 3956 mi

Since the earth is not a perfect sphere, using a fixed value for the Radius is still not ideal, but good enough for most applications. The Earth is actually an oblate spheroid and not perfectly spherical (it bulges at the equator), so the above formula will tend to overestimate trans-polar distances and underestimate trans-equatorial distances.  

Applying the Haversine Formula to MySQL

Calculating the distance between two locations using the Haversine Formula in MySQL requires us to call upon several of MySQL’s built-in Math functions, including  cos(), sin(), acos(), and radians().  In case you were wondering like I was, the radians() function converts the value of a number from degrees to radians where pi radians equals 180 degrees.

3959 * acos( cos( radians(lat1) ) 
      * cos( radians(lat2) ) 
      * cos( radians(lon2) - radians(lon1)) + sin(radians(lat1))
      * sin( radians(lat2) )

Incorporating the Haversine Formula into a Stored Procedure

For ease of use, I would recommend that you place the above formula in a stored procedure (a.k.a. a proc).  Our sample proc searches a WordPress database for restaurants that are located within a given distance from the user.  For the latitude and longitude parameters, I chose to use the Decimal type with six digits of precision.  You may choose to adjust this for your needs. In addition to the latitude and longitude values, this proc accepts input parameters for the units of measurement (miles or kilometers), the maximum distance to search, and the maximum number of results to return.

CREATE DEFINER=`root`@`localhost` PROCEDURE `closest_restaurants`
   (IN units varchar(5), IN lat Decimal(9,6), IN lon Decimal(9,6), 
    IN max_distance SMALLINT, IN limit_rows MEDIUMINT)
 BEGIN
   SELECT pm1.post_id, 
          p.post_title,
          ROUND((CASE units WHEN 'miles' 
             THEN 3959
             ELSE 6371 
           END * acos( cos( radians(lat) ) 
               * cos( radians(pm1.meta_value) ) 
               * cos( radians(pm2.meta_value) - radians(lon)) + sin(radians(lat)) 
               * sin( radians(pm1.meta_value)))
          ), 3) AS distance
   FROM goodfood_wp_md20m_postmeta AS pm1, 
        goodfood_wp_md20m_postmeta AS pm2,
        goodfood_wp_md20m_posts    AS p 
   WHERE pm1.meta_key = 'latitude' AND pm2.meta_key = 'longitude'
     AND pm1.post_id = pm2.post_id
     AND pm1.post_id = p.id
     AND p.post_status = 'publish'
   HAVING distance <= max_distance
   ORDER BY distance ASC
   LIMIT limit_rows;
 END

Calling the Proc

Here is a search for the ten closest restaurants to the center of London, UK, measured in miles, within a hundred mile radius:

CALL `restaurants`.`closest_restaurants`('miles', 51.5112139, -0.119824, 100, 10);

post_id

post_title

distance

2103

Pret A Manger - 135 Strand

0.041

2288

Pret A Manger - 87-88 Strand

0.093

2093

Pret A Manger - 421/422 Strand

0.182

2095

Pret A Manger - 29-33 Kingsway

0.212

2139

Pret A Manger - 65 Long Acre

0.230

7075

Nando's - Covent Garden

0.236

2146

Pret A Manger - 182 Strand

0.239

1324

Carluccio's - Covent Garden

0.256

2088

Pret A Manger - 25 Villiers Street

0.267

2305

Pret A Manger - 37 Shelton Street

0.291

As evidenced by the proc output, the center of London is saturated by fresh food chains like Pret A Manger, Nando’s Chicken, and Carluccio’s.  Just look at the Pret A Manger shops that come up in Google Maps:

Google Map Screenshot

Optimizing for Speed

There are those who feel that the database is not the best place to put these types of intensive calculations.  I have no strong opinions either way, but I have to admit that with an average execution time of about 2.6 seconds on a local database, the search is not as fast as I would like.

One way to speed things up considerably is to define a box around the user’s coordinates whose size is equal to the max_distance parameter.  Each degree of latitude is approximately 69 miles (111 kilometers) apart.  Here’s why: the circumference of the earth along the equator is roughly 24,901.92 miles and there are 360 degrees in a circle. Therefore, if we divide 24,901.92 miles by 360 degrees, we get 69.172 miles per degree.  There is also a formula for extrapolating one degree of longitude based on the latitude:

1° of latitude ~= 69 miles
1° of longitude ~= cos(latitude)*69

Thus, to calculate the longitude and latitude for the rectangle in MySQL we would add the following to our proc:

DECLARE lon1, lon2, lat1, lat2 float;
 SET lon1 = lon-max_distance/abs(cos(radians(lat))*ONE_DEGREE_CONSTANT);
 SET lon2 = lon+max_distance/abs(cos(radians(lat))*ONE_DEGREE_CONSTANT);
 SET lat1 = lat-(max_distance/ONE_DEGREE_CONSTANT);
 SET lat2 = lat+(max_distance/ONE_DEGREE_CONSTANT);

Then, we would replace the “HAVING distance <= max_distance” in the WHERE clause with the following two lines:

AND pm2.meta_value between lon1 and lon2 
 AND pm1.meta_value between lat1 and lat2

Here is the full proc:

CREATE DEFINER=`root`@`localhost` PROCEDURE `closest_restaurants_optimized`
   (IN units varchar(5), IN lat Decimal(9,6), IN lon Decimal(9,6), 
    IN max_distance SMALLINT, IN limit_rows MEDIUMINT)
 BEGIN
   DECLARE ONE_DEGREE_CONSTANT TINYINT;
   DECLARE EARTH_RADIUS_CONSTANT SMALLINT;
   DECLARE lon1, lon2, lat1, lat2 float;  
  IF units = 'miles' THEN
     SET ONE_DEGREE_CONSTANT   = 69;
     SET EARTH_RADIUS_CONSTANT = 3959;
   ELSE -- default to kilometers
     SET ONE_DEGREE_CONSTANT   = 111;
     SET EARTH_RADIUS_CONSTANT = 6371;
   END IF;  
  SET lon1 = lon-max_distance/abs(cos(radians(lat))*ONE_DEGREE_CONSTANT);
   SET lon2 = lon+max_distance/abs(cos(radians(lat))*ONE_DEGREE_CONSTANT);
   SET lat1 = lat-(max_distance/ONE_DEGREE_CONSTANT);
   SET lat2 = lat+(max_distance/ONE_DEGREE_CONSTANT);
  SELECT pm1.post_id, 
          p.post_title,
          ROUND((EARTH_RADIUS_CONSTANT * acos( cos( radians(lat) ) 
               * cos( radians(pm1.meta_value) ) 
               * cos( radians(pm2.meta_value) - radians(lon)) + sin(radians(lat)) 
               * sin( radians(pm1.meta_value)))
          ), 3) AS distance
   FROM goodfood_wp_md20m_postmeta AS pm1, 
        goodfood_wp_md20m_postmeta AS pm2,
        goodfood_wp_md20m_posts    AS p 
   WHERE pm1.meta_key = 'latitude' AND pm2.meta_key = 'longitude'
     AND pm1.post_id = pm2.post_id
     AND pm1.post_id = p.id
     AND p.post_status = 'publish'
   AND pm2.meta_value between lon1 and lon2 
   AND pm1.meta_value between lat1 and lat2
   ORDER BY distance ASC
   LIMIT limit_rows;
 END

The optimized version ran at a far more respectable 0.078 seconds on average.

 

Conclusion

The part of your systems to which you assign calculation duties depends on your particular environment.  If you’re using MySQL as the backend of a multi-tier enterprise system, perhaps the programming code can perform the calculations faster.  As we saw here today, MySQL is more than capable of performing complex numeric calculations, provided that you take the time to optimize them.

See all articles by Rob Gravelle



MySQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
MySQL Forum
Topic By Replies Updated
MySQL in high availability and transction secure banking application klamor 1 August 28th, 10:24 AM
MySQL rollback UAL225 1 August 28th, 10:15 AM
Browsing a DB file that uses MySql finleytech 1 July 26th, 10:51 AM
php cookie won't pass variable to next page ITdevGirl 0 June 22nd, 12:13 PM