PostGIS vs. Geocoder in Rails
This article sets out to compare PostGIS in Rails with Geocoder and to highlight a couple of the areas where you'll want to (or need to) reach for one over the other. I will also present some of the terminology and libraries that I found along the way of working on this project and article as I set out to understand PostGIS better and how it is integrated with Rails.
If you are interested in learning how to work with geospatial data with PostGIS in Django I recommend having a look at our blog post Using GeoDjango and PostGIS in Django here.
- Installing PostGIS
- ActiveRecord PostGIS Adapter
- Our Example Data
- Building a Geo Helper Class with PostGIS
- Finding Nearby Records with PostGIS and Geocoder
- Finding Records Within a Bounding Box with PostGIS and Geocoder
- Finding Records Within a Polygon with PostGIS and Geocoder
- Finding Nearby Related Records with PostGIS and Geocoder
- Conclusion
- About the Author
Picture via Annie Spratt on Unsplash
I have built a number of Rails applications over the years that show locations on a map, have nearby search functionality, and I had never used PostGIS before! How was this possible? The reason is that there is a Ruby gem named Geocoder which enables you to do these sorts of queries, and it's quite efficient! That said, there is a reason that PostGIS exists. For more complex geo queries I’d recommend reaching beyond Geocoder to PostGIS.
As an example, if you wanted to find homes which have a school within 1km of them, or if you wanted to draw an oddly shaped polygon on a map and search within it, this is the world where PostGIS shines and makes these complex geo queries possible.
In this article we will be covering:
- PostGIS in Rails setup
- Finding nearby records (Geocoder + PostGIS)
- Finding records within a bounding box (Geocoder + PostGIS)
- Finding records within a polygon (PostGIS)
- Finding nearby related records (PostGIS)
The source code referenced in this article can be found here.
Installing PostGIS
Postgres comes with a number of built-in extensions that you can enable, but unfortunately PostGIS (Spatial and Geographic objects for Postgres) isn't one of them. In order to enable this extension, you will have to use a Postgres install with PostGIS support. I recommend using the official postgis docker image, but luckily many hosted Postgres solutions come with PostGIS already available. If you are not sure, you can query the available extensions with the following query:
select *
from pg_available_extensions
where name like '%postgis%'
If you'd like to see if the extension is already enabled, you can run this query:
select * from pg_extension
And finally, to enable this extension, you can use the command create extension postgis
, but since we're working within Rails, there is a Gem that will take care of this step for us as we'll see below.
ActiveRecord PostGIS Adapter
If you have confirmed that your version of Postgres supports the postgis
extension, you're ready to integrate it with your Rails application. This can be done by using the activerecord-postgis-adapter gem. Two things need to be done to get up and running. The first is to update the adapter
within config/database.yml
to be set to postgis
. Next, if this is a new application, you can run rails db:create
as normal, but if it is an existing one, you'll have to run the command rake db:gis:setup
. This command is enabling the postgis extension in your database.
Our Example Data
We'll be working with sample data for a realtor website that allows us to find homes in a variety of ways, including homes that are nearby a local school. There are two models: homes
and schools
. The Rails migration to create these tables is below:
class CreateHomes < ActiveRecord::Migration[6.0]
def change
create_table :homes do |t|
t.string :name, null: false
t.string :status, null: false
t.bigint :price, null: false
t.integer :beds, null: false, default: 0
t.integer :baths, null: false, default: 0
t.st_point :coords, null: false, geographic: true
t.float :longitude, null: false
t.float :latitude, null: false
t.timestamps
t.index :coords, using: :gist
t.index %i[latitude longitude]
t.index :status
t.index :price
end
create_table :schools do |t|
t.st_point :coords, null: false, geographic: true
t.index :coords, using: :gist
t.timestamps
end
end
end
By using activerecord-postgis-adapter
we are able to define PostGIS columns within our migration file. When working with PostGIS you can store a point (latitude + longitude) as a single column of type ts_point
, whereas when working with Geocoder the latitude and longitude are stored as floats in separate columns. Because we are comparing the two approaches, we will store the data both ways, but typically you would choose one approach or the other.
PostGIS geographic columns can be indexed using GiST style indexes. GiST indexes are required over B-Tree indexes when working with geographic data because coordinates cannot be easily sorted along a single axis (such as numbers, letters, dates, etc...) in a way that would allow the database to speed up common geographic operations.
The example project for this article contains a seeds file (run with rake db:seed
) which will generate 100k homes and 100 schools in and around the Atlanta, Georgia area.
Building a Geo Helper Class with PostGIS
The Rails PostGIS adapter is based on a library named RGeo, which while incredibly powerful, I found a little bit confusing due to a lack of documentation. I ended up building a small helper class to generate different geo objects for me. The first thing to point out is what SRID is. Just like the imperial and metric systems are used to measure and weigh amounts using an agreed upon reference point, coordinates also need a coordinate reference system to ensure that the latitude and longitude that one uses means the same thing to different people when referring to a single place on earth. 4326 is the spatial system used for GPS satellite navigation systems and the one we will be using within this article.
One last thing to define is what WKT is. Well-known Text representation of geometry is a string representation of a point, line string, and polygon (among other things) that we will be using in our examples in this article. This is the format Postgres (PostGIS) receives and displays geographic data types in.
class Geo
SRID = 4326
def self.factory
@@factory ||= RGeo::Geographic.spherical_factory(srid: SRID)
end
def self.pairs_to_points(pairs)
pairs.map { |pair| point(pair[0], pair[1]) }
end
def self.point(longitude, latitude)
factory.point(longitude, latitude)
end
def self.line_string(points)
factory.line_string(points)
end
def self.polygon(points)
line = line_string(points)
factory.polygon(line)
end
def self.to_wkt(feature)
"srid=#{SRID};#{feature}"
end
end
Finding Nearby Records with PostGIS and Geocoder
One of the most common geo queries used in applications is to find all records within X distance from a known point (the user's location, an event, a search, etc...). Because we installed Geocoder
and added reverse_geocoded_by :latitude, :longitude
to our Home
class, we can use the nearby
method to find all homes within 5km of this latitude and longitude (which happens to be Atlanta, Georgia). Geocoder likes to have arrays with latitude and then longitude, as opposed to PostGIS which prefers the exact opposite order!
Home.near([33.753746, -84.386330], 5).count(:all) # ~5ms
This query ran in about 5ms on my computer (searching through 100k records)... pretty fast! The reason it is fast is because we added an index on the latitude and longitude fields, but also because Geocoder applies a bounding box filter which utilises the index. Remember the Spatial Reference System (SRID) that we mentioned above? Because our coordinates do not take place on a Cartesian plane, we can’t use a standard distance formula to calculate the distance between two points. Although we won’t venture further into the math of this query below, it takes into consideration the Earth’s spherical nature when calculating the distance between two coordinates as specified by latitude and longitude. This article dives into more detail on these calculations if you are interested.
SELECT COUNT(*) FROM "homes" WHERE (homes.latitude BETWEEN 33.708779919704064 AND 33.798712080295935 AND homes.longitude BETWEEN -84.44041260768655 AND -84.33224739231345 AND (6371.0 * 2 * ASIN(SQRT(POWER(SIN((33.753746 - homes.latitude) * PI() / 180 / 2), 2) + COS(33.753746 * PI() / 180) * COS(homes.latitude * PI() / 180) * POWER(SIN((-84.38633 - homes.longitude) * PI() / 180 / 2), 2)))) BETWEEN 0.0 AND 5)
We'll have to build our own near
query when working with PostGIS, but don't worry, it's pretty straight forward! The g_near
method lives within the Home
model, and takes advantage of the ST_DWithin function provided by PostGIS. Remember that we have to convert our point into the correct WKT format so that PostGIS understands the data we are passing it.
def self.g_near(point, distance)
where(
'ST_DWithin(coords, :point, :distance)',
{ point: Geo.to_wkt(point), distance: distance * 1000 } # wants meters not kms
)
end
Home.g_near(Geo.point(-84.386330, 33.753746), 5).count # ~5ms
This query performs just about as fast as the Geocoder version (because of our GiST index on the coords
column), but is definitely a little easier on the eyes to read.
SELECT COUNT(*) FROM "homes" WHERE (ST_DWithin(coords, 'srid=4326;POINT (-84.38633 33.753746)', 5000))
Finding Records Within a Bounding Box with PostGIS and Geocoder
Geocoder provides us a way to find all records within a bounding box (roughly a rectangle, ignoring projection onto a sphere), and we just have to pass it the bottom left (south west) and top right (north east) coordinates.
Home.within_bounding_box(
[33.7250057553, -84.4224209302],
[33.774350796, -84.3570139222]
).count # ~5ms
Because it can use the index on latitude and longitude, it is quite efficient.
SELECT COUNT(*) FROM "homes" WHERE (homes.latitude BETWEEN 33.7250057553 AND 33.774350796 AND homes.longitude BETWEEN -84.4224209302 AND -84.3570139222)
To perform a bounding box query using PostGis, we'll create a method named g_within_box
inside of the Home
model, and utilize a PostGIS function named ST_MakeEnvelope along with the &&
operator.
def self.g_within_box(sw_point, ne_point)
where(
"coords && ST_MakeEnvelope(:sw_lon, :sw_lat, :ne_lon, :ne_lat, #{
Geo::SRID
})",
{
sw_lon: sw_point.longitude,
sw_lat: sw_point.latitude,
ne_lon: ne_point.longitude,
ne_lat: ne_point.latitude
}
)
end
Home.g_within_box(
Geo.point(-84.4224209302, 33.7250057553),
Geo.point(-84.3570139222, 33.774350796)
).count # ~5ms
Again, this version performs at about the same efficiency as the Geocoder version.
SELECT COUNT(*) FROM "homes" WHERE (coords && ST_MakeEnvelope(-84.4224209302, 33.7250057553, -84.3570139222, 33.774350796, 4326))
Finding Records Within a Polygon with PostGIS and Geocoder
We're now into territory that requires PostGIS. To find records inside of a polygon, along with the help of our Geo
class helper and the ST_Covers function from PostGIS, we can create a method named g_within_polygon
in our Home
model. This polygon is a triangle, where the last point is the same as the first one, thereby "closing" the shape of the polygon.
def self.g_within_polygon(points)
polygon = Geo.polygon(points)
where('ST_Covers(:polygon, coords)', polygon: Geo.to_wkt(polygon))
end
Home.g_within_polygon(
Geo.pairs_to_points(
[
[-84.39731626974567, 33.75570358345219],
[-84.33139830099567, 33.86524376001825],
[-84.25243406759724, 33.770545357734925],
[-84.39731626974567, 33.75570358345219]
]
)
).count # ~5ms
This query remains efficient due to the use of our GiST index, searching through 100k records in about 5ms.
SELECT COUNT(*) FROM "homes" WHERE (ST_Covers('srid=4326;POLYGON ((-84.39731626974567 33.75570358345219, -84.33139830099567 33.86524376001825, -84.25243406759724 33.770545357734925, -84.39731626974567 33.75570358345219))', coords))
Finding Nearby Related Records with PostGIS and Geocoder
Using PostGIS it is also possible to find related nearby records. What do I mean by that? Let's try to find available
homes that are within 1km of a school. This can be done by joining to the schools
table and utilizing ST_DWithin for the on
clause. Starting with the SQL we'd like to produce:
SELECT
count(DISTINCT homes.id)
FROM
homes
INNER JOIN schools ON ST_DWithin (homes.coords, schools.coords, 1000)
WHERE
homes.status = 'available'
Within the Home
model of our Rails application, we can create two scopes that allow us to find these homes. We're able to join 100k homes to the schools table (100 schools) based on their proximity in approximately 16ms.
class Home < ApplicationRecord
scope :available, -> { where(status: 'available') }
scope :near_school,
lambda {
select('DISTINCT ON (homes.id) homes.*').joins(
'INNER JOIN schools ON ST_DWithin (homes.coords, schools.coords, 1000)'
)
}
end
# Example using the scopes declared above
Home.available.near_school.count('distinct homes.id') # 16ms
Conclusion
We've only scratched the surface of what you can do with PostGIS, yet we were able to cover a ton of functionality that is common among websites that allow you to filter results based on their location. That said, if PostGIS isn't available as an extension on your version of Postgres, or you aren't requiring the power that PostGIS provides, Geocoder offers you a great alternative.
Share this article: If you liked this article you might want to tweet it to your peers.
About the Author
Leigh Halliday is a guest author for the pganalyze blog. He is a developer based out of Canada who works at FlipGive as a full-stack developer. He writes about Ruby and React on his blog and publishes React tutorials on YouTube.