Skip to content

armon/teles

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Teles Build Status

Teles is an Erlang network service for manipulating geographic data. Specifically, it uses in-memory R*-trees to store and manipulate geo-spacial indexes. It can be used to quickly provide answers to geo-spacial queries that are otherwise difficult or tedious to embed within an application.

It was originally written for targetting rewards at Kiip.

Features

  • Scalable implementation allows for many connected clients and concurrent queries
  • Uses R*-trees for efficient queries
  • Simple ASCII protocol
  • Relatively fast

Install

Download and build from source::

$ git clone https://[email protected]/armon/teles.git
$ cd teles
$ make deps
$ make rel

At this point, rel/teles will contain a release build

Usage

Teles can be configured using the sys.config file. Here is an example configuration file:

[
 {teles, [
            {teles_agent_concurrency, 4},
            {teles_port, 2856},
            {teles_accept_pool, 8}
        ]}
].

Here is a brief description of the configurations:

  • teles_agent_concurrency : This controls how many duplicate copies of data is held. By having more than 1 R-tree, we can perform more queries in parallel. This trade-off is made at the cost of memory utilization. If we have 4 agents, then we can perform 4 queries at a time, and consequently use 4x the RAM. Additionally, if an agent dies, recovery can be performed using another agent to avoid data loss. Sane configurations are probably either 1 for no duplication, or the number of CPU's on the box if memory allows.

  • teles_port : The port that teles listens on

  • teles_accept_pool : The number of processes accepting new connections. Default of 8 probably does not need to be modified.

Protocol

By default, Teles will listen for TCP connections on port 2956. It uses a simple ASCII protocol that is meant to be fairly human readable, while being easy to parse.

Each command is newline, and optionally carriage return delimited:

cmd [args][\r]\n

We start each line by specifying a command, providing optional arguments, and ending the line in a newline (carriage return is optional).

There are a total of 14 commands:

  • use space - Sets the namespace of the connection
  • create space - Creates a new namespace. These are for logical seperation of data similar to a new 'database' for an RDBMS
  • delete space - Deletes a namespace
  • list spaces - Lists all available name spaces
  • in - This is a command prefix, and executes the next command in a specific namespace
  • add object - Adds a named object to a namespace. An object has an opaque name (Teles doesn't care), and can be associated with various geospacial points.
  • delete object - Deletes an object
  • list objects - Lists all the objects in the namespace
  • list associations - Lists all the points associated with an object
  • associate point - Associates an object with a Lat / Lng point
  • disassociate - Disassociations an object with a Lat / Lng point
  • query within - Queries within a search box for any objects
  • query nearest - Queries for the K nearest objects to a given Lat / Lng
  • query around - Queries around a Lat / Lng for a given distance for objects

For the use command, we only space the name of a space to use:

use cities

This will either return "Done\n" on success or will return "Space does not exist\n", if a space which is not yet created is specified.

The create space command just takes the name of a space, which can already exist, and creates it. It always returns "Done\n":

> create space states
Done

Issuing a create space also causes that space to be used. For this reason, if you want to use a space and ensure it exists, it is safe to always use create.

The delete space command is like create space but may return "Space does not exist".

The list spaces command takes no arguments but will list all the spaces in a START/END block like such:

> list spaces
START
cities
states
testing
END

This format is common to all commands that may return a list of results.

The in command is not a command by itself, but it takes a space and a command and executes teh command in a given space. For example, to add an object to the 'cities' space, it is possible to do:

> in cities add object sanfrancisco

This is an alternative do doing:

> use cities
Done

> add object sanfrancisco
Done

These are considered equivilent. However, some clients may prefer to use the in format as then a single connection can be used to issue commands to multiple spaces. It is possible to change the space of a connection at any time by doing use again.

The add object command takes an opaque name and always returns "Done\n". The object may already exist. The name has no meaning to Teles and can be an application level ID.

The delete object command takes an opaque name and returns "Done\n" if successful. If the object does not exist, then it will return "Object not found\n".

The list objects command takes no arguments but will list all the objects in the space. It will also use a START/END block:

> in cities list objects
START
sanfrancisco
portland
seattle
END

The list associations with command takes an object as an argument and lists all the associated points in a START/END block:

> in people list associations with jane
START
GID=3059837609 lat=38.8730 lng=-118.3570
GID=3277219006 lat=40.1230 lng=-120.5120
END

The results are a single row for each point (lat/lng) that the object is associated with. Objects can have a one-to-many mapping of points. For example, this could be all the places that jane has visited. The GID is the Geometry ID, and it is used to remove associations.

To first create an association, the associate point <lat> <lng> with <obj> command is used. The latitude and longitude are given as decimal floating point representations, and the object ID is also provide::

> in people associate point 38.873 -118.357 with jane
Done

If an object does not exist, then "Object not found" will be returned. If an object is already associated with the given point, nothing will change.

Removing an association is done with the disassociate command::

> in people disassociate 3277219006 with jane
Done

This will disassociate jane with the GID (Geometry ID) 3277219006. These can be found using list associations. This command can also return "Object not found\n" or "GID not associated\n"

Finally, after objects are added, queries can be done to find all matching objects. There are 3 types of queries: window, nearest, and around. The window query creates a lat/lng box and finds all points inside. Nearest takes a point and finds the K nearest neighbors. Finally, around takes a point and searches for some distance around it.

A window query is done by using query within:

> in people query within 40.123 45.451 -122.1 -120.3
START
mary
jane
joe
END

This will find all objects where latitude is between 40.123 and 45.451, and longitude is between -122.1 and -120.3.

A nearest query is done by using query nearest:

> in people query nearest 5 to 40.11 -120.3
START
courtney
grantland
adam
amadeus
jack
END

This finds the 5 nearest objects to the point 40.11 -120.3.

An around query is done using query around:

> in cities query around 40.11 -120.3 for 15mi
START
sanfrancisco
oakland
marin
marin
dalycity
END

This finds all the objects within 15 miles of 40.11 / -120.3. THe distance measure is expected to be in units of meters, however the following suffixes are understood:

  • m: Meters (1000m)
  • km: Kilometers (15km)
  • mi: Miles (5mi)
  • y: Yard (2000y)
  • ft: Feet (5000ft)

There are also various error codes that are possible. If arguments are bad, the server can return: "Client Error: Bad arguments\n". If latitude or longitude is bad, then "Client Error: Bad lat/lng format\n" may be returned. Lastly commands that require a namespace (either through use of in or use space) will return "Client Error: Must use a namespace\n" if one is not provided.

Example

Here is an example of a client flow, assuming teles is running on the default port using just telnet::

$ telnet localhost 2856
> list spaces
START
END

> create space testing
Done

> list spaces
START
testing
END

> list objects
START
END

> add object jill
Done

> add object jack
Done

> add object joe
Done

> delete object joe
Done

> list objects
START
jack
jill
END

> associate point 40.123 -120.515 with jack
Done

> associate point 40.123 -120.515 with jill
Done

> list associations with jack
START
GID=2192486008 lat=40.1230 lng=-120.5150
END

> disassociate 2192486008 with jack
Done

> list associations with jack
START
END

> query nearest 5 to 40.123 -120.515
START
jill
END

> delete space testing
Done

Clients

Here is a list of known client implementations:

Here is a list of "best-practices" for client implementations:

  • Maintain a set of open connections to the server to minimize connection time
  • Doing a 'use space' at the start of the connection avoids the need to specify 'in ' before each connection

Performance

Casual testing was performed on a 2012 Macbook Pro with default configurations, and a concurrency of 4 per space. For a space with 370,000 objects, average insert time was 500μs, resulting in an total of 2000 inserts per second. Performing query around with a random lat/lng and random distance between 1 and 100 miles took 300μs but with mutilple clients about 6000 QPS can be achieved. Lastly, doing query nearest with a random lat/lng and between 1 and 50 neighbors took 1.6ms on average, and with 4 clients 2400 QPS was possible.

References

Packages:

Related works:

About

An Erlang network service for manipulating geographic data

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published