Ask Bjørn Hansen edited this page Jan 21, 2013 · 6 revisions

GeoDNS configuration format

IP based format

Originally the GeoDNS configuration (then called pgeodns) revolved around IP addresses, each of which then got labels assigned. For example:

# name  ip       labels
server1 10.0.0.1 www.europe www.jp www
server2 10.1.0.1 www.north-america www.jp

The first column is ignored; the IP is added to each of the labels listed after it.

In this format you configure the IP addresses available and then list the labels that should point to each. For smaller configuration files this is very easy to work with.

The format only supports A records. Extending it to anything but AAAA records is relatively messy, and reading and writing the files programmatically requires a custom parser (a simple one, but still).
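To give a sense of how simple such a parser can be, here is a minimal Go sketch. It is not the pgeodns parser; the function name `parseIPFormat` is hypothetical, and treating lines that start with `#` as comments is an assumption based on the example above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseIPFormat reads lines of the form "name ip label [label...]" and
// returns a map from each label to the IPs assigned to it.
// Blank lines and lines starting with "#" are skipped (an assumption).
func parseIPFormat(input string) (map[string][]string, error) {
	labels := make(map[string][]string)
	scanner := bufio.NewScanner(strings.NewReader(input))
	lineNo := 0
	for scanner.Scan() {
		lineNo++
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) < 3 {
			return nil, fmt.Errorf("line %d: expected name, ip and at least one label", lineNo)
		}
		// fields[0] is the server name, which the format ignores.
		ip := fields[1]
		for _, label := range fields[2:] {
			labels[label] = append(labels[label], ip)
		}
	}
	return labels, scanner.Err()
}

func main() {
	const config = `# name  ip       labels
server1 10.0.0.1 www.europe www.jp www
server2 10.1.0.1 www.north-america www.jp
`
	labels, err := parseIPFormat(config)
	if err != nil {
		panic(err)
	}
	fmt.Println(labels["www.jp"]) // prints [10.0.0.1 10.1.0.1]
}
```

Note that the result is already keyed by label, which is the shape the label based format below writes out directly.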

Label based format

To make pgeodns work for the NTP Pool project I added a more extensible JSON configuration format. Instead of being keyed off the IPs (RRs), it is keyed off the labels (really the internal data structure in pgeodns). In addition to supporting multiple record types, the format also supports a "weight" for each record so the server can return some IPs more than others.

The data part of the file goes along the lines of:

{ "www": { "a": ["10.0.0.1"] },
  "www.europe": { "a": ["10.0.0.1"] },
  "www.north-america": { "a": ["10.1.0.1"] },
  "www.jp": { "a": ["10.0.0.1", "10.1.0.1"] }
}

The A records are actually lists of lists (omitted here) so that each entry can also include the weight value. Other record types use a hash instead of a list for the data.
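As an illustration of the list-of-lists form and of what the weight is for, the following Go sketch decodes `["ip", weight]` pairs and picks one record in proportion to its weight. The type and function names (`weightedIP`, `decodeRecords`, `pick`) are hypothetical, and the selection shown is a generic weighted pick, not necessarily the algorithm pgeodns or GeoDNS uses.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math/rand"
)

// weightedIP is one A record entry in the compact ["ip", weight] form.
type weightedIP struct {
	IP     string
	Weight int
}

// UnmarshalJSON decodes a two-element JSON array into the struct.
func (w *weightedIP) UnmarshalJSON(data []byte) error {
	var pair []json.RawMessage
	if err := json.Unmarshal(data, &pair); err != nil {
		return err
	}
	if len(pair) != 2 {
		return fmt.Errorf("expected [ip, weight], got %d elements", len(pair))
	}
	if err := json.Unmarshal(pair[0], &w.IP); err != nil {
		return err
	}
	return json.Unmarshal(pair[1], &w.Weight)
}

// decodeRecords parses a JSON list of ["ip", weight] pairs.
func decodeRecords(s string) ([]weightedIP, error) {
	var records []weightedIP
	err := json.Unmarshal([]byte(s), &records)
	return records, err
}

// totalWeight sums the weights of all records.
func totalWeight(records []weightedIP) int {
	total := 0
	for _, r := range records {
		total += r.Weight
	}
	return total
}

// pick returns the IP whose weight range contains n, where
// 0 <= n < totalWeight(records). It returns "" if n is out of range.
func pick(records []weightedIP, n int) string {
	for _, r := range records {
		if n < r.Weight {
			return r.IP
		}
		n -= r.Weight
	}
	return ""
}

func main() {
	records, err := decodeRecords(`[["10.0.0.1", 50], ["10.1.0.1", 100]]`)
	if err != nil {
		panic(err)
	}
	total := totalWeight(records)
	if total <= 0 {
		panic("no weighted records")
	}
	// With these weights 10.1.0.1 is returned about twice as often.
	fmt.Println(pick(records, rand.Intn(total)))
}
```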

In this data format it's easier to look at the data and see what will be returned for a particular country; but it's very tedious to update/maintain by hand with all the repetition of data.

The code in GeoDNS (the Go version) is also a bit clumsy, and the related logic for finding the right records in the right label is spread across findLabels and the main serve function.

Suggested new data layout

Internally I think the code will be simpler if we store each label under the name it is requested by ("www") and keep the targeting information inside that data structure. Since the current configuration format so closely mirrors the internal data structure, the configuration format would have to change, too, for it to work.

"data": {
	// label
	"www": {
		// targeting

		/// hash for RR parameters
		"": { "a": [ { "a": "10.0.0.1", "weight": 100 } ] },

		/// old more compact A/AAAA format (IP/weight in an array)
		"europe":        { "a": [ ["10.0.0.1", 100] ] },
		"north-america": { "a": [ ["10.1.0.1", 100] ] },
		"jp":            { "a": [ ["10.0.0.1", 50], ["10.1.0.1", 100] ] }
	},
	// "empty" label
	"": {
		// targeting
		"": {
			// record type
			"mx": [ {
				"mx": "mail",
				"preference": 10, "weight": 5
			} ]
		},
		"europe": {
			// mx records for users from europe
			"mx": [ { "mx": "mail-europe" },
				{ "mx": "mail-europe2" }
			]
		}
	}
}

In this scheme it's much easier to see "everything" related to a particular label, because no matter how the data structure is dumped or written, things related to "www" (in this example) will be close together.

A disadvantage is that "www.europe" won't automatically work for end-users. Currently, because "www.europe" has to be set up for targeting, users can either use "www" (and the system figures out to use "www.europe") or ask for it explicitly by requesting "www.europe". In the new format the system could automatically set up "alias" records for those, with some syntax pointing to "www/europe" to target it appropriately.
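One way the automatic aliases could be generated is sketched below in Go. Everything here is an assumption for illustration: the function name `aliasesFor`, the `label/target` destination syntax (taken from the "www/europe" idea above), and the choice to skip the empty label, since it is not obvious what its alias names should be.

```go
package main

import (
	"fmt"
	"sort"
)

// aliasesFor maps each explicit name such as "www.europe" to the
// hypothetical targeting syntax "www/europe". The input maps a label to
// the targets configured for it.
func aliasesFor(targetsByLabel map[string][]string) map[string]string {
	aliases := make(map[string]string)
	for label, targets := range targetsByLabel {
		if label == "" {
			continue // assumption: no aliases for the "empty" label
		}
		for _, target := range targets {
			if target == "" {
				continue // the default target is reached via the label itself
			}
			aliases[label+"."+target] = label + "/" + target
		}
	}
	return aliases
}

func main() {
	aliases := aliasesFor(map[string][]string{
		"www": {"", "europe", "north-america", "jp"},
		"":    {"", "europe"},
	})

	// Sort the names so the output order is stable.
	names := make([]string, 0, len(aliases))
	for name := range aliases {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s -> %s\n", name, aliases[name])
	}
}
```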

The code will go through the targeting list until it finds the record type being looked for (or an alias or cname). This is similar to what the system does now, but it will make it easier to fix some of the edge cases that are currently wrong.
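The lookup described above could look roughly like the following Go sketch. It is not the GeoDNS implementation: `zoneData` and `findRecords` are hypothetical names, records are reduced to plain strings, and the order of the targeting list (country, then continent, then the default "") is an assumption.

```go
package main

import "fmt"

// zoneData maps label -> target -> record type -> record values.
// Plain strings stand in for full records to keep the sketch short.
type zoneData map[string]map[string]map[string][]string

// findRecords walks the targets in order and returns the first target that
// has records of the wanted type or, failing that, an alias or cname.
// It returns nil records when nothing matches.
func findRecords(data zoneData, label string, targets []string, qtype string) (target, rtype string, records []string) {
	byTarget := data[label] // nil map if the label is unknown; lookups stay safe
	for _, t := range targets {
		for _, candidate := range []string{qtype, "alias", "cname"} {
			if rrs := byTarget[t][candidate]; len(rrs) > 0 {
				return t, candidate, rrs
			}
		}
	}
	return "", "", nil
}

func main() {
	data := zoneData{
		"www": {
			"":       {"a": {"10.0.0.1"}},
			"europe": {"a": {"10.0.0.1"}},
			"jp":     {"a": {"10.0.0.1", "10.1.0.1"}},
		},
		"": {
			"":       {"mx": {"mail"}},
			"europe": {"mx": {"mail-europe", "mail-europe2"}},
		},
	}

	// A client in Denmark: no "dk" target exists, so "europe" matches.
	target, rtype, records := findRecords(data, "www", []string{"dk", "europe", ""}, "a")
	fmt.Printf("%q %s %v\n", target, rtype, records) // prints "europe" a [10.0.0.1]
}
```

Keeping the whole search inside one function over one label is the simplification the new layout is meant to allow.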

The biggest disadvantage, though, is that this format is even more verbose and clunky to write by hand! I'm not sure if that outweighs the benefits.

For the NTP Pool and other similar systems that generate the data automatically the extra syntax is fine.

In any case it'd be nice if the system was easier to get started with; maybe with an extra configuration format, similar to how pgeodns supports both the text format described above and the JSON format.

Reason we don't use regular RFC 1035 zones

For the current users of the software the zones are entirely machine generated from other sources and the JSON format is very convenient.

However, it's possible that we could have RFC 1035 zones work, too, for a subset of the functionality and to make it easier for people "hand typing" the configuration. Things to figure out if we try supporting RFC 1035 zones:

  • Supporting weight for each record and other options (MaxHosts, others in the future)
  • Aliases ("internal CNAMEs")
  • A not-crazy way to embed targeting information in each label.