Panda Grammar

Deprecated: Please use dashkite/parse instead.

Panda Grammar is a parser combinator library for writing recursive descent parsers. What that means is that you write functions that consume input and return a value indicating what was parsed and combine these using other higher order functions. For example, you might have a function that parses a URL scheme. You could then use that function in a function that takes a sequence of such functions to parse entire URLs.

Example: URL Parser

Let's start by defining a few simple elements of a URL. These use the string and re helpers from Panda Grammar, which consume strings and regular expressions, respectively.

(Examples in CoffeeScript because I like to write in CoffeeScript, but the semantics are the same as those of JavaScript.)

separator = string "/"
word = re /^\w+/
qdelim = string "?"       # query delimiter
cdelim = string "&"       # query continuation delimeter
equal = string "="
protocol = re /^https?/
sdelim = string ":"       # scheme delimiter
root = string "//"

So far, so good. If you're familiar with the URL spec, you'll see that we're make a few simplifying assumptions, such as ignoring FTP URLs.

Next we want to build functions using these functions to parse the various parts of a URL, like the scheme and path.

scheme = all protocol, sdelim
path = all root, list separator, word

PG provides combinators, like all and list, that allow to combine the simple functions we've already defined. The only problem is that these will return nested arrays of everything that's parsed. That isn't super useful, which is why PG provides a rule combinator, so that you can transform these arrays into useful values.

scheme = rule (all protocol, sdelim),
  ({value: [protocol]}) -> {protocol}

path = rule (all root, list separator, word),
  ({value: [, components]}) -> {components, path: "/" + (components.join "/")}

Rules take a function and pass the return value to a second function that can modify it. The return value from PG function is an object containing two properties: the parsed value and the rest of the input.

For our scheme rule, we take the protocol and ignore the delimiter. For the path rule, we ignore the // and return both the path components and the reconstructed path.

Here's a rule for query parameters that returns an object based on the query.

assignment = rule (all word, equal, word),
  ({value: [key, , value]}) -> [key, value]

query = rule (all qdelim, list cdelim, assignment),
  ({value: [, pairs]}) ->
    query = {}
    query[k] = v for [k, v] in pairs
    {query}

We pull them altogether into another rule that parses and entire URL and merges the result into a single object.

url = rule (all scheme, path, (optional query)),
  ({value}) -> Object.assign value...

The finishing touch is to use the grammar combinator to define our parseURL function. The grammar helper checks to make sure that there's no input remaining to parse, returning undefined otherwise.

Name		Name	Last commit message	Last commit date
Latest commit History 27 Commits
src		src
tasks		tasks
test		test
.gitignore		.gitignore
LICENSE		LICENSE
LICENSE.md		LICENSE.md
README.md		README.md
genie.yaml		genie.yaml
package-lock.json		package-lock.json
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Licenses found

Repository files navigation

Panda Grammar

Example: URL Parser

License

Licenses found

pandastrike/panda-grammar

Folders and files

Latest commit

History

Repository files navigation

Panda Grammar

Example: URL Parser