guenni68/parser_builder

ParserBuilder

ParserBuilder is a library that generates a resumable parser with unlimited backtracking from an easily modifiable grammar expressed in XML.

What sets this parser library apart from others is that you don't define your parser by expressing the grammar and rules in Elixir code, but in a single XML file. I've personally used this approach for an ABNF grammar with 460+ rules, adding postprocessing (tagging, ignoring, and replacing values, or any combination thereof) in the XML file as needed instead of in Elixir code.

Features

  • Unlimited backtracking.
  • Resumable parsing, for input that cannot be provided in one piece.
  • Overridable rules. If some rules can only be known at runtime, you can override the rules from the grammar.
  • Easy postprocessing, such as tagging, ignoring, and replacing intermediate parse results.

Usage

defmodule MyParser do
  use ParserBuilder, file: "priv/my_grammar.xml"
end

This will inject the following functions into your MyParser module:

  • parse_string(rule_name, string), calls parse_string_non_strict(rule_name, string)
  • parse_string_strict(rule_name, string)
  • parse_string_non_strict(rule_name, string)

Examples

<?xml version="1.0" encoding="UTF-8" ?>
<grammar xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="../abnf.xsd">
  <rule id="simple1">
    <cs_literal value="one"/>
  </rule>
</grammar>

Use the above grammar in the MyParser module and call:

{:continue, cont1} = MyParser.parse_string("simple1", "on")
{:done, {:ok, ["one"], "rest"}} = cont1.("erest")

If you want the parser to fail when it does not consume the entire input, use the following function instead:

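{:continue, cont1} = MyParser.parse_string_strict("simple1", "on")
# leftover input ("rest") is an error in strict mode
{:done, {:error, _reason}} = cont1.("erest")
# resuming the same continuation with exactly the missing input succeeds
{:done, {:ok, ["one"], ""}} = cont1.("e")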
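When the input arrives as a list of chunks, the `{:continue, cont}` / `{:done, result}` shapes shown above can be driven by a small loop. The following is only a sketch: `ChunkFeeder` and `ToyLiteral` are hypothetical names that are not part of ParserBuilder, `ToyLiteral` merely mimics the return shapes of the non-strict examples so the snippet runs on its own, and the `:no_match` reason is made up for illustration.

```elixir
defmodule ToyLiteral do
  # Stand-in for a generated parser (hypothetical, NOT ParserBuilder code).
  # It only imitates the return shapes of the non-strict examples above.
  def parse(literal, input) do
    cond do
      # The whole literal is present: report it plus the unconsumed rest.
      String.starts_with?(input, literal) ->
        rest = String.replace_prefix(input, literal, "")
        {:done, {:ok, [literal], rest}}

      # The input is a proper prefix of the literal: ask for more input.
      String.starts_with?(literal, input) ->
        {:continue, fn more -> parse(literal, input <> more) end}

      # The input can no longer match (the reason atom is an assumption).
      true ->
        {:done, {:error, :no_match}}
    end
  end
end

defmodule ChunkFeeder do
  # Hypothetical helper: feeds chunks to a continuation until the parser
  # is done or the chunks run out.
  # Returns {:done, result, unused_chunks} or {:continue, cont}.
  def feed({:done, result}, chunks), do: {:done, result, chunks}
  def feed({:continue, cont}, []), do: {:continue, cont}
  def feed({:continue, cont}, [chunk | rest]), do: feed(cont.(chunk), rest)
end

# The parser finishes on the second chunk; "x" is handed back unused.
"o"
|> then(&ToyLiteral.parse("one", &1))
|> ChunkFeeder.feed(["n", "erest", "x"])
|> IO.inspect()
```

With a real generated module, the first step would presumably be `MyParser.parse_string("simple1", first_chunk)` in place of `ToyLiteral.parse/2`, with the remaining chunks passed to `ChunkFeeder.feed/2`.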

This library features the following "xmlized" ABNF combinators:

  • <optional>...</optional>
  • <many>...</many>
  • <manyOne>...</manyOne>
  • <oneOf><item>...</item>...</oneOf> (backtracking included)
  • <atMost count="3">...</atMost>
  • <atLeast count="2">...</atLeast>
  • <exactly count="2">...</exactly>
  • <repeat min="1" max="5">...</repeat>

and the following "primitives":

  • <cs_literal value="some"/> (case sensitive)
  • <ci_literal value="else"/> (case insensitive, explicit)
  • <literal value="else"/> (case insensitive, implicit)

and a number of result aggregation features inspired by and shamelessly copied from nimble_parsec:

  • <tag name="tag1">...</tag>
  • <ignore>...</ignore>
  • <replace value="replacement">...</replace>
  • <untagAndFlatten>...</untagAndFlatten>

Named rules from the top level are referenced like this:

  • <ruleRef uri="nameOfTheRule"/>
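As a rough mental model for the repetition combinators listed above, the sketch below expresses each of them through a single min/max repetition, assuming the usual ABNF reading of their names. It is a self-contained toy and not ParserBuilder's implementation: the module `RepeatSketch` and all function names are hypothetical, the toy is greedy and does not backtrack, and it works on complete strings rather than resumable input.

```elixir
defmodule RepeatSketch do
  # A toy parser is a function: input -> {:ok, results, rest} | :error.
  def literal(lit) do
    fn input ->
      if String.starts_with?(input, lit) do
        {:ok, [lit], String.replace_prefix(input, lit, "")}
      else
        :error
      end
    end
  end

  # Greedy repetition: match at least `min` and at most `max` times
  # (`max` may be :infinity). Assumed analogue of <repeat min=".." max="..">.
  def repeat(parser, min, max) do
    fn input -> loop(parser, min, max, input, [], 0) end
  end

  # Upper bound reached: stop and return what was collected.
  defp loop(_parser, _min, max, input, acc, count) when count == max do
    {:ok, acc, input}
  end

  defp loop(parser, min, max, input, acc, count) do
    case parser.(input) do
      # Only continue when input was consumed, so the loop cannot spin forever.
      {:ok, res, rest} when rest != input ->
        loop(parser, min, max, rest, acc ++ res, count + 1)

      # No further match, but the lower bound is already satisfied.
      _ when count >= min ->
        {:ok, acc, input}

      # No further match and too few repetitions.
      _ ->
        :error
    end
  end

  # Assumed correspondences to the XML combinators.
  def optional(p), do: repeat(p, 0, 1)            # <optional>
  def many(p), do: repeat(p, 0, :infinity)        # <many>
  def many_one(p), do: repeat(p, 1, :infinity)    # <manyOne>
  def at_most(p, n), do: repeat(p, 0, n)          # <atMost count="n">
  def at_least(p, n), do: repeat(p, n, :infinity) # <atLeast count="n">
  def exactly(p, n), do: repeat(p, n, n)          # <exactly count="n">
end

ab = RepeatSketch.literal("ab")
IO.inspect(RepeatSketch.exactly(ab, 2).("ababab"))
IO.inspect(RepeatSketch.at_least(ab, 2).("ab"))
```

Whether the library treats the bounds exactly this way is best checked against the tests and the schema file.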

For a more comprehensive demonstration of this library's features please have a look at the tests.

The schema file for this version can be found here: abnf.xsd

Installation

If available in Hex, the package can be installed by adding parser_builder to your list of dependencies in mix.exs:

def deps do
  [
    {:parser_builder, "~> 1.0.0"}
  ]
end

Documentation can be generated with ExDoc and published on HexDocs. Once published, the docs can be found at https://hexdocs.pm/parser_builder.
