Zorex blurs the line between regex engines and the advanced parsing algorithms used to parse programming languages.
Even the most powerful regex engines today can't parse HTML (a context-free language) or XML (a context-sensitive language), but Zorex can.
Zorex is under heavy development and is not yet ready for use. Follow me on Twitter for updates.
Behind the scenes, Zorex parses a small DSL (the "zorex syntax", which is regex-like with opt-in EBNF-like extensions) and then, at runtime, builds a parser specifically for your input grammar.
It's a bit like a traditional parser generator, but done at runtime (instead of through code generation) and with a deep level of syntactic compatibility with traditional regex engines.
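To illustrate the "parser built at runtime" idea, here is a minimal Python sketch. It is not Zorex's implementation and does not use the zorex syntax: the grammar format (`("lit", ...)` and `("ref", ...)` items) and the names `build_parser`, `GRAMMAR`, and `matches` are hypothetical, chosen only for this example. The point is that the grammar arrives as plain data and is turned into parsing functions while the program runs, with no code generation step.

```python
from typing import Callable, Dict, Optional

# A parser takes (text, position) and returns the new position, or None on failure.
Parser = Callable[[str, int], Optional[int]]

def build_parser(grammar: Dict[str, list], start: str) -> Parser:
    """Turn a grammar given as plain data into a parser, at runtime.

    Hypothetical format: each rule maps to a list of alternatives, each
    alternative is a list of items, and an item is ("lit", text) or
    ("ref", rule_name).
    """
    def item_parser(item) -> Parser:
        kind, value = item
        if kind == "lit":
            return lambda s, i: i + len(value) if s.startswith(value, i) else None
        if kind == "ref":
            # Look the rule up lazily so that rules can refer to themselves.
            return lambda s, i: rules[value](s, i)
        raise ValueError(f"unknown item kind: {kind}")

    def rule_parser(alternatives) -> Parser:
        compiled = [[item_parser(it) for it in alt] for alt in alternatives]

        def parse(s, i):
            for alt in compiled:  # ordered choice: the first matching alternative wins
                pos = i
                for p in alt:
                    pos = p(s, pos)
                    if pos is None:
                        break
                else:
                    return pos
            return None

        return parse

    rules = {name: rule_parser(alts) for name, alts in grammar.items()}
    return rules[start]

# Balanced parentheses: group -> "(" group ")" group | ""
GRAMMAR = {
    "group": [
        [("lit", "("), ("ref", "group"), ("lit", ")"), ("ref", "group")],
        [],
    ],
}

def matches(parser: Parser, text: str) -> bool:
    """True if the parser consumes the whole text."""
    return parser(text, 0) == len(text)

balanced = build_parser(GRAMMAR, "group")
```

Calling `matches(balanced, "(()())")` accepts the input, while `matches(balanced, "(()")` rejects it. Because `GRAMMAR` is ordinary data, it could just as well be produced by parsing a pattern string supplied by the user.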
It uses an optimized GLL parser combinator framework called Combn to parse some of the most complex languages quickly, including left- and right-recursive context-free languages and some context-sensitive languages.
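For readers unfamiliar with parser combinators, the following Python sketch shows the general flavor. It is not Combn and it is not GLL: the names `lit`, `seq`, `alt`, and `ref` are hypothetical, and this simplified version would loop forever on a left-recursive grammar, which is exactly the case a GLL-based framework is designed to handle. What it does show is the "generalized" part: each parser returns every position where a match could end, rather than committing to a single choice.

```python
from typing import Callable, FrozenSet

# A parser maps (text, start) to the set of every position where a match can end.
Parser = Callable[[str, int], FrozenSet[int]]

def lit(t: str) -> Parser:
    """Match the literal string t."""
    return lambda s, i: frozenset({i + len(t)}) if s.startswith(t, i) else frozenset()

def seq(*parsers: Parser) -> Parser:
    """Match each parser in order, threading every possible end position through."""
    def parse(s, i):
        ends = frozenset({i})
        for p in parsers:
            ends = frozenset(e2 for e in ends for e2 in p(s, e))
        return ends
    return parse

def alt(*parsers: Parser) -> Parser:
    """Try every alternative and keep all results (no ordered choice)."""
    return lambda s, i: frozenset().union(*(p(s, i) for p in parsers))

def ref(get: Callable[[], Parser]) -> Parser:
    """Defer looking up a parser so that a grammar can refer to itself."""
    return lambda s, i: get()(s, i)

# Even-length palindromes over {a, b}: P -> "a" P "a" | "b" P "b" | ""
palindrome: Parser = alt(
    seq(lit("a"), ref(lambda: palindrome), lit("a")),
    seq(lit("b"), ref(lambda: palindrome), lit("b")),
    lit(""),
)

def full_match(parser: Parser, text: str) -> bool:
    """True if at least one way of matching consumes the whole text."""
    return len(text) in parser(text, 0)
```

Here `full_match(palindrome, "abba")` succeeds and `full_match(palindrome, "abab")` fails. Keeping all end positions matters for this grammar: the parser cannot know where the middle of the palindrome is until it has tried the alternatives.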
Technically, Zorex is "an advanced pattern matching engine", and it is arguably incorrect to call it a regular expression engine because regular expressions by nature cannot parse non-regular languages (such as HTML).
Any regex engine that supports backtracking, however, is also "not a regular expression engine", as Larry Wall, the author of Perl's regex engine, puts it:
“Regular expressions” […] are only marginally related to real regular expressions. Nevertheless, the term has grown with the capabilities of our pattern matching engines, so I’m not going to try to fight linguistic necessity here. I will, however, generally call them “regexes” (or “regexen”, when I’m in an Anglo-Saxon mood).
Since the aim of Zorex is to maintain a deep level of syntactical compatibility with other regex engines people are familiar with, and further extend that to support parsing more complex non-regular languages, we call Zorex a regex engine.
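To make the regular versus non-regular distinction concrete, here is a small Python sketch unrelated to Zorex's own code. A regular pattern is enough to recognize individual tags, but checking that tags are properly nested to arbitrary depth needs unbounded memory, modeled here as a stack. The names `TAG` and `well_nested` are hypothetical, and the tag syntax is deliberately simplified (lowercase names, no attributes).

```python
import re

# A regular pattern is sufficient to recognize a single opening or closing tag.
TAG = re.compile(r"<(/?)([a-z]+)>")

def well_nested(markup: str) -> bool:
    """Check that every opening tag is closed by a matching tag, in order."""
    stack = []
    for closing, name in TAG.findall(markup):
        if not closing:
            stack.append(name)  # remember which tag must be closed next
        elif not stack or stack.pop() != name:
            return False        # a closing tag with no matching opener
    return not stack            # anything left on the stack was never closed
```

For example, `well_nested("<a><b>hi</b></a>")` is true, while `well_nested("<a><b></a></b>")` is false. The stack is the part a true regular expression cannot express.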