The Marpa parser

This is Jeffrey Kegler's website for Marpa, a parsing algorithm

Resources

The official Marpa starting page.

Google+

The Ocean of Awareness blog: home page, chronological index, and annotated index.

Marpa::R2 (active distribution): CPAN | MetaCPAN

Mailing List

Github repository

Jeffrey Kegler's personal website

Marpa is a parsing algorithm. It is new, but very much based on earlier work by Jay Earley, Joop Leo, John Aycock and R. Nigel Horspool. Marpa is intended to replace, and to go well beyond, recursive descent and the yacc family of parsers.

Learning about Marpa

What you are looking at is the web site maintained by the author of Marpa (Jeffrey Kegler). It is NOT the best page for starting to learn about Marpa. Good places to do that are:

Other Marpa resources

Discussion of Marpa currently centers around the "marpa parser" Google Group and the IRC channel: #marpa on irc.libera.chat.

Most of the posts on Ocean of Awareness, my blog, are about Marpa. To get oriented in my blog, start at its annotated list of the most interesting Marpa posts.

If you are interested in tutorials,

Theory

For those interested in the mathematics behind Marpa, I have a paper on arxiv.org with pseudocode, and proofs of correctness and of my complexity claims.

Marpa internals

Libmarpa is a C library, and is the core of Marpa.

Quibbles

I mentioned above that Marpa parses unambiguous grammars in linear time, with a couple of exceptions, and claimed that those were unlikely to be bothersome in practice. Here are the details.

For an unambiguous grammar to be parsed in linear time, it must

Unmarked middle recursions?

Unmarked middle recursions are what they sound like: recursions that are not left and right, but in the middle of a rule, and for which there is no "marker". What's a marker? That gets tricky.

The marker of a middle recursion is anything that allows the parser to find the middle. It is possible to represent a halting Turing computation as a marker, so that the general problem of finding any possible marker is, in fact, undecidable. But that's not something you are likely to want to do in practice. For practical purposes, if you can spot the middle by eyeball, the middle recursion is "marked". If you can't, the middle recursion might be unmarked.

Ambiguous right recursions

How does an unambiguous grammar manage to include an ambiguous right recursion? The answer is not very easily, but you can sneak an ambiguous right recursion into an unambigious grammar, by having two different right recursive rules, both of which recurse on the same symbol. I call these ambiguities right recursive symches -- "symches" because they are ambiguous due to a choice between symbols. A right recursion can also be ambiguous because it is a "factoring" -- that is, it divides the input up differently among its symbols. But a factoring will make the whole grammar ambiguous, while a symch does not necessarily do so.

Right recursive symches are very easy to avoid. You just rewrite the rules so that they recurse on different symbols. Preserving the semantics is no problem in this case -- you simply make sure both of the new symbols have the same semantics as the original one.