Ocean of Awareness

Jeffrey Kegler's blog about Marpa, his new parsing algorithm, and other topics of interest

Jeffrey's personal website


Marpa resources

The Marpa website

The Ocean of Awareness blog: home page, chronological index, and annotated index.

Sat, 05 Jun 2010

Marpa is now O(n) for Right Recursions

There's news with the latest version of Marpa (0.102000). Marpa now parses grammars with right-recursions in linear time (O(n)). (Marpa already handled left-recursion in linear time.)

This means that Marpa is now O(n) for all LR-regular grammars. LR-regular means LR with infinite lookahead using regular expressions. That a big class of grammars. It obviously includes all the LR(k) grammars, and therefore everything parsed by Yapp, yacc, and bison. LR-regular grammars also include everything parseable by recursive descent, PEGs, and other LL(k) grammars. LR-regular definitely includes all regular expressions.

Marpa's O(n) behavior has another nice feature. When it does not parse in O(n) time, it still parses. Some parser generators always parse quickly, because when they can't parse quickly, they don't parse at all. Marpa will parse anything you can write in BNF, even highly ambiguous grammars, and the absolute worst case is cubic (O(n**3)).

In my last post, I explained that the previous release of Marpa could parse unusually large classes of grammars in linear time, and that the right recursive cases where Marpa was not linear could usually be worked around. In fact, my experience had been that working around a right recursion was easy, so I'd never bothered looking hard at the issue.

Sometimes, writing a long explanation of why a limitation does not matter makes me think: Perhaps it does matter enough to take a second look. And take a second look is what I did.

A 1991 article by Joop Leo had laid out a modification to Earley's algorithm (the basis of Marpa) which was O(n) for all LR-regular grammars. Problem was, Marpa already incorporated other, major, enhancements to Earley's from another article, this one by Aycock and Horspool and dating to 2002. Were the two modifications compatible?

They are. And Marpa 0.102000 is the result. CPAN and the Perl community has it, and everybody else will have it when they borrow from us.

posted at: 21:21 | direct link to this entry

§         §         §