Jeffrey Kegler's blog about Marpa, his new parsing algorithm, and other topics of interest
The Ocean of Awareness blog: home page, chronological index, and annotated index.
In a recent post, I looked at an unusual language which serializes arrays and strings, using a mixture of counts and parentheses. Here is an example:
A2(A2(S3(Hey)S13(Hello, World!))S5(Ciao!))
The language is of special interest for comparison against recursive descent because, while simple, it requires procedural parsing -- a purely declarative BNF approach will not work. So it's a chance to find out if Marpa can play the game that is recursive descent's specialty.
The previous post focused on how to use Marpa to mix procedural and declarative parsing together smoothly, from a coding point of view. It only hinted at another aspect: speed. Over the last year, Marpa has greatly improved its speed for this kind of application. The latest release of Marpa::R2 now clocks in almost 100 times faster than Parse::RecDescent for long inputs.
Length | Seconds | ||
---|---|---|---|
Marpa::R2 | Marpa::XS | Parse::RecDescent | |
1000 | 1.569 | 2.938 | 13.616 |
2000 | 2.746 | 7.067 | 62.083 |
3000 | 3.935 | 13.953 | 132.549 |
10000 | 12.270 | 121.654 | 1373.171 |
Parse::RecDescent is pure Perl, while Marpa is based on a parse engine in a library written in hand-optimized C. You'd expect Marpa to win this race and it did.
And it is nice to see that the changes from Marpa::XS to Marpa::R2 have paid off. Included in the table are the Marpa numbers from my 2012 benchmark of Marpa::XS. Marpa::R2 has a new interface and an internal lexer, and now beats Marpa::XS by a factor of up to 10.
While the benchmarked language is ideally suited to show recursive descent to advantage, the input lengths were picked to emphasize Marpa's strengths. Marpa optimizes by doing a lot of precomputation, and is written with long inputs in mind. Though these days, a 500K source, longer than the longest tested, would not exactly set a new industry record.
There are fuller descriptions of the language in Flavio's post and code, and my recent post on how to write a parser for this language. I talk more about the benchmark's methodology in my post on the 2012 benchmark.
Marpa::R2
is available on CPAN.
A list of my Marpa tutorials can be found
here.
There is
a new tutorial by Peter Stuifzand.
The Ocean of Awareness blog
focuses on Marpa,
and it has
an annotated guide.
Marpa also has
a web page.
For questions, support and discussion, there is a
Google Group:
marpa-parser@googlegroups.com
.
Comments on this post can be made there.
posted at: 09:47 | direct link to this entry