Fri, 14 Oct 2011
Announcing Marpa::XS 0.016000
Marpa::XS 0.016000 went out a week ago, and the
cpantesters results look excellent.
With this release, my conversion of Marpa from Perl to C is complete.
A lot of Perl code remains, to be sure,
but all of it is code that
arguably belongs in some kind of higher-level language.
This release was checked
for leaks and other memory issues.
The couple of issues that turned up were fixed.
What is Marpa?
Marpa is an advance over recursive descent
I hope the Marpa algorithm
will become the standard parser for
problems too big for regular expressions.
- Marpa parses,
in linear time,
the classes of grammar that are currently in practical use.
- The importance of parse-time debugging is often underestimated.
Marpa's parse-time error detection breaks new ground -- Marpa is
fully aware of exactly where in the parse it is at all times,
and of exactly what input it expects and why.
This means parse-time error detection, once a desperate last
resort, now can be used as
a parsing technique in itself.
- Marpa is a general BNF parser -- that means if you feed it anything
written in BNF, it "just parses" it.
This includes grammars which are left-recursive, right-recursive and
ambiguous -- even infinitely ambiguous.
- Marpa never goes exponential -- worst case, even for highly ambiguous
grammars, is O(n**3), which is considered optimal.
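The last two points can be made concrete with a toy example. The sketch below is not Marpa's algorithm (Marpa is Earley-based); it is a minimal CYK-style dynamic program, written for illustration, over the deliberately ambiguous grammar E ::= E E | 'a'. It shows how a general BNF parser can account for every parse of an ambiguous input, and why the worst case is cubic: O(n**2) substrings, each examined at O(n) split points.

```python
from functools import lru_cache

def count_parses(s: str) -> int:
    """Count distinct parse trees for E ::= E E | 'a' over s."""
    n = len(s)

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # Number of ways E derives s[i:j].
        if j - i == 1:
            return 1 if s[i] == 'a' else 0
        # Rule E -> E E: try every split point k.
        return sum(count(i, k) * count(k, j) for k in range(i + 1, j))

    return count(0, n)

print(count_parses("aa"))    # 1
print(count_parses("aaa"))   # 2
print(count_parses("aaaa"))  # 5 -- the Catalan numbers, so ambiguity
                             # grows fast, yet the table stays cubic
```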
What is next with Marpa?
At this point,
little remains to be done before a
1.000000 beta release of Marpa::XS.
Once Marpa::XS does go beta, I expect to be able to keep
its interface stable.
As of this release, the portions converted to C amount to
a complete, if low-level, parsing library.
The libmarpa library
does not, however, have the documentation that you'd expect in
a library being released on its own.
Also, frankly, before writing the documentation,
I need to redo the interface to libmarpa.
As long as the interface to libmarpa was a strictly internal affair,
I didn't worry about it much --
I made the first cut at a design, and got it working.
I then looked at the result.
If the design was so awful that
it got in the way of features or was an efficiency issue,
I fixed it.
If ugliness or awkwardness was its main issue, I moved on.
With the first pass and a few trial applications behind me,
I now know how the libmarpa interface should look.
Once I have libmarpa finished and documented,
the next step will be Marpa::Thin.
These days a lot of people like to hack interfaces.
The Marpa project is about the parse engine -- when it
comes to interfaces, I want to put everyone else on a level
playing field with me, if not get out of the game altogether.
Marpa::Thin will be a "thin" Perl interface to libmarpa.
Currently, I plan to create a SWIG interface to libmarpa,
which means that any environment that SWIG supports
(and there are a lot of them)
will have access to the low-level parsing library.
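For the curious, a SWIG interface file can be quite small. The sketch below is purely hypothetical -- the declarations are placeholders I invented for illustration, since the real libmarpa interface is still being redone:

```swig
/* marpa.i -- hypothetical sketch of a SWIG interface to libmarpa.
   Every name below is a placeholder, not the real API. */
%module marpa
%{
#include "marpa.h"   /* assumed header name */
%}

/* SWIG generates wrappers for whatever is declared here. */
typedef struct marpa_grammar marpa_grammar;   /* opaque handle (hypothetical) */
marpa_grammar *marpa_grammar_new(void);       /* hypothetical constructor */
void marpa_grammar_free(marpa_grammar *g);    /* hypothetical destructor */
```

From a file like this, SWIG can generate glue code for each of its target languages, which is what would put other environments on the same footing as Perl.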
I have mixed feelings about Marpa leaving its Perl home,
but I think most Perlers share my belief --
we contribute to the Perl community,
not because it is a goal in itself,
but because it is the best way to
contribute to a larger community.
Limitations and Comparisons
Currently, the major limitation of Marpa::XS is
that it is alpha.
Development is well advanced, but the interface remains
subject to change.
For a comparison of Marpa to other parsers,
one which is careful to point out situations
where older parsing algorithms may still be superior,
"Limitations and Comparisons" section
in my announcement of Marpa::XS 0.008000.
- "higher-level language":
Of course, depending on the application, the ideal "high-level language"
may be C.
But I feel no real need to convert the code for the user interfaces,
and the scripts for building and testing really should be in a
highly portable, high-level language.
- "in linear time":
To be specific, Marpa parses any LR-regular grammar in linear time -- O(n).
LR-regular is a vast class of grammars that includes LALR,
LR(k) for all k,
and LL(k) for all k.
This means that Marpa parses,
in linear time,
every grammar parsed by yacc,
recursive descent and regular expressions.
- "considered optimal":
The phrase "considered optimal" elides some
irrelevant bits of theory.
It would be a bit of a
surprise if it is possible to
do general BNF parsing
in O(n), but nobody has proved that it can't be done.
The Valiant algorithm parses general BNF
and is O(n**2.376...) or better.
But Valiant's algorithm is only faster for huge problems,
and for those it
needs a machine with many terabytes of main memory
to deliver on its speed claims.
So it won't be competing with Marpa any time soon.
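A little arithmetic shows why. The constant factor below is purely illustrative (the true constants for fast matrix multiplication are not precisely known), but it shows how even a modest-looking constant pushes the crossover point, where c * n**2.376 finally beats n**3, far out of practical range:

```python
def crossover(c: float) -> int:
    """Smallest power of 2 where c * n**2.376 < n**3.

    Since c * n**2.376 < n**3 exactly when n**0.624 > c,
    the crossover grows very fast as c grows.
    """
    n = 2
    while c * n ** 2.376 >= n ** 3:
        n *= 2
    return n

# Illustrative constant factors only:
for c in (10.0, 1000.0, 100000.0):
    print(c, crossover(c))
```

With a constant factor of 10 the crossover is already in the dozens of symbols; at 100,000 it is over a hundred million, which is where the terabytes of main memory come in.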
posted at: 23:10