Sun, 23 Oct 2011
Marpa::XS is now beta
I am very happy to announce that
Marpa::XS is now beta.
Marpa::XS will kept stable.
Changes to its interface will be especially avoided.
What is Marpa?
Marpa is an advance over recursive descent
I hope the Marpa algorithm
will become the standard parser for
big for regular expressions.
- Marpa parses,
in linear time,
classes of grammar that are currently in practical use.
- The importance of parse-time debugging is often underestimated.
Marpa's parse-time error detection breaks new ground -- Marpa is
fully aware of exactly where in the parse it is at all times,
and of exactly what input it expects and why.
This means parse-time error detection, once a desperate last
resort, now can be used as
a parsing technique in itself.
- Marpa is a general BNF parser -- that means if you feed it anything
written in BNF, it "just parses" it.
This includes grammars which are left-recursive, right-recursive and
ambiguous -- even infinitely ambiguous.
- Marpa never goes exponential -- worst case, even for highly ambiguous
grammars, is O(n**3), which is
New with this release
Marpa::XS 0.018000 contains two new
These ranking methods allow applications to control
the order of parse trees within
an ambiguous parse.
The two new ranking methods are the
ranking method and
high_rule_only ranking method.
In both methods, integer ranks are assigned to rules,
and alternative parsings are ordered on a top-down, left-to-right basis.
When it comes to comparing choices by rule rank, ties are allowed --
in fact, they are common.
By default, all the alternatives at every choice point
tie with each other.
When they are tied in rank,
the order of choices is arbitrary.
rule ranking method, all the parse trees are
kept, and visited in order.
high_rule_only ranking method,
only the highest ranked alternatives are kept,
and they are visited in arbitrary order.
Because of ties, more than one of the alternatives at a choice point
can be "highest".
A full description of the new ranking methods
in Marpa::XS's documentation.
What is next with Marpa?
Marpa::XS is aimed at users who want a stable platform for applications.
To ensure the stability of Marpa::XS,
active development of Marpa is moving into a new fork:
This will isolate Marpa::XS users from the accidental changes
and bugs that can be the side effect of active development.
Initially, changes to Marpa::XS will be restricted to bug fixes.
As it stands, Marpa::XS is very fully featured -- it offers applications
far more flexibility than competing parsers.
If I enhance the features of Marpa::XS,
the new features will be back-ported from Marpa's active development forks,
and I will preserve backward compatibility.
Limitations and comparisons
Marpa::XS is, as the name suggests, XS only --
installation requires access to a C compiler,
and to many of the GNU utilities and libraries as well.
Marpa::XS has been tested on a wide variety of POSIX systems.
In theory Marpa::XS is NOT restricted to POSIX systems --
all the tools and libraries
it uses have Windows versions, for example.
However, Marpa::XS has yet to be installed
on a non-POSIX system.
For a detailed comparison of Marpa to other parsers,
one which is careful to point out situations
where older parsing algorithms may still be superior,
"Limitations and Comparisons" section
in my announcement of Marpa::XS 0.008000.
- "in linear time":
To be specific, Marpa parses any LR-regular grammar in linear time -- O(n).
LR-regular is a vast class of grammars that includes LALR,
LR(k) for all k,
and LL(k) for all k.
This means that Marpa parses,
in linear time,
every grammar parsed by yacc,
recursive descent and regular expressions.
- "considered optimal":
The phrase "considered optimal" elides some
irrelevant bits of theory.
It would be a bit of a
surprise if it is possible to
do general BNF parsing
in O(n), but nobody has proved that it can't be done.
The Valiant algorithm parses general BNF
and is O(n**2.376...) or better.
But Valiant's algorithm is only faster for huge problems,
and for those it
needs a machine with many terabytes of main memory
to deliver on its speed claims.
So it won't be competing with Marpa any time soon.
- "GNU utilities and libraries":
These dependences can be an inconvenience, I admit, but
the alternative is installing
my attempt to portably re-create
all the things the GNU people have developed.
I think that it is clear that the GNU software is the easier
and more reliable alternative.
If you browse the package, you may see that it uses TeX as well.
TeX is ONLY needed is you are working on libmarpa,
the highly mathematical, low-level core library that provides
the parse engine.
To do this, you'd need to have studied a lot of the mathematics
of parsing -- and you'd understand why I feel forced to do the
documentation in TeX.
All the non-mathematical parts are either in Perl, or in C code
which can be read and changed on systems which do not have TeX
posted at: 22:45 |
direct link to this entry