Ocean of Awareness
Thu, 31 Dec 2009
Marpa: Practical General BNF Parsing
With a regular expression engine, there
You feed a regular expression to the RE engine, it parses with it.
A general BNF parser is one which fulfills the same
expectation for BNF.
Write your language in BNF, you got a parser.
But it hasn't been that simple.
The guys who write the textbooks have pushed
general BNF parsing for years.
Improvements in these algorithms have pushed
the speeds down to linear or close to it
for the kinds of language in practical use.
But the general parsing algorithms have languished
on the textbook pages.
And I did find it wasn't quite
as easy as the academics suggested.
There were some obstacles that they didn't forsee.
But bottom line, they were right.
General BNF parsing is practical.
Marpa is a practical
(if at this point alpha)
general BNF parser
I have used Marpa to write a practical parser
for a non-trivial task -- HTML parsing.
That HTML parser is Marpa::UrHTML,
and I now use
in tasks I do routinely.
Over the next weeks, I will do phased releases of Marpa.
The HTML parser, Marpa::UrHTML,
is documented and ready to use in the first release.
Over the next few weeks,
I will document Marpa,
the parse engine itself.
Right now the Marpa parse engine is "pure Perl", and speeds
are on the order of PPI.
This is acceptable for many applications,
but Marpa can do better.
Marpa has as its parse engine a mathematical algorithm
that lends itself to conversion to C.
Further down the road, I'll write a Marpa C library,
and an XS wrapper.
posted at: 15:29 |
direct link to this entry