Mon, 22 Apr 2013
Marpa's SLIF now allows procedural parsing
Marpa's SLIF (scanless interface)
allows an application to parse directly from any BNF grammar.
Marpa parses vast classes of grammars in linear time,
including all those classes currently in practical use.
its latest release,
also allows an application to intermix
its own custom lexing and parsing logic
and to switch back and forth between them.
among other things,
that Marpa's SLIF can now
do procedural parsing.
What is procedural parsing?
Procedural parsing is parsing using
ad hoc code in a procedural language.
The opposite of procedural parsing is declarative parsing
-- parsing driven by some kind of formal description
of the grammar.
may be described as what you do when you've given up
on your parsing algorithm.
Dissatisfaction with parsing theory
has left modern programmers accustomed to procedural parsing.
And in fact some problems are best tackled with procedural parsing.
One such problem is parsing Perl-style here-documents.
Peter Stuifzand has tackled this using
just-released version of Marpa::R2.
For those unfamiliar, Perl allows documents to be incorporated
into its source files in line-oriented fashion as "here-documents".
Here-documents can be used in expressions.
The syntax to do this is very handy, if a little strange.
say <<ENDA, <<ENDB, <<ENDC; say <<ENDD;
starts with a single line declaring four here-documents spread out
The expressions of the form
are here-document expressions.
is the heredoc operator.
The string which follows it (in this example,
ENDB, etc.) is the heredoc terminator string --
the string that will signal end
of body of the here-document.
The body of the here-documents follow, in order, over the next eight lines.
More details of here-document syntax, with examples, can be found
All of this poses quite a challenge to a parser-lexer combination,
which is one reason I chose it as an example --
to illustrate that the Marpa's SLIF support for procedural parsing can
handle genuinely difficult cases.
There are a few ways Marpa could approach this.
Peter Stuifzand chose was to
to read the
here-document's body as the value of the terminator in
The strategy works this way:
Marpa allows the application to mark certain lexemes as "pause" lexemes.
Whenever a "pause" lexeme is encountered, Marpa's internal scanning stops,
and control is handed over to the application.
In this case, the application is set up to pause after every newline,
and before the terminator in every here-document expression.
While reading the line containing the four here-document expressions,
Marpa's SLIF pauses and resumes five times -- once for each here-document expression,
then once for the final newline.
Details can be found in compact form in the heavily commented code
Marpa as a better procedural parser
So far I've talked in terms of Marpa "allowing" procedural parsing.
In fact, there can be much more to it.
Marpa can make procedural parsing easier and more accurate.
Marpa knows, at every point, which rules it is recognizing, and how far it
is into them.
Marpa also knows which new rules the grammar expects, and which terminals.
The procedural parsing logic can consult this information to guide its decisions.
Marpa can provide your procedural parsing logic with radar,
as well as the option to use a very smart autopilot.
For more about Marpa
Marpa's latest version is
which is available on CPAN.
a new interface,
which represents a major increase
in Marpa's "whipitupitude".
The SLIF has tutorials
a web page,
and of course it is the focus of
my "Ocean of Awareness" blog.
Comments on this post
can be sent to the Marpa's Google Group:
posted at: 03:03 |
direct link to this entry