Tue, 25 Feb 2014
Significant newlines? Or semicolons?
Should statements have explicit terminators, like the semicolon of Perl and
the C language?
Or should they avoid the clutter, and separate statements by giving whitespace
syntactic significance and a real effect on semantics?
Actually we don't have to go either way.
As an example, let's look at some BNF-ish DSL.
It defines a small calculator.
At first glance, it looks as if this language has taken the
significant-whitespace route -- there certainly are no explicit statement
terminators.
:default ::= action => ::first
:start ::= Expression
Expression ::= Term
Term ::= Factor
    | Term '+' Term action => do_add
Factor ::= Number
    | Factor '*' Factor action => do_multiply
Number ~ digits
digits ~ [\d]+
:discard ~ whitespace
whitespace ~ [\s]+
The rule is that there isn't one
If we don't happen to like the layout of the above DSL,
and rearrange it in various ways,
we'll find that everything we try works.
If we become curious about exactly what the rules for newlines are,
and look at the documentation,
we won't find any.
That's because there aren't any.
We can see this by thoroughly messing up the line structure:
:default ::= action => ::first :start ::= Expression Expression ::= Term
Term ::= Factor | Term '+' Term action => do_add Factor ::= Number |
Factor '*' Factor action => do_multiply Number ~ digits digits ~
[\d]+ :discard ~ whitespace whitespace ~ [\s]+
The script will continue to run just fine.
How does it work?
How does it work?
Actually, pose the question this way:
Can a human reader tell where the statements end?
If the reader is not used to reading BNF,
he might have trouble with this
particular example but,
for a language that he knows, the answer is simple:
Yes, of course he can.
So really the question is,
why do we expect the parser to be so stupid that it cannot?
The only trick is that this is done without trickery.
Marpa's DSL is
written in itself,
and Marpa's self-grammar describes exactly what a statement is
and what it is not.
The Marpa parser is powerful enough to simply take this self-describing DSL
and act on it, finding where statements begin and end,
much as a human reader is able to.
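To make the idea concrete, here is a minimal sketch in Python. It is not how Marpa does it: Marpa works from its full self-grammar, while this sketch hard-codes a single observation about the calculator DSL above, namely that a statement begins at any token immediately followed by `::=` or `~`. The name `split_statements` and the assumption that tokens are whitespace-separated are mine, for illustration only.

```python
# Illustrative sketch only: a one-token-lookahead splitter for the small
# subset of the DSL shown in this post. Marpa itself does not work this way.

RULE_OPERATORS = {"::=", "~"}  # the only two rule operators used in the example

TIDY = r"""
:default ::= action => ::first
:start ::= Expression
Expression ::= Term
Term ::= Factor
    | Term '+' Term action => do_add
Factor ::= Number
    | Factor '*' Factor action => do_multiply
Number ~ digits
digits ~ [\d]+
:discard ~ whitespace
whitespace ~ [\s]+
"""

MESSY = r"""
:default ::= action => ::first :start ::= Expression Expression ::= Term
Term ::= Factor | Term '+' Term action => do_add Factor ::= Number |
Factor '*' Factor action => do_multiply Number ~ digits digits ~
[\d]+ :discard ~ whitespace whitespace ~ [\s]+
"""


def split_statements(source):
    """Group tokens into statements without looking at newlines.

    Assumes tokens are whitespace-separated, which holds for this example.
    A statement starts at any token that is immediately followed by a
    rule operator.
    """
    tokens = source.split()  # discards all whitespace, newlines included
    statements = []
    for i, token in enumerate(tokens):
        followed_by_operator = (
            i + 1 < len(tokens) and tokens[i + 1] in RULE_OPERATORS
        )
        if followed_by_operator or not statements:
            statements.append([])  # open a new statement
        statements[-1].append(token)
    return statements


if __name__ == "__main__":
    for statement in split_statements(MESSY):
        print(" ".join(statement))
```

Both layouts yield the same nine statements. The boundaries come from the structure of the rules, not from the line breaks.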
To learn more
This example was produced with the Marpa parser,
which is available on CPAN.
The code for this example is based on that in
the synopsis for its top-level document,
but it is isolated conveniently in
a Github gist.
A list of my Marpa tutorials can be found
new tutorials by
The Ocean of Awareness blog
focuses on Marpa,
and it has
an annotated guide.
Marpa has a web page that I maintain,
and Ron Savage maintains another.
For questions, support and discussion, there is
the "marpa parser" group.
Comments on this post can be made there.
posted at: 15:30