12 Sequence rules

Traditionally, grammars allow only BNF rules. Libmarpa also allows sequence rules, which express sequences by letting a single RHS symbol be repeated.

A sequence rule consists of a LHS and a RHS symbol. Additionally, the application must indicate the minimum number of repetitions. The minimum count must be 0 or 1.

Optionally, a separator symbol may be specified. For example, a comma-separated sequence of numbers

     1,42,7192,711,

may be recognized by specifying a sequence rule with LHS Seq and RHS symbol num, together with the separator comma, where comma ::= ','. By default, an optional final separator, as shown in the example above, is recognized, but “proper separation” may also be specified. In proper separation, separators must, in fact, come between (“separate”) items of the sequence. A final separator is not a separator in the strict sense, and is therefore not recognized when proper separation is in effect. For more on specifying sequence rules, see marpa_g_sequence_new.
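The separation behavior described above can be modeled with a small hand-written recognizer. This is only a sketch of the semantics, not Libmarpa code: the function sequence_matches, the token enum, and the convention that a negative separator means "no separator" are all assumptions made for illustration. It also assumes that a lone separator is not a valid sequence.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds for the comma-separated number example. */
enum token { TOK_NUM, TOK_COMMA };

/*
 * Returns 1 if tokens[0..n) form a sequence of `item` tokens, else 0.
 *   sep    - separator token, or a negative value for "no separator"
 *   min    - minimum number of repetitions; must be 0 or 1
 *   proper - nonzero selects proper separation (no final separator)
 */
int sequence_matches(const int *tokens, size_t n,
                     int item, int sep, int min, int proper)
{
    size_t i = 0;

    assert(min == 0 || min == 1);

    if (n == 0)
        return min == 0;              /* empty input needs a minimum of 0 */

    if (tokens[i++] != item)
        return 0;                     /* a sequence starts with an item */

    while (i < n) {
        if (sep >= 0) {
            if (tokens[i++] != sep)
                return 0;             /* items must be separated */
            if (i == n)
                return !proper;       /* final separator: default mode only */
        }
        if (tokens[i++] != item)
            return 0;
    }
    return 1;
}
```

Under this model, the token string for 1,42,7192,711, is accepted when proper is 0 and rejected when proper is nonzero, while the same string without its final comma is accepted in both modes.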

Sequence rules are “sugar” — their presence in the Libmarpa interface does not extend its power. Every Libmarpa grammar that can be written using sequence rules can be rewritten as a grammar without sequence rules.
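To make the "sugar" claim concrete, the sketch below prints one possible BNF rewrite of a sequence rule. It is not Libmarpa code, and it does not claim to be the rewrite that Libmarpa performs internally. The function name sequence_to_bnf, the helper symbol suffix _body, and the textual rule format are assumptions made for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Append one formatted line to buf; 0 on success, -1 if it does not fit. */
static int emit(char *buf, size_t cap, const char *fmt, ...)
{
    size_t used = strlen(buf);
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf + used, cap - used, fmt, ap);
    va_end(ap);
    return (n < 0 || (size_t)n >= cap - used) ? -1 : 0;
}

/*
 * Write BNF rules equivalent to a sequence rule into buf, one per line.
 *   sep    - separator symbol name, or NULL for none
 *   min    - minimum number of repetitions; must be 0 or 1
 *   proper - nonzero selects proper separation
 * Returns 0 on success, -1 if buf is too small.
 */
int sequence_to_bnf(char *buf, size_t cap, const char *lhs,
                    const char *item, const char *sep, int min, int proper)
{
    int rc = 0;

    assert(min == 0 || min == 1);
    assert(cap > 0);
    buf[0] = '\0';

    /* Helper symbol: one or more items, with separators only between them. */
    rc |= emit(buf, cap, "%s_body ::= %s\n", lhs, item);
    if (sep)
        rc |= emit(buf, cap, "%s_body ::= %s_body %s %s\n", lhs, lhs, sep, item);
    else
        rc |= emit(buf, cap, "%s_body ::= %s_body %s\n", lhs, lhs, item);

    rc |= emit(buf, cap, "%s ::= %s_body\n", lhs, lhs);
    if (sep && !proper)               /* optional final separator */
        rc |= emit(buf, cap, "%s ::= %s_body %s\n", lhs, lhs, sep);
    if (min == 0)                     /* empty rule for zero repetitions */
        rc |= emit(buf, cap, "%s ::=\n", lhs);

    return rc ? -1 : 0;
}
```

The helper symbol keeps the recursion away from the LHS itself, so that adding the empty rule for a minimum of 0 cannot produce a sequence that begins with a separator.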

The RHS symbol and the separator, if there is one, must not be nullable. This is because it is not completely clear what an application intends when it asks for a sequence of items, some of which are nullable — the most natural interpretation of this usually results in a highly ambiguous grammar.

Libmarpa allows highly ambiguous grammars, and a programmer who wants a grammar with sequences containing nullable items or separators can write that grammar using BNF rules. The use of BNF rules makes it clearer that ambiguity is what the programmer intended, and allows the programmer more flexibility.
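A rough count shows how quickly nullable items multiply the possible readings of an input. The sketch below is a simplified model, not Libmarpa code: it assumes every item is either exactly one token or empty, and it counts the ways to read a given number of tokens as a sequence of a given number of items. The name readings and the MAX_TOKENS limit are assumptions made for illustration.

```c
#include <assert.h>

#define MAX_TOKENS 32

/*
 * Number of ways to read n_tokens identical tokens as a sequence of exactly
 * n_items items, when each item is either one token or empty.
 * Large arguments can overflow unsigned long; this is only a demonstration.
 */
unsigned long readings(unsigned n_items, unsigned n_tokens)
{
    unsigned long ways[MAX_TOKENS + 1] = { 0 };
    unsigned k, n;

    assert(n_tokens <= MAX_TOKENS);

    ways[0] = 1;                      /* zero items read zero tokens one way */
    for (k = 1; k <= n_items; k++) {
        /* The k-th item is either empty (keep ways[n]) or consumes one
           token (add ways[n - 1]); go downward so the update is in place. */
        for (n = n_tokens; n >= 1; n--)
            ways[n] += ways[n - 1];
    }
    return ways[n_tokens];
}
```

In this model a single token can be read as a sequence of three items in readings(3, 1) ways, one for each position the token may occupy, and since a sequence rule places no upper bound on the number of items, the total number of readings is unbounded.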

A sequence rule must have a dedicated LHS — that is, the LHS of a sequence rule must not be the LHS of any other rule. This implies that the LHS of a sequence rule can never be the LHS of a BNF rule.

The requirement that the LHS of a sequence rule be unique is imposed for reasons similar to those for the prohibition against RHS and separator nullables. Often reuse of the LHS of a sequence rule is simply a mistake. Even when deliberate, reuse of the LHS results in a complex grammar, one which often parses in ways that the programmer did not intend.

A programmer who believes they know what they are doing, and who really does want alternative sequences starting at the same input location, can specify this behavior indirectly. They can do this by creating two sequence rules with distinct LHSs:

     Seq1 ::= Item1
     Seq2 ::= Item2

and adding a new “parent” LHS that recognizes the sequences as alternatives:

     SeqChoice ::= Seq1
     SeqChoice ::= Seq2
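The dedicated-LHS requirement can be stated as a simple check over a rule table. The sketch below is a model of the requirement, not Libmarpa code: struct rule, the symbol enum, and sequence_lhs_dedicated are hypothetical names, and RHS symbols are omitted because only the LHS matters for this check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol IDs for the SeqChoice example. */
enum symbol { SYM_SEQ1, SYM_SEQ2, SYM_SEQ_CHOICE };

struct rule {
    int lhs;          /* symbol ID of the LHS */
    int is_sequence;  /* nonzero for a sequence rule, zero for a BNF rule */
};

/* The four rules of the example: two sequence rules and two BNF rules. */
const struct rule seq_choice_rules[] = {
    { SYM_SEQ1,       1 },   /* Seq1 ::= Item1 (sequence rule) */
    { SYM_SEQ2,       1 },   /* Seq2 ::= Item2 (sequence rule) */
    { SYM_SEQ_CHOICE, 0 },   /* SeqChoice ::= Seq1 */
    { SYM_SEQ_CHOICE, 0 },   /* SeqChoice ::= Seq2 */
};

/*
 * Returns 1 if the LHS of every sequence rule is the LHS of no other rule,
 * sequence or BNF; returns 0 otherwise.
 */
int sequence_lhs_dedicated(const struct rule *rules, size_t n)
{
    size_t i, j;

    assert(rules != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (!rules[i].is_sequence)
            continue;                 /* BNF rules may share an LHS */
        for (j = 0; j < n; j++)
            if (j != i && rules[j].lhs == rules[i].lhs)
                return 0;             /* sequence LHS is reused */
    }
    return 1;
}
```

The example grammar passes this check because SeqChoice, the only repeated LHS, belongs to BNF rules. Giving both sequence rules the same LHS would fail it.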
