

5.12 Ambiguity

In our discussion of evaluation above (see Semantics terms), we spoke of evaluating trees. In fact, the result of a successful Libmarpa parse run is a parse forest. We say that a successful Libmarpa parse run is ambiguous iff the forest it returns contains more than one parse tree. We say that a successful Libmarpa parse run is unambiguous iff the forest it returns contains exactly one parse tree.
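To make the definition concrete, the following is a minimal sketch that counts the parse trees of a small ambiguous grammar. It does not call Libmarpa. The grammar E ::= E '+' E | 'n' and the names count_parse_trees and is_ambiguous are assumptions chosen for illustration only.

```c
#include <assert.h>

/* Toy illustration, not Libmarpa code.  Count the parse trees of
 * n + n + ... + n (with `operands` operands) under the ambiguous
 * grammar  E ::= E '+' E | 'n'. */
unsigned long count_parse_trees(int operands)
{
    unsigned long trees[21] = {0};  /* trees[k]: tree count for k operands */
    int k, left;

    /* The bound keeps every count within a 32-bit unsigned long. */
    assert(operands >= 1 && operands <= 20);

    trees[1] = 1;                   /* a lone 'n' has exactly one tree */
    for (k = 2; k <= operands; k++) {
        /* The top-level '+' splits k operands into `left` operands on
         * its left side and k - left operands on its right side. */
        for (left = 1; left < k; left++)
            trees[k] += trees[left] * trees[k - left];
    }
    return trees[operands];
}

/* A forest is ambiguous iff it contains more than one parse tree. */
int is_ambiguous(unsigned long tree_count)
{
    return tree_count > 1;
}
```

Under this toy grammar, an input with two operands yields a forest of one tree and is unambiguous, while an input with three operands yields a forest of two trees and is ambiguous.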

Most applications care only about one parse tree. An application is free to decide what to do in case a parse forest is ambiguous. Among the application’s options are picking one tree and evaluating that tree; iterating through the parse trees; or treating the ambiguity as an error.
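The three options just listed can be sketched as an application-side policy. This is a hypothetical model, not Libmarpa code: the forest is stood in for by an array in which each element is the value that evaluating one tree would produce, and the names ambiguity_policy and evaluate_forest are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; none are Libmarpa identifiers. */
enum ambiguity_policy { PICK_FIRST, ITERATE_ALL, REJECT_AMBIGUOUS };

/* tree_values[i] stands in for the result of evaluating the i-th tree.
 * Results are written to `out`.  Returns the number of trees evaluated,
 * or -1 if there is no parse, no room for a result, or the policy
 * treats the ambiguity as an error. */
int evaluate_forest(const int *tree_values, size_t tree_count,
                    enum ambiguity_policy policy,
                    int *out, size_t out_cap)
{
    size_t i, n;

    assert(tree_values != NULL && out != NULL);
    if (tree_count == 0 || out_cap == 0)
        return -1;

    switch (policy) {
    case PICK_FIRST:
        /* Pick one tree and evaluate only that tree. */
        out[0] = tree_values[0];
        return 1;
    case ITERATE_ALL:
        /* Iterate through the parse trees, up to the output capacity. */
        n = tree_count < out_cap ? tree_count : out_cap;
        for (i = 0; i < n; i++)
            out[i] = tree_values[i];
        return (int)n;
    case REJECT_AMBIGUOUS:
        /* Treat more than one tree as an error. */
        if (tree_count > 1)
            return -1;
        out[0] = tree_values[0];
        return 1;
    }
    return -1;
}
```

Which policy is appropriate depends on the application. Picking the first tree is only meaningful if the application has some reason to prefer the order in which the trees are produced.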

Libmarpa’s treatment of ambiguity differs somewhat from the traditional one, for two reasons. First, Libmarpa allows ambiguous tokens. Second, Libmarpa prunes nulled subtrees back to their topmost nulled symbol.

An ambiguous token occurs when the same lexeme can be recognized as more than one token symbol. Traditionally, parsers do not allow this, but it is the most natural approach for some real-life applications. For example, natural languages often allow words to be used as more than one part of speech. In this document, the English word “parse” is used as an adjective, a verb, and a noun, and each of these uses is common. Ambiguous tokens are a source of ambiguity not present in traditional parsing. See Ambiguous input.
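As an illustration of the “parse” example, the sketch below models a lexer-side lookup that returns every token symbol a lexeme may stand for. It does not call Libmarpa. The lexicon, the token_symbol enumeration, and the function token_alternatives are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token symbols; not Libmarpa identifiers. */
enum token_symbol { TOK_NOUN, TOK_VERB, TOK_ADJECTIVE };

struct lexicon_entry {
    const char *lexeme;
    enum token_symbol symbol;
};

/* One lexeme may appear with several token symbols. */
static const struct lexicon_entry lexicon[] = {
    {"parse", TOK_NOUN},
    {"parse", TOK_VERB},
    {"parse", TOK_ADJECTIVE},
    {"tree",  TOK_NOUN},
};

/* Store every token symbol that `lexeme` can be recognized as in `out`
 * (at most out_cap of them) and return how many were stored.  A result
 * greater than 1 means the token is ambiguous. */
size_t token_alternatives(const char *lexeme,
                          enum token_symbol *out, size_t out_cap)
{
    size_t i, found = 0;

    assert(lexeme != NULL && out != NULL);
    for (i = 0; i < sizeof lexicon / sizeof lexicon[0]; i++) {
        if (strcmp(lexicon[i].lexeme, lexeme) == 0 && found < out_cap)
            out[found++] = lexicon[i].symbol;
    }
    return found;
}
```

In this model the lexeme "parse" has three alternatives and "tree" has one. A parser that allows ambiguous tokens would be offered all of the alternatives at the same input position, instead of forcing the lexer to choose among them.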

On the other hand, the pruning of nulled subtrees eliminates a source of ambiguity that is present in traditional parsing. Multiple nulled subtrees may share their topmost nulled symbol. When the subtrees are pruned to their shared root, ambiguity is removed. See Nullability.
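The effect of pruning can be seen in a small sketch. It does not call Libmarpa, and it assumes a toy grammar A ::= B | C, B ::= (empty), C ::= (empty), in which a nulled A has two distinct subtrees if the subtrees are kept whole. All names in the code are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RHS 4

/* Symbols of the toy grammar; not Libmarpa identifiers. */
enum { SYM_A, SYM_B, SYM_C, SYM_COUNT };

struct rule {
    int lhs;
    int rhs_len;
    int rhs[MAX_RHS];
};

/* A ::= B | C ;  B ::= (empty) ;  C ::= (empty) */
static const struct rule rules[] = {
    {SYM_A, 1, {SYM_B}},
    {SYM_A, 1, {SYM_C}},
    {SYM_B, 0, {0}},
    {SYM_C, 0, {0}},
};

/* Count the distinct subtrees that derive the empty string from
 * `symbol` when nulled subtrees are kept whole.  Assumes the grammar
 * has no cycles, so the recursion terminates. */
unsigned long unpruned_null_subtrees(int symbol)
{
    unsigned long total = 0;
    size_t i;
    int j;

    assert(symbol >= 0 && symbol < SYM_COUNT);
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        unsigned long ways = 1;
        if (rules[i].lhs != symbol)
            continue;
        /* Every RHS symbol must itself be nulled; an empty RHS counts once. */
        for (j = 0; j < rules[i].rhs_len; j++)
            ways *= unpruned_null_subtrees(rules[i].rhs[j]);
        total += ways;
    }
    return total;
}

/* After pruning back to the topmost nulled symbol, a nullable symbol
 * contributes one leaf, whatever lay beneath it. */
unsigned long pruned_null_subtrees(int symbol)
{
    return unpruned_null_subtrees(symbol) > 0 ? 1 : 0;
}
```

For the toy grammar, keeping the subtrees whole gives two ways to null A, one through B and one through C. Once both are pruned back to their shared root A, only one remains, so this source of ambiguity is gone.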

