In traditional Earley parsers, the concept of location is very simple. Locations are numbered from 0 to n, where n is the length of the input. Every location has an Earley set, and vice versa. Location 0 is the start location. Every location after the start location has exactly one input token associated with it.
Some applications do not fit this traditional input model — natural language processing requires ambiguous tokens, for example. Libmarpa allows a wide variety of alternative input models.
This document assumes that the reader knows the concepts behind Libmarpa's
alternative input models, either from the documentation
of a higher level interface, such as
or from Marpa’s
As a reminder, in Libmarpa a location is called an earleme. The number of an Earley set is the ID of the Earley set, or its ordinal. In the traditional model, the ordinal of an Earley set and its earleme are always exactly the same, but in Libmarpa they can differ.
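
The difference between ordinals and earlemes can be illustrated with a small toy model. The sketch below is not Libmarpa code and does not call the Libmarpa API. The class `ToyRecognizer` and its attributes are hypothetical, and its two methods are only loosely named after the recognizer operations for reading a token and completing an earleme. The model assumes that every token starts at the current earleme, that completing an earleme advances the current earleme by one, and that an Earley set is created at an earleme only if at least one token ends there.

```python
class ToyRecognizer:
    """Toy model of locations only; it does no parsing (hypothetical class)."""

    def __init__(self):
        self.current_earleme = 0
        # Index is the Earley set ordinal, value is that set's earleme.
        # Earley set 0 always sits at earleme 0, the start location.
        self.set_earlemes = [0]
        self._token_ends = set()

    def alternative(self, length=1):
        """Read a token that starts at the current earleme and spans `length` earlemes."""
        if length < 1:
            raise ValueError("token length must be at least 1")
        self._token_ends.add(self.current_earleme + length)

    def earleme_complete(self):
        """Advance one earleme; create an Earley set only if a token ends here."""
        self.current_earleme += 1
        if self.current_earleme in self._token_ends:
            self.set_earlemes.append(self.current_earleme)


# Traditional model: exactly one token of length 1 per location.
traditional = ToyRecognizer()
for _ in range(3):
    traditional.alternative(1)
    traditional.earleme_complete()

# Alternative model: a single token that spans three earlemes.
variable = ToyRecognizer()
variable.alternative(3)
for _ in range(3):
    variable.earleme_complete()

print(traditional.set_earlemes)  # [0, 1, 2, 3]
print(variable.set_earlemes)     # [0, 3]
```

In the first run every ordinal equals its earleme, as in the traditional model. In the second run the Earley set with ordinal 1 is at earleme 3, and earlemes 1 and 2 have no Earley set under the assumptions of this toy model.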