Ocean of Awareness

Jeffrey Kegler's blog about Marpa, his new parsing algorithm, and other topics of interest

Jeffrey's personal website

Google+

Marpa resources

The Marpa website

The Ocean of Awareness blog: home page, chronological index, and annotated index.

Sun, 07 Aug 2011


Changes to The Way Marpa Orders Parse Results

This post is for users of Marpa's Constant Ranking Method. You are using the Constant Ranking Method if you specify the "ranking_method" named argument of a Marpa recognizer, with the value "constant". If you're not using the Constant Ranking Method, you can stop reading here.

Marpa::XS 0.008000 is the last release that will support the Constant Ranking Method. In future releases of Marpa, Marpa::PP, and Marpa::XS, the Constant Ranking Method may be removed. At a minimum, it will behave differently at the interface level.

Marpa is alpha but previously, whenever I've changed the documented behavior of an interface, I have kept backward compatibility. As alpha development ends and I approach a beta release, I am forced to be more ruthless. I will be changing or eliminating the Constant Ranking Method, and duplicating its exact semantics for backward compatibility is simply too difficult.

By setting the ranking method to "constant", users have been able to control the order in which the parse results of an ambiguous parse are returned. The Constant Ranking Method never offered more than limited capabilities for ordering parses. Marpa applications which needed a general capability for ranking parses have always needed to produce all the parses, with ranking information included, and to sort them at the application level, outside of Marpa.

Many applications using ambiguous grammars are quite happy to get the parse results in arbitrary order. That was, and remains, Marpa's default behavior.

Embedding a fully general parse-ranking logic in the evaluator would be high cost and have no added value --- ordering the parses inside Marpa's evaluation logic would have no advantage over sorting the parses at the application level. The Constant Ranking Method was an attempt at a compromise. The intention was that it would provide a set of features that

  1. Was flexible enough for many applications;
  2. Added value, in the sense that it was more efficient than the user doing the same thing herself at the application level;
  3. Had little cost for users who were not using it.

But the Constant Ranking Method, as currently implemented, overreaches. It is reasonably flexible, but the code for it is grotty, and its efficiency advantages are debatable. Facing a C conversion of the code, I realize that the right place for much or all of the current Constant Ranking Method logic is the bit bucket.


posted at: 11:28 | direct link to this entry

§         §         §