Parsers and Useful Power

Mon, 14 May 2018

Parsers and Useful Power

What do parser users want? What makes a parser[1] successful? In this post I will look at one aspect of that question, in light of an episode in the history of parsing.

Irons 1961

The first paper fully describing a parser was Irons 1961[2]. The Irons parser was what is called "general", meaning that it can parse all of the "context-free grammars". That makes it far more powerful than most parsers in practical use today.

But the Irons algorithm was not always fast in the general case. Irons 1961 used backtracking to achieve its power, so it would go exponential for many useful grammars.

Among the grammars Irons 1961 could not parse quickly were those containing the all-important arithmetic expressions. Irons 1961 gave way to recursive descent.

Recursive descent (RD) in its pure form, could not parse arithmetic expressions at all, but it could be customized with procedural code. That is, it could call specialized parsers which were reliably fast for specific sections of the input. The Irons parser was declarative, and not easy to cusomtize.

Raw power versus useful power

The contest between Irons parsing and recursive descent took place before the theory for analyzing algorithms was fully formed.[3] In retrospect, we can say that, except in specialized uses, an acceptable parser for most practical uses must be linear or quasi-linear.[4] That is, the "useful power" of a parser is the class of grammars that it will parse in quasi-linear time.[5]

Useful power turns out to be more important, in practice, than raw power. Recursive descent won out over the Irons algorithm because, while the Irons algorithm had vastly more raw power, RD had slightly more "useful power".

It is nice to have raw power as well -- it means an algorithm can take on some specialized tasks. And raw power provides a kind of "soft failure" debugging mode for grammars with, for example, unintended ambiguities. But, in the eyes of the programming community, the more important measure of a parser is its useful power -- the class of grammars that it will parse at quasi-linear speed.

Stating the obvious?

That useful power is more important than raw power may seem, in retrospect, obvious. But in fact, it remains a live issue. In practice raw power and useful power are often confused. The parsing literature is not always as helpful as it could be: it can be hard to determine what the useful power of an algorithm is.

And the Irons experiment with raw power is often repeated, in hopes of a different result. Very often, a new algorithm is a hybrid of two others: an algorithm with a lot of raw power, but which can go quadratic or worse; and a fast algorithm which lacks power. When the power of the fast algorithm fails, the hybrid algorithm switches over to the algorithm with raw power.

It is a sort of cross-breeding of algorithms. The hope is that the hybrid algorithm has the best features of each of its parents. This works a lot better in botany than it does in parsing. Once you have a successful cross in a plant, you can breed from the successful hybrid and expect good things to happen. In botany, the individual crosses can have an extremely high failure rate, and cross-breeding can still succeed. But it's different when you cross algorithms: Even after you've succeeded with one parse, the next parse from your hybrid is a fresh new toss of the dice.

References, comments, etc.

To learn about my own parsing project, Marpa[6], there is the semi-official web site, maintained by Ron Savage. The official, but more limited, Marpa website is my personal one. Comments on this post can be made in Marpa's Google group, or on our IRC channel: #marpa at freenode.net.

Footnotes

1. By "parser" in this post, I will mean a programmer's most powerful toolbox parser -- what might be called the "flagship" parser. No parser will ever be the right one for all uses. ↩

2. For the reference to Irons, see V3 of my "Parsing: A Timeline". The "Timeline" contains the background material for this post. ↩

3. Even the term "analysis of algorithms" did not exist until 1969: see Knuth, "Recent News". ↩

4. For more about "linear" and "quasi-linear", including definitions, see V3 of my "Parsing: A Timeline", in particular its 'Term: linear' section. ↩

5. While it is clearly the consensus among practitioners and theoreticians that, for parsing, practical time is quasi-linear or better, there are those who argue that worse-than-quasi-linear parsers are often the right ones for the job, and that research on them has been unwisely neglected. The dissenters are not without a case: For example, in natural language, while sentences are in theory infinite in length, in practice their average size is fixed. And while very long difficult-to-parse sentences do occur in some texts, such as older ones, it is normal for a human reader to have to spend extra time on them. So it may be unreasonable to insist that a parsing algorithm be quasi-linear in this application. ↩

6. Marpa's useful power is LR-regular, which properly contains every class of grammar in practical use: regular expressions, LALR, LL(k) for all k, LR(k) for all k, and the LL-regular grammars. ↩

posted at: 06:01 | direct link to this entry

§ § §

Ocean of Awareness

Marpa resources

Mon, 14 May 2018