Ocean of Awareness

Jeffrey Kegler's blog about Marpa, his new parsing algorithm, and other topics of interest

Jeffrey's personal website


Marpa resources

The Marpa website

The Ocean of Awareness blog: home page, chronological index, and annotated index.

Fri, 14 Sep 2012

Parsing on your new hyper-quantum computer

If you want to build a ship, don't drum up the men to gather wood, divide the work and give orders. Instead, teach them to yearn for the vast and endless sea. -- Antoine de Saint-Exupery

Imagine that, the day the new machine arrives, you are maintaining a parser. Your parser is the current state of the art -- hand-written recursive descent.

The new computer will replace your staid old von Neumann box is not just a quantum computer. It's fully non-deterministic. You can superposition any time you'd like, and then "unsuperposition" to restart. And when superpositioning, you can can examine all the possibilities, not just one.

How would you rewrite your recursive descent logic to take advantage of this new hyper-quantum computer? Actually, this is exactly the same question that Marpa poses to you today. Because for all classes of grammar in practical use, including the LL(k) grammars parseable by recursive descent, Marpa is efficient non-determinism.

Your new hyper-quantum computer might seem at first to make your work as a programmer harder. On the hyper-quantum computer, many things are happening at once. On the deterministic box, you were dealing with a single procedural thread.

But as you get used to the new non-deterministic computer, you find ways in which it makes things easier. On the deterministic box, you only needed to follow a single thread, but you often needed to make that thread deal with multiple alternatives. To do this, you had to create state information and keep track of it yourself while backtracking. The complexity of dealing with all this roll-your-own state information severely limited the kinds of grammar that you could parse, and even the extent to which you understood exactly what your parser would and would not accept. Since much of the time on your old parser was spent backtracking, you no longer had a real handle on the time complexity in many sections of your code. In a couple of previous releases, minor changes had let the backtracking get out of hand.

The hyper-quantum computer now comes up with the parsing alternatives for you. True, you have to retrain yourself to think in terms of the alternatives available at any point. But you don't even have to know which alternatives are there -- if you need to ask the hyper-quantum computer (or Marpa), it can tell you.

You do have to learn to ask the right questions at the right places. Once you do this, you have a simpler and more cleanly written parser, running at comparable or faster speeds. And you begin to think about a few changes to the language, changes that will make your language more natural and expressive, but one which you could not have parsed before.

posted at: 21:38 | direct link to this entry

§         §         §