Ocean of Awareness

Jeffrey Kegler's blog about Marpa, his new parsing algorithm, and other topics of interest

Sat, 10 Mar 2012


Making the parsing game safe

In previous posts, I've talked about Marpa as an alternative to other parsers. In this one, I want to talk about Marpa as an alternative for problems where parsing has been avoided.

Because parsing HAS been avoided in the past. And for good reason. If you were drawn by the allure of domain-specific languages, or yielded to the siren call of language-oriented programming, you plunged headlong toward two pitfalls:

By approaching your problem as ANYTHING but a parsing problem, you avoided these two pitfalls. Ambitious programmers, after a few encounters with the traditional parsing tools, would learn this. And the next time they dreamed up an elegant little DSL to finesse their design issues, they would wake themselves up and decide that they ain't that desperate yet.

Changing the parsing game

With Marpa in the parsing game, the rules are different. Now, anything you can write in BNF will parse. If your grammar falls into anything close to one of the classes of grammar currently in practical use, Marpa parses in linear time. If there's a problem, Marpa tells you exactly what it was looking for and why it was looking for it.
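
To make those claims a little more concrete, here is a minimal sketch of an Earley-style recognizer in Python. It is not Marpa and does not use Marpa's API; Marpa builds on Earley's algorithm with further improvements that this sketch leaves out, and the sketch makes no attempt at Marpa's performance. The names `recognize`, `Item`, and `GRAMMAR` are invented for illustration, and the sketch assumes a grammar with no empty rules. What it does show is the general idea: a left-recursive, ambiguous grammar is accepted as written, and when the input fails, the recognizer can say where it stopped and which terminals it was looking for.

```python
from collections import namedtuple

# An Earley item: the rule "lhs -> rhs", a dot position within rhs,
# and the input position (origin) where this item was first predicted.
Item = namedtuple("Item", "lhs rhs dot origin")


def recognize(grammar, start, tokens):
    """Return (True, None) if `tokens` is a sentence of `grammar`.

    Otherwise return (False, (position, expected)), where `expected` is
    the sorted list of terminals that would have been acceptable there.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples).
    Any symbol that is not a key of `grammar` is treated as a terminal.
    Assumption: no right-hand side is empty.
    """
    sets = [[] for _ in range(len(tokens) + 1)]

    def add(i, item):
        if item not in sets[i]:
            sets[i].append(item)

    for rhs in grammar[start]:
        add(0, Item(start, rhs, 0, 0))

    for i in range(len(tokens) + 1):
        expected = set()
        for item in sets[i]:  # the list may grow while we walk it
            if item.dot < len(item.rhs):
                sym = item.rhs[item.dot]
                if sym in grammar:
                    # Predict: start every rule for the wanted nonterminal.
                    for rhs in grammar[sym]:
                        add(i, Item(sym, rhs, 0, i))
                else:
                    # Scan: record the wanted terminal, advance on a match.
                    expected.add(sym)
                    if i < len(tokens) and tokens[i] == sym:
                        add(i + 1, item._replace(dot=item.dot + 1))
            else:
                # Complete: advance every item that was waiting for item.lhs.
                for parent in sets[item.origin]:
                    if (parent.dot < len(parent.rhs)
                            and parent.rhs[parent.dot] == item.lhs):
                        add(i, parent._replace(dot=parent.dot + 1))
        if i < len(tokens) and not sets[i + 1]:
            # No item could consume tokens[i]: report what was wanted.
            return False, (i, sorted(expected))

    finished = any(
        it.lhs == start and it.dot == len(it.rhs) and it.origin == 0
        for it in sets[-1]
    )
    if finished:
        return True, None
    # Input ended too early: report what could have come next.
    return False, (len(tokens), sorted(expected))


# A hypothetical grammar that is both left-recursive and ambiguous.
GRAMMAR = {"E": [("E", "+", "E"), ("E", "*", "E"), ("n",)]}

ok, _ = recognize(GRAMMAR, "E", ["n", "+", "n", "*", "n"])
bad, where = recognize(GRAMMAR, "E", ["n", "+", "+"])
```

With the grammar above, the first call succeeds even though the grammar was never rewritten to remove left recursion or ambiguity. The second call fails at the second `+`, and `where` carries the failing position together with the terminals that would have been accepted there.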

The Interpreter pattern, domain-specific languages, and language-oriented programming are all immensely powerful techniques. Almost any problem CAN be seen as the domain of a language. In practice, less powerful techniques are often a better fit. And with the traditional language-writing tools, it was a rare problem indeed for which a DSL was seen to justify the risk and effort.

Since it was so hard to create a new language, reuse of languages was emphasized instead. We've gotten used to the idea of leveraging existing languages, even ones which are a very poor fit to our problems, because the alternative was even worse.

More and better DSLs could breathe new life into our programming tools and our programming methods. And now that the parsing game has become easier, DSLs are within reach in cases where they were not before.

Note

"parsing has been avoided": For the purposes of this post, I do not include regular expressions as "parsing solutions". This post focuses on languages, and the term "language" is usually avoided when describing anything within the restricted syntax accepted by regular expressions. However, regular expressions do largely avoid the pitfalls described in this post, and that goes a long way to explain their popularity.


posted at: 19:49
