Ocean of Awareness

Jeffrey Kegler's blog about Marpa, his new parsing algorithm, and other topics of interest

Jeffrey's personal website


Marpa resources

The Marpa website

The Ocean of Awareness blog: home page, chronological index, and annotated index.

Fri, 07 Oct 2011

Perl and Parsing 11: Are all Perl programs parseable?

Before going further in this series, I want to touch on a question I have so far avoided: Are all Perl programs parseable?

Most languages do not pose this question. It was Adam Kennedy, in creating PPI, who first ran up against it. Adam conjectured that in fact Perl parsing may be undecidable. While in the planning stages for Marpa, I found Adam's conjecture and turned it into a formal proof.

Being the first to formalize this result, I took the initial heat over it, and now I get the credit. The dust has pretty much settled, and to my mild surprise, the result has proved of interest to the academic community. The initial revulsion at the idea of undecidable parsing has subsided, even to the extent that others now want to get into the act. I read on Wikipedia that LISP parsing is also undecidable. So all the cool kids seem to be doing it.

Undecidability is actually something we are quite comfortable with -- all the general-purpose CPU chips we use have this property, and we put up with it as the price for their power. In the case of general-purpose CPU chips, the downside is real and occasionally serious. Some of the things our computers cannot do are reasonable things for us to want them to do. As one example, infinite loop detection can never be 100% accurate. As another, very serious issue, no program will ever be able to determine with 100% accuracy whether an arbitrary program is or is not a virus. Virus detection must always involve a degree of guesswork.

The undecidability of Perl parsing differs from the usual situation in one major respect: In return for the additional power of Perl's parser, we seem to be giving up absolutely nothing. Nobody has ever been able to show a practical downside.

True, as I have proved, there are Perl programs that no parser, including the perl interpreter itself, can parse. But all the examples that anyone, including me, has been able to come up are contrived cases. None of them are Perl programs that someone might care about as a practical matter.

In future posts, I plan to discuss various strategies for parsing Perl. I will not mention the undecidability issue again, unless it has some relevance in context -- something that will seldom happen.


  1. "this series": Previous posts in this series have been

posted at: 21:51 | direct link to this entry

§         §         §