Thursday, April 28, 2005

Writing scanners and parsers in Python

In writing Noodle, I've spent a good deal of time looking around for decent Python-oriented scanner and parser tools and generators. I ended up writing my own scanner (since my needs were not very complex) and using yacc.py standalone from the PLY project.

Since then, my friend Travis Hartwell has pointed me to two things: a discussion of the undocumented and "experimental" sre.Scanner in the Python standard library, which looks perfect, and a summary of Python parsing tools, which I somehow missed completely in my search.
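For a sense of what the scanner class offers, here is a minimal sketch. It assumes a current Python, where the same class is reachable as re.Scanner (the sre module name used above is the 2005 spelling). The class is still undocumented, so treat the details as subject to change. The token names and the tiny lexicon are made up for illustration and have nothing to do with Noodle's real syntax.

```python
import re

# Each lexicon entry pairs a regex with a callback(scanner, matched_text).
# A callback of None means the matched text is silently dropped.
# The token names here are illustrative only.
lexicon = [
    (r"\d+", lambda scanner, text: ("NUMBER", int(text))),
    (r"[A-Za-z_]\w*", lambda scanner, text: ("NAME", text)),
    (r"[-+*/=()]", lambda scanner, text: ("OP", text)),
    (r"\s+", None),  # skip whitespace
]

scanner = re.Scanner(lexicon)

# scan() returns the list of callback results plus whatever input
# was left over when no pattern matched.
tokens, remainder = scanner.scan("x = 3 + 42")
print(tokens)
print(repr(remainder))
```

If the input contains something no pattern matches, scanning stops there and the unconsumed text comes back as the second element, so a non-empty remainder is the signal for a lexical error.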

Through that summary, I came to PyBison, which looks more than a little interesting, and to which I will certainly be migrating, barring unforeseen difficulties. It's under the GPL, and I plan to release Noodle under an MIT-style license, but I would expect that the generated code need not be GPL'd. That will depend on the author's wishes and on how much nontrivial glue code is emitted with the generated output.

As far as I understand, PyBison uses Python docstrings to create input to bison, runs bison, and puts in some extra glue to get results available to Python again. That may save me a lot of work, since I was planning to do the bison stuff directly.
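The docstring idea can be sketched in a few lines. This is only an illustration of the general technique and not PyBison's actual API: the class name, the on_ method prefix, the collect_rules helper, and the toy grammar are all assumptions made up for the example, and nothing here runs bison.

```python
import inspect


class CalcGrammar:
    """Hypothetical grammar holder; not PyBison's real base class."""

    def on_expr(self):
        """expr : expr PLUS term | term"""

    def on_term(self):
        """term : NUMBER"""


def collect_rules(cls, prefix="on_"):
    """Gather grammar rules from the docstrings of methods named prefix*."""
    rules = []
    # getmembers returns members sorted by name, so the order is stable.
    for name, func in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith(prefix) and func.__doc__:
            # Terminate each rule with ';' in the style of a yacc grammar.
            rules.append(func.__doc__.strip() + " ;")
    return "\n".join(rules)


grammar_text = collect_rules(CalcGrammar)
print(grammar_text)
```

A real tool would go further: write the collected rules into a grammar file, run bison on it, and wire the resulting parser back to the Python methods.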




Next Monday, Novell's Open Source Review Board will meet, with Noodle on the agenda. I work at Novell and want to make sure the ownership status of Noodle is clear before releasing anything.
