ANTLR is a powerful tool for constructing recognizers, interpreters, compilers, and translators from grammatical descriptions containing actions in a variety of target languages. An important feature of ANTLR is its sophisticated grammar development environment,
ANTLRWorks, which helps a lot when debugging grammars.
I got started with ANTLR by building a Jinja-like template engine for Java, which I called Jinja4j.
Jinja is one of the most widely used template engines for Python. It is inspired by Django's templating system but extends it with an expressive language that gives template authors a more powerful set of tools. On top of that, it adds sandboxed execution and optional automatic escaping for applications where security is important.
The Jinja4j grammar can be found
here. The project is open source, and pull requests for improving the template engine are welcome. The source code can be found on
GitHub.
Debugging with ANTLRWorks
While working on the grammar in ANTLRWorks, I kept getting console errors such as
NoViableAltException and "no start rule (no rule can obviously be followed by EOF)". When I googled the error message, I found an
interesting post that explains the problem. ANTLR tries to identify the "start" rules (the ones that can end with EOF) by looking for rules that are not used anywhere else in the grammar. So if the "start" rule is recursive, it is referenced by itself, ANTLR finds no start rule, it reports
NoViableAltException, and the debugging session fails. To fix the problem, I added this rule:
page : html EOF;
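To make the heuristic described above concrete, here is a minimal Java sketch. It is a toy model, not ANTLR's actual implementation: it treats a grammar as a map from each rule name to the rule names it references and reports the rules nobody references. The rule names `block` and `expr` and the class `StartRuleFinder` are made up for illustration; only `html` and `page` come from the grammar discussed here.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of the "unreferenced rule" heuristic; not ANTLR's real code.
final class StartRuleFinder {

    // grammar maps each rule name to the rule names used in its body.
    // A rule is a start candidate if no rule body references it.
    static Set<String> findStartCandidates(Map<String, List<String>> grammar) {
        Set<String> candidates = new LinkedHashSet<>(grammar.keySet());
        for (List<String> referenced : grammar.values()) {
            candidates.removeAll(referenced);
        }
        return candidates;
    }

    // Hypothetical grammar in which the intended start rule "html" is recursive.
    static Map<String, List<String>> recursiveGrammar() {
        Map<String, List<String>> g = new LinkedHashMap<>();
        g.put("html", List.of("block", "html")); // html references itself
        g.put("block", List.of("expr"));
        g.put("expr", List.of());
        return g;
    }

    // Same grammar plus the explicit start rule: page : html EOF;
    static Map<String, List<String>> fixedGrammar() {
        Map<String, List<String>> g = recursiveGrammar();
        g.put("page", List.of("html"));
        return g;
    }

    public static void main(String[] args) {
        // Every rule is referenced somewhere, so no candidate is left.
        System.out.println(findStartCandidates(recursiveGrammar())); // prints []
        // "page" is referenced by nobody, so it is the only candidate.
        System.out.println(findStartCandidates(fixedGrammar()));     // prints [page]
    }
}
```

In this model the extra `page` rule works because it is the only rule that no other rule refers to, which is what makes it recognizable as the entry point.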
Another kind of problem I ran into was
mismatch errors. These happen when, for a given piece of input, more than one possible token can be created. ANTLR prints the identifiers of the mismatched tokens, which can be looked up in the
output/grammar_name.tokens file.
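Looking identifiers up by hand gets tedious, so a small helper can do it. The sketch below assumes that the reported identifiers are numeric token types and that each line of the .tokens file has the form NAME=number (for example `ID=5` or `'if'=6`). The class `TokensFileReader` and the token names in the sample are hypothetical, not taken from the Jinja4j grammar.

```java
import java.util.Map;
import java.util.TreeMap;

// Minimal sketch: maps numeric token types back to their names.
final class TokensFileReader {

    // Parses the text of a .tokens file, assuming one NAME=number entry per line.
    static Map<Integer, String> parse(String contents) {
        Map<Integer, String> namesByType = new TreeMap<>();
        for (String line : contents.split("\\R")) {
            line = line.trim();
            // Split on the LAST '=' because a literal such as '=' contains one itself.
            int eq = line.lastIndexOf('=');
            if (line.isEmpty() || eq < 0) {
                continue; // skip blank or malformed lines
            }
            int type = Integer.parseInt(line.substring(eq + 1).trim());
            String name = line.substring(0, eq);
            // If several names share a type, keep the first one seen.
            namesByType.putIfAbsent(type, name);
        }
        return namesByType;
    }

    public static void main(String[] args) {
        // Hypothetical file contents, for illustration only.
        String sample = "VARIABLE_START=4\nID=5\n'if'=6\n'='=7\n";
        Map<Integer, String> names = parse(sample);
        // Translate the two numbers from a mismatch message into token names.
        System.out.println(names.get(5) + " vs " + names.get(6)); // prints ID vs 'if'
    }
}
```

To use it on a real grammar, read the file into a string first, for example with `java.nio.file.Files.readString`, and pass the result to `parse`.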