viernes, 8 de enero de 2016

dabbling with metacompilers

Lately, I've been reading about metacompilers, and I have to say it's a really impressive piece of technology.  A compiler that can read high level descriptions of grammars to generate other compilers, and it can generate itself. And once you have a description of itself, you can keep tweaking both syntax and semantics using a two step compilation.

It all started in this page when I saw what seemed a fine tutorial with some lua code as example.  I read the code a few times and the amusement was bigger the more I understood what was all that about.  There are very few resources on this technique on the web, so the chances of having to understand everything by myself were big. Btw, the original paper from Schorre is here

Big plans

 While practicing with it I tried to write a parser for lua, because , as you know, lua syntax is quite simple. The first problem was that most syntax descriptions out there are in ebnf... So I thought I should use metaII to write a stepping stone compiler that would undesrtand ebnf, and then feed it the lua syntax. And then I would be happy and have my utterly useless lua parser.

Problems 

MetaII has its own problems, like no backtracking, and really poor error handling. so your parsing either fails or succeeds, but you have notmuch info where or why....

The no backtracking issue is a big one, as ebnf syntax is difficult to convert to a dfa-like grammar. I kept falling into infinite left recursions and dying out of 'stack limit reached'.

Slow Start

So I wiped everything and went back to the basics and started by doing really stupid changes to the metaII syntax. For now I've added comments to it. And I still have the ebnf branch 'alive', so we'll see if I can manage to do something with it.


It's nice I only had to add support for .line, comment and accept comment in place of a rule. It's worth noting that the syntax approach of the compiler makes it easy to have comments in place of full rules, but it would be more complex to have 'line oriented' parsing rules instead of syntax ones. Btw, the only change in the runtime is to add
 local function parseCMT() return read(match("^[^\n]*")) end

More reading

Since I started this adventure, I've read Alessandro Warth Phd about Ometa (I heard about it many many times, but now finally I understand it), and read this metacompilers tutorial on and off. So even with the not-so-much-success situation, the learning is there :)

The end?

Probably no, but I wanted just to write some of the progress in case anyone wants to join me in the quest.