Extracting a class from a PetitParser script

TL;DR

Although it is convenient to rapidly develop a parser as a script using PetitParser rules, at some point it becomes necessary to turn the script into a class, both to support regression testing and to deploy the parser for other purposes. We show how to automate this for the PetitParser SPL case study.

The complete SPL grammar as a PetitParser script

Here we see the complete parser for SPL as a script. Note that the rules for expression, unary and statement are first defined as instances of PP2UnresolvedNode, and then later redefined recursively with def:.

comment := '//' asPParser , #newline asPParser negate star.
ignorable := (comment / #space asPParser) star.

boolean := ('true' asPParser trim: ignorable)
		/ ('false' asPParser trim: ignorable).

integer := #digit asPParser plus , $. asPParser not trim: ignorable.

float := $- asPParser optional , #digit asPParser plus , $. asPParser
		, #digit asPParser plus trim: ignorable.

number := integer / float.

string := $" asPParser , $" asPParser negate plus , $" asPParser.

keyword := ('var' asPParser , #letter asPParser not)
		/ ('if' asPParser , #letter asPParser not)
		/ ('else' asPParser , #letter asPParser not)
		/ ('while' asPParser , #letter asPParser not)
		/ ('true' asPParser , #letter asPParser not)
		/ ('false' asPParser , #letter asPParser not)
		/ ('and' asPParser , #letter asPParser not)
		/ ('or' asPParser , #letter asPParser not).

identifier := keyword not , #letter asPParser , #word asPParser star.

"recursively defined rules"

expression := PP2UnresolvedNode new.

varDecl := (('var' asPParser trim: ignorable) , identifier
		, (($= asPParser trim: ignorable) , expression) optional
		, ($; asPParser trim: ignorable))
		==> [ :node | 
			node third
				ifNil: [ SPLDeclaration for: node second ]
				ifNotNil: [ SPLInitializedDeclaration for: node second with: node third second ] ].

parenthesizedExpression := ($( asPParser trim: ignorable) , expression
		, ($) asPParser trim: ignorable).

primary := parenthesizedExpression / boolean / number / string / identifier.

unary := PP2UnresolvedNode new.

negatedUnary := ($! asPParser trim: ignorable) / ($- asPParser trim: ignorable)
		, unary.

unary def: negatedUnary / primary.

factor := unary
		, (($/ asPParser trim: ignorable) / ($* asPParser trim: ignorable) , unary) star.

term := factor
		, (('-' asPParser trim: ignorable) / ('+' asPParser trim: ignorable) , factor)
				star.

comparison := term
		, (('>=' asPParser trim: ignorable) / ('>' asPParser trim: ignorable)
				/ ('<=' asPParser trim: ignorable) / ('<' asPParser trim: ignorable) , term)
				star.

equality := comparison
		, (('!!=' asPParser trim: ignorable) / ('==' asPParser trim: ignorable)
				, comparison) star.

logicAnd := equality , (('and' asPParser trim: ignorable) , equality) star.

logicOr := logicAnd , (('or' asPParser trim: ignorable) , logicAnd) star.

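"expression is defined as assignment below; using it here avoids referring to assignment before it has a value"
assignmentExpression := (identifier trim: ignorable)
		, ($= asPParser trim: ignorable) , expression.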

assignment := assignmentExpression / logicOr.

expression def: assignment.

exprStmt := expression , ($; asPParser trim: ignorable).

printStmt := ('print' asPParser trim: ignorable) , expression
		, ($; asPParser trim: ignorable).

statement := PP2UnresolvedNode new.

declaration := varDecl / statement.

ifStmt := ('if' asPParser trim: ignorable) , ($( asPParser trim: ignorable)
		, expression , ($) asPParser trim: ignorable) , statement
		, (('else' asPParser trim: ignorable) , statement) optional.

whileStmt := ('while' asPParser trim: ignorable)
		, ($( asPParser trim: ignorable) , expression , ($) asPParser trim: ignorable)
		, statement.

block := ('{' asPParser trim: ignorable) , declaration star
		, ($} asPParser trim: ignorable).

statement def: ifStmt / printStmt / whileStmt / exprStmt / block.

program := declaration star end.

  

We can test that this will parse a simple SPL program.

program parse: '// My first SPL program
var hello = "Hello world";
print hello;'
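
The recursive-rule pattern used in the script can also be tried in isolation. The following is a minimal sketch, not part of the SPL grammar: the rule names nested and wrapped are invented for illustration, and it assumes that parse: answers a PP2Failure when parsing fails, which can be checked with isPetit2Failure.

```smalltalk
| nested wrapped |
"Hypothetical toy grammar: nested ::= '(' nested ')' | 'x'"

"1. Create a placeholder so that other rules can refer to it."
nested := PP2UnresolvedNode new.

"2. Use the placeholder inside another rule."
wrapped := $( asPParser , nested , $) asPParser.

"3. Resolve the placeholder, closing the recursive cycle."
nested def: wrapped / $x asPParser.

"Answers true if the whole input is accepted by the recursive rule."
(nested end parse: '((x))') isPetit2Failure not
```

The placeholder has to exist before any rule mentions it; otherwise that rule would capture an uninitialized variable rather than a parser.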
  

Extracting a parser class from a script

Normally we would turn the parser into a class at an earlier stage to enable Example-driven development (essentially TDD using example methods). Instead, we have developed the full parser as a script, so we will also turn our scripted tests into examples.

To turn the script into a class, we need:

1. a class MyParser that subclasses PP2CompositeNode

2. for every grammar rule x in the script, there must be both a method called x returning that parser, and a slot named x that will store an instance of the parser

3. there must be a start rule and slot for the root of the grammar.
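
Done by hand, the three requirements above might look like the following sketch for a hypothetical two-rule grammar. The class name MyParser comes from the list above; the rule names number and sum, the package name, and the variable parserClass are invented for illustration. The methods are installed with compile: only so that the sketch can be evaluated as a script; normally they would be written in a code browser.

```smalltalk
| parserClass |
"1. A subclass of PP2CompositeNode with one slot per grammar rule."
parserClass := PP2CompositeNode subclass: #MyParser
	instanceVariableNames: 'number sum'
	classVariableNames: ''
	package: 'MyParser-Sketch'.

"2. One method per rule, named like its slot.
Rules refer to each other through the slots, not by calling the methods."
parserClass compile: 'number
	^ #digit asPParser plus flatten trim'.
parserClass compile: 'sum
	^ number , ($+ asPParser trim , number) star'.

"3. A start method that answers the root of the grammar."
parserClass compile: 'start
	^ sum'.

"Answers true if the input is accepted."
(parserClass new parse: '1 + 2 + 3') isPetit2Failure not
```

Because rules reach each other through slots, their definition order no longer matters, which is why the unresolved placeholders needed in the script become unnecessary in a class.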

This can be created manually, but the class can also be extracted automatically from the script. However, the script must only contain rule definitions, and there may be no references to PP2UnresolvedNode or any other external variables or classes.

This is relatively easy to do. First we evaluate the original version of the script, to initialize all rule variables. Then we copy the script, comment out or remove the lines that initialize recursive rules to PP2UnresolvedNode, and replace the def: sends by assignments.

Here's the result for our script:

comment := '//' asPParser , #newline asPParser negate star.
ignorable := (comment / #space asPParser) star.

boolean := ('true' asPParser trim: ignorable)
		/ ('false' asPParser trim: ignorable).

integer := #digit asPParser plus , $. asPParser not trim: ignorable.

float := $- asPParser optional , #digit asPParser plus , $. asPParser
		, #digit asPParser plus trim: ignorable.

number := integer / float.

string := $" asPParser , $" asPParser negate plus , $" asPParser.

keyword := ('var' asPParser , #letter asPParser not)
		/ ('if' asPParser , #letter asPParser not)
		/ ('else' asPParser , #letter asPParser not)
		/ ('while' asPParser , #letter asPParser not)
		/ ('true' asPParser , #letter asPParser not)
		/ ('false' asPParser , #letter asPParser not)
		/ ('and' asPParser , #letter asPParser not)
		/ ('or' asPParser , #letter asPParser not).

identifier := keyword not , #letter asPParser , #word asPParser star.

varDecl := (('var' asPParser trim: ignorable) , identifier
		, (($= asPParser trim: ignorable) , expression) optional
		, ($; asPParser trim: ignorable))
		==> [ :node | 
			node third
				ifNil: [ SPLDeclaration for: node second ]
				ifNotNil: [ SPLInitializedDeclaration for: node second with: node third second ] ].

parenthesizedExpression := ($( asPParser trim: ignorable) , expression
		, ($) asPParser trim: ignorable).

primary := parenthesizedExpression / boolean / number / string / identifier.

unary := negatedUnary / primary.

negatedUnary := ($! asPParser trim: ignorable) / ($- asPParser trim: ignorable)
		, unary.

factor := unary
		, (($/ asPParser trim: ignorable) / ($* asPParser trim: ignorable) , unary) star.

term := factor
		, (('-' asPParser trim: ignorable) / ('+' asPParser trim: ignorable) , factor)
				star.

comparison := term
		, (('>=' asPParser trim: ignorable) / ('>' asPParser trim: ignorable)
				/ ('<=' asPParser trim: ignorable) / ('<' asPParser trim: ignorable) , term)
				star.

equality := comparison
		, (('!!=' asPParser trim: ignorable) / ('==' asPParser trim: ignorable)
				, comparison) star.

logicAnd := equality , (('and' asPParser trim: ignorable) , equality) star.

logicOr := logicAnd , (('or' asPParser trim: ignorable) , logicAnd) star.

assignmentExpression := (identifier trim: ignorable)
		, ($= asPParser trim: ignorable) , assignment.

assignment := assignmentExpression / logicOr.

expression := assignment.

exprStmt := expression , ($; asPParser trim: ignorable).

printStmt := ('print' asPParser trim: ignorable) , expression
		, ($; asPParser trim: ignorable).

declaration := varDecl / statement.

ifStmt := ('if' asPParser trim: ignorable) , ($( asPParser trim: ignorable)
		, expression , ($) asPParser trim: ignorable) , statement
		, (('else' asPParser trim: ignorable) , statement) optional.

whileStmt := ('while' asPParser trim: ignorable)
		, ($( asPParser trim: ignorable) , expression , ($) asPParser trim: ignorable)
		, statement.

block := ('{' asPParser trim: ignorable) , declaration star
		, ($} asPParser trim: ignorable).

statement := ifStmt / printStmt / whileStmt / exprStmt / block.

program := declaration star end.
  

Now we can right-click and select Extract PetitParser class. We give a name to the class, apply the generated refactoring, and obtain something like this:

parser := SPLGrammar new.
  

Note that the extraction automatically introduces a start method, SPLGrammar>>#start, which answers the last production in the script:

start
	^ program

We verify that the extracted class works as before.

parser parse: '// My first SPL program
var hello = "Hello world";
print hello;'
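
The scripted check can then be kept as an example method, as mentioned at the start of this section. The following is only a sketch: the class SPLGrammarExamples and the selector helloWorld are hypothetical names, and the assertion assumes that a failed parse answers a PP2Failure, detectable with isPetit2Failure. The first line is a header naming the class and the method, not part of the method body.

```smalltalk
SPLGrammarExamples >> helloWorld
	"Parse the same program as in the scripted test and check that it succeeds."
	<gtExample>
	| result |
	result := SPLGrammar new parse: '// My first SPL program
var hello = "Hello world";
print hello;'.
	self assert: result isPetit2Failure not.
	^ result
```

Returning the result lets the example be inspected, or reused by other examples, in addition to serving as a regression test.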