Extracting a class from a PetitParser script
TL;DR
Although it is convenient to develop a parser rapidly as a script of PetitParser rules, at some point it becomes necessary to turn the script into a class, both to support regression testing and to deploy the parser for other purposes. We show how to automate this for the PetitParser SPL case study.
The complete SPL grammar as a PetitParser script
Here we see the complete parser for SPL as a script. Note that recursively defined rules such as expression, unary and statement are first defined as instances of PP2UnresolvedNode, and only later given their actual definitions with def:.
comment := '//' asPParser , #newline asPParser negate star.
ignorable := (comment / #space asPParser) star.
boolean := ('true' asPParser trim: ignorable) / ('false' asPParser trim: ignorable).
integer := #digit asPParser plus , $. asPParser not trim: ignorable.
float := $- asPParser optional , #digit asPParser plus , $. asPParser , #digit asPParser plus trim: ignorable.
number := integer / float.
string := $" asPParser , $" asPParser negate plus , $" asPParser.
keyword := ('var' asPParser , #letter asPParser not)
	/ ('if' asPParser , #letter asPParser not)
	/ ('else' asPParser , #letter asPParser not)
	/ ('while' asPParser , #letter asPParser not)
	/ ('true' asPParser , #letter asPParser not)
	/ ('false' asPParser , #letter asPParser not)
	/ ('and' asPParser , #letter asPParser not)
	/ ('or' asPParser , #letter asPParser not).
identifier := keyword not , #letter asPParser , #word asPParser star.

"recursively defined rules"
expression := PP2UnresolvedNode new.
varDecl := (('var' asPParser trim: ignorable) , identifier
	, (($= asPParser trim: ignorable) , expression) optional
	, ($; asPParser trim: ignorable))
	==> [ :node |
		node third
			ifNil: [ SPLDeclaration for: node second ]
			ifNotNil: [ SPLInitializedDeclaration for: node second with: node third second ] ].
parenthesizedExpression := ($( asPParser trim: ignorable) , expression , ($) asPParser trim: ignorable).
primary := parenthesizedExpression / boolean / number / string / identifier.
"unary must be bound before negatedUnary refers to it"
unary := PP2UnresolvedNode new.
negatedUnary := ($! asPParser trim: ignorable) / ($- asPParser trim: ignorable) , unary.
unary def: negatedUnary / primary.
factor := unary , (($/ asPParser trim: ignorable) / ($* asPParser trim: ignorable) , unary) star.
term := factor , (('-' asPParser trim: ignorable) / ('+' asPParser trim: ignorable) , factor) star.
comparison := term , (('>=' asPParser trim: ignorable) / ('>' asPParser trim: ignorable)
	/ ('<=' asPParser trim: ignorable) / ('<' asPParser trim: ignorable) , term) star.
equality := comparison , (('!!=' asPParser trim: ignorable) / ('==' asPParser trim: ignorable) , comparison) star.
logicAnd := equality , (('and' asPParser trim: ignorable) , equality) star.
logicOr := logicAnd , (('or' asPParser trim: ignorable) , logicAnd) star.
"assignment is also recursive, so it must be bound before assignmentExpression refers to it"
assignment := PP2UnresolvedNode new.
assignmentExpression := (identifier trim: ignorable) , ($= asPParser trim: ignorable) , assignment.
assignment def: assignmentExpression / logicOr.
expression def: assignment.
exprStmt := expression , ($; asPParser trim: ignorable).
printStmt := ('print' asPParser trim: ignorable) , expression , ($; asPParser trim: ignorable).
statement := PP2UnresolvedNode new.
declaration := varDecl / statement.
ifStmt := ('if' asPParser trim: ignorable) , ($( asPParser trim: ignorable) , expression , ($) asPParser trim: ignorable)
	, statement , (('else' asPParser trim: ignorable) , statement) optional.
whileStmt := ('while' asPParser trim: ignorable) , ($( asPParser trim: ignorable) , expression , ($) asPParser trim: ignorable)
	, statement.
block := ('{' asPParser trim: ignorable) , declaration star , ($} asPParser trim: ignorable).
statement def: ifStmt / printStmt / whileStmt / exprStmt / block.
program := declaration star end.
We can test that this will parse a simple SPL program.
program parse: '// My first SPL program
var hello = "Hello world";
print hello;'
Extracting a parser class from a script
Normally we would turn the parser into a class at an earlier stage to enable example-driven development (essentially TDD using example methods). Here we have instead developed the full parser as a script, so we will also turn our scripted tests into examples afterwards.
To turn the script into a class, we need:
1. a class MyParser that subclasses PP2CompositeNode
2. for every grammar rule x in the script, both a method called x returning that parser, and a slot named x that will store an instance of the parser
3. a start rule and slot for the root of the grammar.
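As a rough illustration of these three requirements, here is a minimal sketch of a hand-written grammar class, shown in Tonel file format. The toy grammar (sums of numbers with parentheses), the rule names number, primary and expression, and the package name are all invented for this sketch and are not part of SPL; MyParser is the placeholder name used in the list above.

```smalltalk
"Hypothetical toy grammar: one slot per rule, one method per rule."
Class {
	#name : #MyParser,
	#superclass : #PP2CompositeNode,
	#instVars : [
		'number',
		'primary',
		'expression'
	],
	#category : #'MyParser-Sketch'
}

{ #category : #grammar }
MyParser >> start [
	"The root of the grammar."
	^ expression end
]

{ #category : #grammar }
MyParser >> expression [
	"The method is named after its rule and returns the parser.
	The name primary refers to the slot, not to a message send."
	^ primary , ($+ asPParser , primary) star
]

{ #category : #grammar }
MyParser >> primary [
	"expression is used recursively here through its slot,
	so the method does not mention PP2UnresolvedNode."
	^ number / ($( asPParser , expression , $) asPParser)
]

{ #category : #grammar }
MyParser >> number [
	^ #digit asPParser plus
]
```

With such a class loaded, evaluating MyParser new parse: '1+(2+3)' would exercise all three rules.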
Such a class can be created manually, but it can also be extracted automatically from the script. However, the script must then contain only rule definitions, and there may be no references to PP2UnresolvedNode or any other external variables or classes.
Bringing the script into this form is relatively easy. First we evaluate the original version of the script, to initialize all rule variables. Then we copy the script, comment out or remove the lines that initialize recursive rules to PP2UnresolvedNode, and replace the def: sends by assignments.
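For a single recursive rule, the change can be sketched as follows. The rules list and item are made up for this illustration and do not occur in the SPL grammar; only the messages already used in the script above are assumed.

```smalltalk
"Script form: the recursive rule is first bound to an unresolved node,
and receives its definition later with def:."
list := PP2UnresolvedNode new.
item := #digit asPParser / ($( asPParser , list , $) asPParser).
list def: item plus.

"Extraction-ready form (a separate copy of the script): the
PP2UnresolvedNode line is gone and the def: send is a plain assignment.
This copy serves as input for the class extraction; evaluated on its own
in a fresh context, item would still be unbound on the first line."
list := item plus.
item := #digit asPParser / ($( asPParser , list , $) asPParser).
```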
Here's the result for our script:
comment := '//' asPParser , #newline asPParser negate star.
ignorable := (comment / #space asPParser) star.
boolean := ('true' asPParser trim: ignorable) / ('false' asPParser trim: ignorable).
integer := #digit asPParser plus , $. asPParser not trim: ignorable.
float := $- asPParser optional , #digit asPParser plus , $. asPParser , #digit asPParser plus trim: ignorable.
number := integer / float.
string := $" asPParser , $" asPParser negate plus , $" asPParser.
keyword := ('var' asPParser , #letter asPParser not)
	/ ('if' asPParser , #letter asPParser not)
	/ ('else' asPParser , #letter asPParser not)
	/ ('while' asPParser , #letter asPParser not)
	/ ('true' asPParser , #letter asPParser not)
	/ ('false' asPParser , #letter asPParser not)
	/ ('and' asPParser , #letter asPParser not)
	/ ('or' asPParser , #letter asPParser not).
identifier := keyword not , #letter asPParser , #word asPParser star.
varDecl := (('var' asPParser trim: ignorable) , identifier
	, (($= asPParser trim: ignorable) , expression) optional
	, ($; asPParser trim: ignorable))
	==> [ :node |
		node third
			ifNil: [ SPLDeclaration for: node second ]
			ifNotNil: [ SPLInitializedDeclaration for: node second with: node third second ] ].
parenthesizedExpression := ($( asPParser trim: ignorable) , expression , ($) asPParser trim: ignorable).
primary := parenthesizedExpression / boolean / number / string / identifier.
unary := negatedUnary / primary.
negatedUnary := ($! asPParser trim: ignorable) / ($- asPParser trim: ignorable) , unary.
factor := unary , (($/ asPParser trim: ignorable) / ($* asPParser trim: ignorable) , unary) star.
term := factor , (('-' asPParser trim: ignorable) / ('+' asPParser trim: ignorable) , factor) star.
comparison := term , (('>=' asPParser trim: ignorable) / ('>' asPParser trim: ignorable)
	/ ('<=' asPParser trim: ignorable) / ('<' asPParser trim: ignorable) , term) star.
equality := comparison , (('!!=' asPParser trim: ignorable) / ('==' asPParser trim: ignorable) , comparison) star.
logicAnd := equality , (('and' asPParser trim: ignorable) , equality) star.
logicOr := logicAnd , (('or' asPParser trim: ignorable) , logicAnd) star.
assignmentExpression := (identifier trim: ignorable) , ($= asPParser trim: ignorable) , assignment.
assignment := assignmentExpression / logicOr.
expression := assignment.
exprStmt := expression , ($; asPParser trim: ignorable).
printStmt := ('print' asPParser trim: ignorable) , expression , ($; asPParser trim: ignorable).
declaration := varDecl / statement.
ifStmt := ('if' asPParser trim: ignorable) , ($( asPParser trim: ignorable) , expression , ($) asPParser trim: ignorable)
	, statement , (('else' asPParser trim: ignorable) , statement) optional.
whileStmt := ('while' asPParser trim: ignorable) , ($( asPParser trim: ignorable) , expression , ($) asPParser trim: ignorable)
	, statement.
block := ('{' asPParser trim: ignorable) , declaration star , ($} asPParser trim: ignorable).
statement := ifStmt / printStmt / whileStmt / exprStmt / block.
program := declaration star end.
Now we can right-click and select Extract PetitParser class. We give the class a name, apply the generated refactoring, and obtain something like this:
parser := SPLGrammar new.
Note that the extraction will automatically introduce a start method for the last production in the script (see SPLGrammar>>#start).
We verify that the extracted class works as before.
parser parse: '// My first SPL program
var hello = "Hello world";
print hello;'
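The scripted test could then be kept as an example method, in the spirit of the example-driven development mentioned earlier. The following is only a sketch in Tonel file format: the holder class SPLGrammarExamples, its package name and the selector helloWorld are hypothetical. It relies on the gtExample pragma and on isPetit2Failure from PetitParser2 to check that parsing succeeded.

```smalltalk
"Hypothetical holder class for parser examples."
Class {
	#name : #SPLGrammarExamples,
	#superclass : #Object,
	#category : #'SPL-Examples-Sketch'
}

{ #category : #examples }
SPLGrammarExamples >> helloWorld [
	"Parse the same program as in the scripted test and
	check that the result is not a parse failure."
	<gtExample>
	| result |
	result := SPLGrammar new parse: '// My first SPL program
var hello = "Hello world";
print hello;'.
	self assert: result isPetit2Failure not.
	^ result
]
```

Returning the parse result from the example keeps it available for inspection and for reuse by further examples.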