Querying SmaCC abstract syntax trees (ASTs)
SmaCC is a parsing engine. It comes out of the box with various parsers (see How to parse sources in different languages)..
One use of it is for code analysis purposes.
The result of parsing is an abstract syntax tree AST Let's consider this JavaScript source code (on the left) and its associated AST (on the right). Click around to navigate between the two perspectives.
The AST is, like its name says, a tree. In SmaCC, all nodes inherit from SmaCCParseNode
. This class provides an interesting query API.
For example, let's assume you want to find all the defined functions from the AST. You can just use SmaCCParseNode>>#//
:
jsFunctions <gtExample> | functions | functions := self jsAST // 'function'. self assert: functions size = 2. self assert: functions notEmpty. self assert: functions isEmpty not. self assert: (functions allSatisfy: [ :each | each isKindOf: JSFunctionNode ]). ^ functions
The above queries looks for nodes in the tree that have the function
in their name. In this case, we will get instances of JSFunctionNode
. But sometimes we want more elaborate ways to search.
How about this: find all function calls that have as parameter a literal whose value equals 10
. The example below does exactly that:
jsCallsWith10AsArg <gtExample> | calls | calls := self jsAST // 'literal' @ 'value' <- '10' \\ 'callexpression'. self assert: calls size = 2. ^ calls
Let's unwrap what happens here.
First, we look for literals.
jsLiterals <gtExample> | calls | calls := self jsAST // 'literal'. self assert: calls size equals: 9. ^ calls
Then we look at literals with value 10
. We do this by composing the previous query with SmaCCParseNode>>#@
for getting the value attribute which is then combined with GtSmaCCAttributeQuery>>#<-
which checks for specific values.
jsLiteralsWithValue10 <gtExample> | calls | calls := self jsAST // 'literal' @ 'value' <- '10'. self assert: calls size = 2. ^ calls
This gives us two interesting literals, but we want the calls, not the literals. So, at the end of the query we traverse parents using SmaCCParseNode>>#\
.