Querying classes and methods

TL;DR

This page provides an overview of ways to reflectively query classes and methods.

What is a class? — Here we summarize the key aspects of classes.

Navigating classes and methods — Here we summarize the relationships between the key classes used to implement classes and methods, and we see how to navigate amongst them.

Querying classes — We look at examples of some of the key API methods of classes.

Filters — We provide a brief summary of how GT filters can be used to query code.

Querying methods — We see how to query methods both using the CompiledMethod API and the RBNode AST API. We also see how to perform methods reflectively.

Smalltalk and SystemNavigation — These are two utilities for reflectively querying and navigating a Smalltalk system.

Example: Finding smelly super-sends — We conclude with a small case study in which we reflectively query the Smalltalk system to find methods with a particular code smell.

What is a class?

Like objects, classes have three key aspects in Smalltalk.

1. A class is a format , or template, for creating instances.

2. A class holds a method dictionary for the shared behavior of its instances.

3. A class can be a superclass creating subclasses.

Navigating classes and methods

In order to effectively query Smalltalk source code, it helps to understand the following diagram. At the left we see that Class ClassDescription subclass: #Class instanceVariableNames: 'subclasses name classPool sharedPools environment category' classVariableNames: '' package: 'Kernel-Classes' and Metaclass ClassDescription subclass: #Metaclass instanceVariableNames: 'thisClass' classVariableNames: '' package: 'Kernel-Classes' are in a common hierarchy and inherit from Behavior Object subclass: #Behavior instanceVariableNames: 'superclass methodDict format layout' classVariableNames: 'ClassProperties ObsoleteSubclasses' package: 'Kernel-Classes' . (Recall from Understanding Smalltalk classes and metaclasses that every object is an instance of a class, and every class is an instance of its (unique) metaclass. )

From a class we can navigate to its MethodDictionary Dictionary variableSubclass: #MethodDictionary instanceVariableNames: '' classVariableNames: '' package: 'Kernel-Methods' , and from there to each CompiledMethod CompiledCode variableByteSubclass: #CompiledMethod instanceVariableNames: '' classVariableNames: '' package: 'Kernel-Methods' defined in that class. Furthermore, we can navigate from a method to its AST (Abstract Syntax Tree), which is represented as a tree of RBProgramNode RBNode subclass: #RBProgramNode instanceVariableNames: 'parent properties' classVariableNames: 'FormatterClass' package: 'AST-Core-Nodes' instances. Each of these classes provides a rich API for querying classes or methods.

We can see this in action. Here we retrieve the method dictionary of the GtLudoGame Object subclass: #GtLudoGame instanceVariableNames: 'players squares startSquares goalSquares die announcer feedback winner needToRollDie lastDieRolled playerQueue routeCache' classVariableNames: '' package: 'GToolkit-Demo-Ludo-Model' .

GtLudoGame methodDict.

We can retrieve a specific method, such as currentPlayer:

GtLudoGame methodDict at: #currentPlayer.

Note that there is a convenient shortcut for retrieving a method from a class, namely:

GtLudoGame >> #currentPlayer.

If you want to search for a method in the superclass chain, then you can send lookupSelector: instead:

GtLudoGame lookupSelector:  #class.

You can also directly get the methods of a class:

GtLudoGame methods.

From a method we can always get back to the class in which it was defined.

(GtLudoGame >> #currentPlayer) methodClass.

Note that you need to send methodClass, and not class! (Before evaluating the following snippet, what do you suppose it will return?)

(GtLudoGame >> #currentPlayer) class.

From a method we can also get to its AST:

(GtLudoGame >> #currentPlayer) ast.

The class RBMethodNode RBProgramNode subclass: #RBMethodNode instanceVariableNames: 'scope selector keywordsPositions body source arguments pragmas replacements nodeReplacements compilationContext bcToASTCache' classVariableNames: '' package: 'AST-Core-Nodes' is quite interesting, as it has numerous useful testing and querying methods, such as RBMethodNode>>#hasArguments hasArguments ^ arguments isNotEmpty and RBProgramNode>>#superMessages superMessages ^(self sendNodes select: [ :node | node isSuperSend ] thenCollect: [ :node | node selector ]) asSet . We'll see some examples later.

Each of the classes in the diagram above has a rich API for introspection. You will have to explore them, but as a general rule, if there is some information you would like to get, there is probably a method for it.

Querying classes

Let's have a closer look at querying classes. These methods are all defined in one of the classes Class ClassDescription subclass: #Class instanceVariableNames: 'subclasses name classPool sharedPools environment category' classVariableNames: '' package: 'Kernel-Classes' , ClassDescription Behavior subclass: #ClassDescription instanceVariableNames: 'organization commentSourcePointer' classVariableNames: '' package: 'Kernel-Classes' or Behavior Object subclass: #Behavior instanceVariableNames: 'superclass methodDict format layout' classVariableNames: 'ClassProperties ObsoleteSubclasses' package: 'Kernel-Classes' . You may like to try to guess where each one is defined. (Click on the grey triangle to open the method and see the class where it is defined.)

Here is a sample of some common queries:

Collection allSuperclasses.

OrderedCollection selectors.

Collection allSelectors.

OrderedCollection allInstVarNames.

OrderedCollection slots.

Collection subclasses.

Collection allSubclasses.

Collection linesOfCode.

OrderedCollection allInstances .

OrderedCollection canUnderstand: #class.

Filters

In addition to Smalltalk's reflective API, GT also offers Filters as a composable way to search over code in the system. For example, if we want to see how columned list Inspector views are implemented, we can compose the query:

#gtView gtPragmas & #columnedList gtSenders.

The resulting view is lazily computed. To see more examples, consult Querying Pharo code with GT filters.

Sometimes filters are not enough, and you may need to combine them with other reflective calls. Then you can send contents to a filter to get the actual collection of results.

For example, if we want to know which view implementations perform super sends, we could compose the following query:

#gtView gtPragmas contents select: #sendsToSuper

Querying methods

There are two main APIs of interest for querying methods: CompiledMethod CompiledCode variableByteSubclass: #CompiledMethod instanceVariableNames: '' classVariableNames: '' package: 'Kernel-Methods' and the RBNode Object subclass: #RBNode instanceVariableNames: '' classVariableNames: '' package: 'AST-Core-Nodes' AST hierarchy, in particular RBMethodNode RBProgramNode subclass: #RBMethodNode instanceVariableNames: 'scope selector keywordsPositions body source arguments pragmas replacements nodeReplacements compilationContext bcToASTCache' classVariableNames: '' package: 'AST-Core-Nodes' .

Querying CompiledMethod

CompiledMethod CompiledCode variableByteSubclass: #CompiledMethod instanceVariableNames: '' classVariableNames: '' package: 'Kernel-Methods' offers a very rich API. See in particular the testing protocol which supports numerous is* methods.

(GtLudoGame >> #gameState) isAbstract.

(GtLudoGame >> #gameState) isDeprecated.

There are also other methods to access information about methods in the ccessing and source code management protocols.

(GtLudoGame >> #gameState) sourceCode.

(GtLudoGame >> #gameState) linesOfCode.

Here's an example where we compute the total source lines of code of the Ludo game package:

(GtLudoGame package methods collect: #linesOfCode) sum.

There are more useful methods in CompiledCode ByteArray variableByteSubclass: #CompiledCode instanceVariableNames: '' classVariableNames: 'LargeFrame PrimaryBytecodeSetEncoderClass SecondaryBytecodeSetEncoderClass SmallFrame' package: 'Kernel-Methods' , the direct superclass of CompiledMethod CompiledCode variableByteSubclass: #CompiledMethod instanceVariableNames: '' classVariableNames: '' package: 'Kernel-Methods' . Here we find all the methods in the Ludo package that send to super, using CompiledCode>>#sendsToSuper sendsToSuper self localSendsToSuper ifTrue: [ ^ true ]. ^ self innerCompiledBlocksAnySatisfy: [ :cb | cb sendsToSuper ] .

GtLudoGame package methods select: #sendsToSuper

Querying AST nodes

RBMethodNode RBProgramNode subclass: #RBMethodNode instanceVariableNames: 'scope selector keywordsPositions body source arguments pragmas replacements nodeReplacements compilationContext bcToASTCache' classVariableNames: '' package: 'AST-Core-Nodes' is the AST node for methods, and is the root node of the AST for any method. This may be needed to access more fine-grained information than is available from CompiledMethod CompiledCode variableByteSubclass: #CompiledMethod instanceVariableNames: '' classVariableNames: '' package: 'Kernel-Methods' .

(GtLudoRecordingGame >> #updateOnRoll) ast.

As with compiled methods, you can get back to the class by sending methodClass (and to the method by sending method).

(GtLudoRecordingGame >> #updateOnRoll) ast methodClass = GtLudoRecordingGame.

CompiledMethod also has many methods in the *Reflectivity protocol that will can query the AST of a method and return just select nodes of interest.

(GtLudoRecordingGame >> #updateOnRoll) sendNodes.

Each subclass of RBNode supports dedicated accessors. For example, here are the messages sent by this method:

(GtLudoRecordingGame >> #updateOnRoll) sendNodes collect: #selector.

Here are all the messages sent by methods in the Ludo game package:

(GtLudoGame package methods 
	flatCollect: [:m | m sendNodes collect: #selector ]) asSet.

See the testing protocol of RBMethodNode to find many useful queries.

NB: The class comments of the RBNode Object subclass: #RBNode instanceVariableNames: '' classVariableNames: '' package: 'AST-Core-Nodes' hierarchy provide much useful documentation. Here's a reflective query to extract all those comments:

(RBNode withAllSubclasses
	select: [ :class | 
		class package = RBNode package 
		and: [ class tags = RBNode tags ] ])
	collect: #comment.

Performing a method.

You can evaluate a method reflectively by sending it #valueWithReceiver:arguments:.

method := Integer>>#factorial.
method valueWithReceiver: 5 arguments: #()

method := DemoSlideshow>>#coder:.
method
	valueWithReceiver: method methodClass new
	arguments: {GtProtoLiveSlide new}.

You can send a message reflectively by sending perform: or perform:with:.

5 perform: #factorial.

DemoSlideshow new perform: #coderExampleSlideFor: with:  GtProtoLiveSlide new.

Of course there exists a whole series of perform:* methods for different numbers of arguments.

Smalltalk and SystemNavigation

There are two further metaobjects that are useful for querying the system.

One is Smalltalk, the unique instance of SmalltalkImage Object subclass: #SmalltalkImage instanceVariableNames: 'globals specialObjectsArray vm' classVariableNames: 'CompilerClass LastImagePath LastQuitLogPosition LogFileName LowSpaceProcess LowSpaceSemaphore MemoryHogs SourceFileVersionString SpecialSelectors Tools' package: 'System-Support-Image' . Here's how to get all the classes, traits and packages of the system:

Smalltalk allClasses.

Smalltalk allTraits.

Smalltalk packages.

To get all the methods, you have to work harder:

Smalltalk allClassesAndTraits flatCollect: #methods.

And here's a platform-independent way of getting the current compiler:

Smalltalk compiler.

Smalltalk compiler evaluate: '3 + 4'

There is also SystemNavigation Object subclass: #SystemNavigation instanceVariableNames: 'environment' classVariableNames: '' package: 'System-Support-Image' a facade of utility methods for navigating a Smalltalk system. See the query protocol.

NB: some of these methods overlap with the methods offered by SmalltalkImage Object subclass: #SmalltalkImage instanceVariableNames: 'globals specialObjectsArray vm' classVariableNames: 'CompilerClass LastImagePath LastQuitLogPosition LogFileName LowSpaceProcess LowSpaceSemaphore MemoryHogs SourceFileVersionString SpecialSelectors Tools' package: 'System-Support-Image' .

SystemNavigation default  allClassesAndTraits.

Yet there are other utilities only found here:

SystemNavigation default  allHalts.

Caveat: the browse* methods in the *Tool-Base protocol are not terribly useful in GT, as they attempt to open a Morphic browser in the Pharo Morphic world. You can only see the result of this query by opening a Morphic World.

SystemNavigation default 
	browseAllImplementorsOf: #superSends.

In most cases, you can use GT filters instead.

#superSends gtImplementors.

Example: Finding smelly super-sends

Let's have a look at a slightly more complicated use case.

There is a well-known anti-pattern in OOP, which is related to the use of super. Super-sends are intended to be used when a subclass wants to override and extend the implementation of a method inherited from a superclass. The new method m can then perform super m to reuse the overridden behavior. Now, the anti-pattern is this: there is no reason to perform a super send of a different method, say m1. Instead of performing super m1, it would be better to perform self m1, else problems may arise in subclasses that in turn implement or override m1.

Our task is to use the reflective API of Smalltalk to find such methods in the Collection Object subclass: #Collection instanceVariableNames: '' classVariableNames: '' package: 'Collections-Abstract-Base' hierarchy.

First, we want to find all the methods in the Collection hierarchy that send to super. This is straightforward:

superSends :=
	(Collection withAllSubclasses 
		flatCollect: #methods)
			select: #sendsToSuper.

Aside: How do you find out that sendsToSuper is the right way to get this information? Just inspect any coompiled method instance and go to the Meta view (alternatively browse CompiledMethod CompiledCode variableByteSubclass: #CompiledMethod instanceVariableNames: '' classVariableNames: '' package: 'Kernel-Methods' in the Coder), and use the Spotter to query “super”. You will see all the methods with “super” in the name — there are not many — and quickly find CompiledCode>>#sendsToSuper sendsToSuper self localSendsToSuper ifTrue: [ ^ true ]. ^ self innerCompiledBlocksAnySatisfy: [ :cb | cb sendsToSuper ] .

Next we want to find all the send nodes in the ASTs of these methods. These are instances of RBMessageNode RBValueNode subclass: #RBMessageNode instanceVariableNames: 'receiver selector keywordsPositions arguments' classVariableNames: '' package: 'AST-Core-Nodes' , and we can get them from the AST (or directly from the methods) by sending sendNodes (CompiledMethod>>#sendNodes sendNodes ^self ast sendNodes ).

We ask the node if it is a supersend, and if so, check if the selector differs from that of the method. If a method has any such sends, then it has the code smell.

superSends := (Collection withAllSubclasses
		flatCollect: #methods)
			select: #sendsToSuper.

superSends
	select: [ :m | 
		(m sendNodes
			select: [ :send | send isSuperSend 
				and: [ m selector ~= send selector ] ])
			isNotEmpty ].

We find five such methods. We can inspect the results to verify whether the code smells are problematic or not.

We can also query the entire image for such methods, and we find quite a few of them:

((Smalltalk allClassesAndTraits flatCollect: #methods) select: #sendsToSuper)
	select: [ :m | 
		(m sendNodes
			select: [ :send | send isSuperSend and: [ m selector ~= send selector ] ])
			isNotEmpty ].