Introducing test examples and code cleaning

TL;DR

We refactor the Jumble puzzle class from “A gentle introduction to classes and methods in Smalltalk” to support alternative word lists, and we show how to add tests in the form of example methods.

Getting started

In “A gentle introduction to Pharo Smalltalk” we saw how to write an English word unscrambler using code snippets, and in “A gentle introduction to classes and methods in Smalltalk” we saw how to turn those snippets into classes and methods.

Our solution so far hardwires the URL of the source word list, and we have no tests.

In case you didn't just do the previous tutorial, you can load the changes now by clicking on the checkmark.

Object subclass: #Jumble
	instanceVariableNames: 'wordDict'
	classVariableNames: ''
	package: 'Tutorial-Jumble'.

Jumble class
	instanceVariableNames: ''

A Jumble is an object that knows how to unscramble English words:

	Jumble new unJumble: 'gameses'

It encapsulates a dictionary of lists of words, keyed by a sorted string of characters in each word. It unjumbles a word by sorting its characters to form a key and looking up the key in the dictionary.
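
As a rough illustration of the indexing idea described above, the following playground snippet builds the same kind of dictionary from a tiny in-memory word list instead of a downloaded one. The word list and the variable names (words, index) are made up for this sketch and are not part of the Jumble class.

```smalltalk
"Hypothetical playground sketch: index a few words by their sorted characters."
| words index |
words := #( 'tea' 'eat' 'ate' 'message' ).
index := Dictionary new.
words do: [ :word |
	"Anagrams share the same sorted key, so they end up in the same list."
	(index at: word sorted ifAbsentPut: [ OrderedCollection new ]) add: word ].

"Any scrambling of the letters a, e, t yields the same key."
index at: 'tae' sorted
```

Inspecting the last expression should show the three anagrams 'tea', 'eat' and 'ate' in insertion order.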

"protocol: #accessing"

Jumble >> initialize

	| wordlistUrl words wordList |
	wordlistUrl := 'https://raw.githubusercontent.com/dwyl/english-words/master/words_alpha.txt'.
	words := (ZnUrl fromString: wordlistUrl) retrieveContents.
	wordList := Character cr asString , Character lf asString split:
		            words.
	wordDict := Dictionary new.
	wordList do: [ :word | 
		| key |
		key := word sorted.
		wordDict
			at: key
			ifPresent: [ :v | v addLast: word ]
			ifAbsentPut: [ { word } asOrderedCollection ] ]


"protocol: #accessing"

Jumble >> unJumble: aString

	^ wordDict at: aString sorted
  

Example methods and tests

Testing in GT requires a paradigm shift. Instead of writing “test methods”, in GT we write example methods.

Whereas test methods in most testing frameworks produce no result on completion, an example method always produces an example object. This has several advantages:

1. You can compose example methods to produce more complex examples.

2. You can inspect an example and interact with it.

3. Example methods can be used to illustrate concrete scenarios of how to create and use the tested classes.

An example method in GT is an ordinary method that (i) is annotated with a <gtExample> pragma, (ii) may contain assertions (tests), and (iii) returns an example instance.

Here's a simple example method for our Jumble class:

"protocol: #example"

Jumble >> defaultExample

	<gtExample>
	| jumble |
	jumble := Jumble new.
	self assert: (jumble unJumble: 'gameses') size = 2.
	self assert: (jumble unJumble: 'gameses') second = 'message'.
	^ jumble
  

We can run the example as follows:

Jumble new defaultExample
  

There are also tools to run all the examples of a class or a package, and tools to explore the “map” of composed examples for a class or package.
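
To make the point about composition more concrete, here is a minimal sketch of a second example method that builds on defaultExample. The method name unjumbledGamesesExample is hypothetical and does not appear elsewhere in this tutorial.

```smalltalk
"protocol: #example"

Jumble >> unjumbledGamesesExample

	"Hypothetical composed example: reuse the Jumble returned by defaultExample."
	<gtExample>
	| jumble result |
	jumble := self defaultExample.
	result := jumble unJumble: 'gameses'.
	self assert: (result includes: 'message').
	^ result
```

Because the method returns the list of unscrambled words rather than the Jumble itself, inspecting this example shows a different object than inspecting defaultExample.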

Refactoring strategy

In this tutorial we will refactor the Jumble class step-by-step to support alternative word lists.

First we will introduce an example method illustrating how we can instantiate a new Jumble from a different source URL.

We will then introduce a slot called wordlistUrl to store this location.

We will introduce constant methods to store the alternative URLs.

Since the URL may be set after the creation of the Jumble, we will introduce a lazy accessor to initialize the wordDict slot when we first need it.

To support the lazy initialization, we will factor out the code to initialize the wordDict slot.

Since alternative source files may use different characters to separate the words, we will add a method to detect the line separator used in the source word list.

Adding an alternative example

Let's start by introducing a new example method that illustrates how we can instantiate a Jumble from a different word list.

We will introduce a class-side method Jumble class>>#from: to create a new instance from a given URL.

Note that this word list only has one possible unscrambling of 'gameses', in contrast to the list we used before.

In Smalltalk, we use methods to define constants.

"protocol: #example"

Jumble >> altExample

	<gtExample>
	| jumble |
	jumble := Jumble from: Jumble altUrl.
	self assert: (jumble unJumble: 'gameses') size = 1.
	self assert: (jumble unJumble: 'gameses') first = 'message'.
	self assert: (jumble unJumble: 'xyz') isEmpty.
	^ jumble
  
"protocol: #constants"

Jumble class >> altUrl

	^ 'https://raw.githubusercontent.com/jeremy-rifkin/Wordlist/master/res/c.txt'
  

We'll also introduce a constant method for the default list we used before.

"protocol: #constants"

Jumble >> defaultUrl

	^ 'https://raw.githubusercontent.com/dwyl/english-words/master/words_alpha.txt'
  

Adding a slot

Since we will initialize our word dictionary lazily, we need to store the URL in a slot so we can find it when we need it.

Object subclass: #Jumble
	instanceVariableNames: 'wordDict wordlistUrl'
	classVariableNames: ''
	package: 'Tutorial-Jumble'.

Jumble class
	instanceVariableNames: ''
  

Initialization

In this new version of Jumble, we only need to set a default value for the wordlistUrl slot.

"protocol: #initialization"

Jumble >> initialize

	self wordlistUrl: self defaultUrl
  

We'll need a setter. In Smalltalk, getters are named the same as the slot whose value they return, and setters are the same, but with a colon as a suffix, turning them into a one-argument keyword message.

"protocol: #initialization"

Jumble >> wordlistUrl: aString

	wordlistUrl := aString
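
The refactoring only needs the setter. For completeness, a matching getter that follows the naming convention described above might look like the sketch below; it is an optional addition and is not used by the rest of the tutorial.

```smalltalk
"protocol: #accessing"

Jumble >> wordlistUrl

	"Hypothetical getter: same name as the slot, no colon, no argument."
	^ wordlistUrl
```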
  

Class-side constructor

Instance creation messages are sent to the class, so they must be implemented as class-side methods. Here we introduce #from: as a message understood by the Jumble class itself.

Note that self in this method refers to the Jumble class, not an instance of Jumble.

"protocol: #'instance creation'"

Jumble class >> from: aWordListUrl

	| aJumble |
	aJumble := self new.
	aJumble wordlistUrl: aWordListUrl.
	^ aJumble
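
The same constructor can also be written without a temporary variable, using a cascade. This is only a stylistic alternative sketched here for comparison; it should behave the same as the version above.

```smalltalk
"protocol: #'instance creation'"

Jumble class >> from: aWordListUrl

	"Alternative sketch: the cascade sends wordlistUrl: and then yourself
	to the new instance, so the method answers the configured Jumble."
	^ self new
		  wordlistUrl: aWordListUrl;
		  yourself
```

The final yourself matters: without it the method would answer whatever wordlistUrl: returns, which happens to be the instance here but is not something a caller should rely on.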
  

Lazy wordDict accessor

We want to initialize the wordDict only when we need it, so we introduce a getter that initializes the wordDict slot if its value is nil.

"protocol: #accessing"

Jumble >> wordDict

	wordDict ifNil: [ self initializeWordDict ].
	^ wordDict
  

We refactor the initialization code into a dedicated method.

"protocol: #initialization"

Jumble >> initializeWordDict

	| words separator wordList |
	wordDict := Dictionary new.

	words := (ZnUrl fromString: wordlistUrl) retrieveContents.

	separator := self lineSeparatorFor: words.

	wordList := separator split: words.
	wordList do: [ :word | 
		| key |
		key := word sorted.
		wordDict
			at: key
			ifPresent: [ :v | v addLast: word ]
			ifAbsentPut: [ { word } asOrderedCollection ] ]
  

Finding the line separator

The only tricky bit is that the word lists we use as sources may adopt different conventions for separating lines of words.

We'll adopt a simple heuristic that searches for the three most common alternatives in the first 100 characters of the file.

"protocol: #private"

Jumble >> lineSeparatorFor: words

	"Search for the line separator in the first 100 characters of words
	(or in all of words, if it is shorter than that)."

	| fragment cr lf crlf |
	fragment := words copyFrom: 1 to: (100 min: words size).

	cr := Character cr asString.
	lf := Character lf asString.
	crlf := cr , lf.

	(fragment includesSubstring: crlf) ifTrue: [ ^ crlf ].
	(fragment includesSubstring: cr) ifTrue: [ ^ cr ].
	(fragment includesSubstring: lf) ifTrue: [ ^ lf ].

	self error: 'Couldn''t find line separator!'
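
The order of the three checks matters: a file that uses CR LF line endings also contains bare CR and LF characters, so the two-character separator has to be tried first. The playground sketch below, which uses a made-up two-word text, illustrates what would go wrong otherwise.

```smalltalk
"Hypothetical playground sketch: why crlf must be tested before cr and lf."
| cr lf crlf text wrong right |
cr := Character cr asString.
lf := Character lf asString.
crlf := cr , lf.
text := 'alpha' , crlf , 'beta'.

"Splitting on cr alone leaves the lf attached to the following word."
wrong := cr split: text.

"Splitting on the full two-character separator gives clean words."
right := crlf split: text.
{ wrong last size. right last size }
```

With a stray LF in front of it, the last word in wrong is one character longer than the last word in right, and its sorted key would no longer match that of any scrambled input.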
  

Unjumble via the lazy accessor

Now we need to adapt the unJumble: method to use the lazy getter instead of directly accessing the wordDict slot.

"protocol: #accessing"

Jumble >> unJumble: aString

	^ self wordDict at: aString sorted ifAbsent: [ OrderedCollection new ]
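
To see how at:ifAbsent: behaves in isolation, here is a small playground sketch on a plain Dictionary. The keys and words are invented for the illustration; the fallback is conventionally passed as a block so that it is only evaluated when the key is missing.

```smalltalk
"Hypothetical playground sketch of Dictionary>>at:ifAbsent:."
| index found missing |
index := Dictionary new.
index at: 'aet' put: (OrderedCollection with: 'tea' with: 'eat').

"Key present: the stored list is returned and the block is not evaluated."
found := index at: 'aet' ifAbsent: [ OrderedCollection new ].

"Key absent: the block is evaluated and its (empty) result is returned."
missing := index at: 'xyz' ifAbsent: [ OrderedCollection new ].
{ found size. missing isEmpty }
```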
  

Using the alternative Jumble

Now we are done!

Jumble new altExample unJumble: 'aaeeilnrttv'
  

What's next

Have a closer look at “Example-driven development by example”.