A gentle introduction to Pharo Smalltalk

TL;DR

In this tutorial we learn the basics of Pharo Smalltalk using a running example of code to unscramble an English-language word.

Introduction

Pharo is a modern dialect of the classic Smalltalk-80 language and environment.

The key thing that distinguishes Smalltalk from other OO languages is that it is a live environment in which you interact with live objects inside a running image, and you incrementally add and modify classes and methods within a running system.

For a high-level overview of the run-time architecture of Smalltalk and Pharo, see: Pharo architecture.

In this tutorial we will directly run code snippets, without creating any new classes or methods.

The Jumble puzzle

The Jumble puzzle is a classic syndicated newspaper game in which four scrambled words have to be unscrambled to form ordinary English. Selected letters from the four answers form the scrambled answer to a punny (but not very funny) cartoon puzzle. We will solve the first part, which is to unscramble the words.

Unscrambling strategy: The idea is to look up the scrambled word in a dictionary whose keys consist of a canonical representation of each word as a bag of letters. The easiest such representation is the sorted list of letters in the word.

For example, to unscramble the letters 'gameses', we would sort the letters of the string to produce the key 'aeegmss', and look up that key in our dictionary to find all the real words with the same key.
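
The key idea can be checked on its own before any dictionary exists. The following is a minimal sketch; the variable names scrambled and candidate are illustrative only. It compares the sorted letters of a scrambled word with those of a real word:

```smalltalk
"Anagrams collapse to the same key once their letters are sorted."
| scrambled candidate |
scrambled := 'gameses'.
candidate := 'message'.
scrambled sorted = candidate sorted
```

Inspecting this snippet should yield true, since both strings sort to 'aeegmss'.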

A pure Smalltalk solution

The following snippet solves the puzzle without defining any new classes or methods. Later we could implement a Jumble class that unscrambles words for us, but for now let's just work with a live snippet.

You should be able to make sense of the code as a kind of pidgin English, even if the details will need some explanation.

Have a quick look at the code, and run it by clicking on the Inspect icon:

wordlistUrl := 'https://raw.githubusercontent.com/dwyl/english-words/master/words_alpha.txt'.
words := (ZnUrl fromString: wordlistUrl) retrieveContents.
wordList := (Character cr asString , Character lf asString) split: words.
wordDict := Dictionary new.
wordList do: [:word |
	key := word sorted.
	wordDict
		at: key
		ifPresent: [:v | v addLast: word ]
		ifAbsentPut: [ { word } asOrderedCollection]
	].
wordDict at: 'gameses' sorted
  

After clicking on the Inspect button, you should see in the adjacent pane the two possible answers, namely "megasse" and "message".

Now let's look into the details of how it works.

Two rules about understanding Smalltalk

There are two basic rules to understanding Smalltalk code:

1. Everything is an object

and

2. Everything happens by sending messages

As in most languages, the code snippet above consists of a block of statements, each of which contains one or more nested expressions. Unlike in most other languages, the statements are separated by periods (.). For more details, see: Understanding Smalltalk method syntax. The expressions, by and large, consist of message sends. For the details, please see: Understanding Smalltalk message syntax.
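
As a rough illustration of statements and message sends, the sketch below shows the three kinds of messages (unary, binary and keyword) in four period-separated statements. The variable names are arbitrary and not part of the Jumble solution:

```smalltalk
| a b c d |
a := 5 factorial.     "unary message: no argument"
b := 3 + 4.           "binary message: operator-like selector, one argument"
c := 3 max: 4.        "keyword message: selector part ends with a colon"
d := 2 + 3 * 4.       "binary messages are evaluated strictly left to right"
{ a. b. c. d }
```

Note that d is 20 rather than 14, because there is no arithmetic operator precedence: 2 + 3 is sent first, and the result receives * 4.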

For a compact overview, see: Smalltalk method syntax on a postcard.

Grabbing a list of words from the web

Before we start coding our solution, we search for an online list of English words, and find a pretty extensive one on GitHub.

We first save the URL as a Smalltalk string.

wordlistUrl := 'https://raw.githubusercontent.com/dwyl/english-words/master/words_alpha.txt'
  

Assignment to a variable uses the := operator, as opposed to = and ==, which are comparison messages.
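
To make the difference between the two comparison messages concrete, here is a small sketch (the variables a and b are illustrative). The = message compares contents, while == tests whether both sides are the very same object:

```smalltalk
| a b |
a := 'jumble'.        ":= assigns; it is not a message"
b := a copy.          "a distinct string with the same characters"
{ a = b. a == b. a == a }
```

Inspecting the result should show true, false and true, in that order.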

Also note that a string in Smalltalk is delimited by single quotes ', not double quotes " (which delimit comments).

We did not end the snippet with a period because . is a statement separator in Smalltalk, not a terminator. We only need periods to separate two or more statements.

Next we retrieve the contents of the web page as a String:

words := (ZnUrl fromString: wordlistUrl) retrieveContents
  

Here we see two kinds of message sends. First we send the keyword message fromString: with the argument wordlistUrl to the ZnUrl class to create an instance. Since everything in Smalltalk is an object, classes are objects too. Class names are globally accessible and always start with an uppercase letter.

We then send the retrieveContents unary message to the resulting instance of ZnUrl. If you Inspect the above snippet, you will see that the result is an instance of ByteString.
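
The same two kinds of sends can be tried without touching the network. In this sketch the URL is a made-up placeholder, and host is used as an example of a unary message understood by ZnUrl instances:

```smalltalk
| url |
"keyword message sent to the class ZnUrl, which is itself an object"
url := ZnUrl fromString: 'https://example.org/words.txt'.
"unary messages sent to the resulting instance"
{ url class. url host }
```

Inspecting the result should show the class ZnUrl followed by the string 'example.org'.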

In addition to strings, Smalltalk provides built-in syntax for characters, literals, arrays, various kinds of numbers, and blocks (lambdas). For the details, see: Understanding Pharo built-in data types.
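
As a quick, non-exhaustive tour of this literal syntax, the following sketch collects one sample of each kind into an array and asks each for the name of its class. The exact class names reported (for instance for floats and blocks) may differ between Pharo versions, so none are listed here:

```smalltalk
| samples |
samples := {
	$a.                  "a character"
	#jumble.             "a symbol"
	#(1 2 3).            "a literal array"
	42.                  "an integer"
	3/4.                 "a fraction"
	1.5e3.               "a floating-point number"
	[ :x | x + 1 ] }.    "a block (lambda)"
samples collect: [ :each | each class name ]
```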

Splitting the result into a list of words

At this point we have retrieved the word list as one giant string.

words
  

If you look at the Items tab you will also see that the words are all separated by a carriage return and a linefeed character (Character cr followed by Character lf).

We would like to split this huge string into a list of individual words that we can then use to build up a dictionary. To do this, we first build up a string from the carriage return and line feed characters. Since we can't type them directly, we send the unary instance-creation messages cr and lf to the Character class (i.e., object). We then convert each of the resulting characters to a string by sending it the asString message, and concatenate the two strings with the binary , message:

Character cr asString , Character lf asString
  

You can inspect the resulting string, but as it consists purely of whitespace, it is more informative to inspect the Items, Tree or Raw views.

Finally we ask this whitespace string to split the words string into a collection by sending it the split: keyword message.

wordList := (Character cr asString , Character lf asString) split: words.
  

The result is an instance of the OrderedCollection class.
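
Since the real word list is very large, it can help to try the same split on a tiny string first. This is only a sketch with a made-up three-word text; the names crlf, text and parts are illustrative:

```smalltalk
| crlf text parts |
crlf := Character cr asString , Character lf asString.
text := 'apple' , crlf , 'pear' , crlf , 'plum'.
"the separator string is the receiver; the text to split is the argument"
parts := crlf split: text.
parts asArray
```

Inspecting the result should show the three words 'apple', 'pear' and 'plum' as separate strings.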

Building the dictionary

Our next step is to build up a dictionary from the list of words. We first create a new Dictionary by sending the unary new message to the Dictionary class.

wordDict := Dictionary new
  

We then iterate over all the words in the wordList collection, adding each word to a list of words with the same key:

wordList do: [:word |
	key := word sorted.
	wordDict
		at: key
		ifPresent: [:v | v addLast: word ]
		ifAbsentPut: [ { word } asOrderedCollection]
	].
wordDict
  

This code requires some explanation.

Everything happens by sending messages

Here we see the power of the Smalltalk object model. Instead of the language providing built-in iterators, everything happens by sending messages. We send the keyword message do: to the wordList collection, with a block (lambda) as its argument.

A block has the general form:

[ : argument ... | statements ]

There may be any number of arguments (including none), and any number of statements. The value of a block, when evaluated, is the value produced by its last statement.
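
A block does nothing until it is evaluated by sending it value, value:, value:value: and so on, depending on its number of arguments. The sketch below uses made-up block names to show blocks with zero, one and two arguments, and a block with two statements:

```smalltalk
| constant double sum lastWins |
constant := [ 42 ].
double := [ :x | x * 2 ].
sum := [ :a :b | a + b ].
"two statements: the block's value is that of the last one"
lastWins := [ :x | | y | y := x + 1. y * 10 ].
{ constant value. double value: 21. sum value: 3 value: 4. lastWins value: 5 }
```

Inspecting the result should show 42, 42, 7 and 60.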

In this case the do: method will simply apply the block to each word in the list. For each word we sort the letters of the word to produce a key.

For example, 'gameses' sorted yields 'aeegmss':

'gameses' sorted
  

Then we send the wordDict dictionary the keyword message at:ifPresent:ifAbsentPut:, which takes three arguments, a key and two blocks. If the key is already present in the dictionary, we simply add the word to the collection stored at that key. If the key is new, we add a new collection containing just that word.
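
To watch both branches fire, the same loop can be run on a handful of words. This is a sketch with a made-up four-word list; demo and group are illustrative names, and OrderedCollection with: is used here as one alternative way to create a one-element collection:

```smalltalk
| demo |
demo := Dictionary new.
#('listen' 'google' 'silent' 'enlist') do: [ :word |
	demo
		at: word sorted
		ifPresent: [ :group | group addLast: word ]           "key seen before"
		ifAbsentPut: [ OrderedCollection with: word ] ].      "first word for this key"
(demo at: 'listen' sorted) asArray
```

The dictionary ends up with two keys, and the lookup should return 'listen', 'silent' and 'enlist' in the order they were added.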

To learn more about the different kinds of control structures provided by the Smalltalk environment (as opposed to the language itself), see: Understanding Smalltalk control structures.

Different kinds of collections

Here we see the built-in Pharo syntax for an array:

{ 'howdy' }
  

Since we don't want an Array but an OrderedCollection, we send the message asOrderedCollection to the array.
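
The reason for the conversion is that an Array has a fixed size, whereas an OrderedCollection can grow. A minimal sketch, with illustrative variable names:

```smalltalk
| fixed growable |
fixed := { 'howdy' }.                      "an Array with exactly one slot"
growable := fixed asOrderedCollection.     "a new, growable collection"
growable addLast: 'hello'.
{ fixed size. growable size. growable last }
```

Inspecting the result should show 1, 2 and 'hello': the original array is untouched, while the ordered collection has grown.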

There are many different kinds of collections in Pharo. To learn more about them, see: Working with collections in Pharo.

Working with the dictionary

If we inspect the resulting dictionary, we see that each key is a string of sorted characters, associated with a collection of English words containing the same characters.

wordDict
  

Now we can easily unscramble a word by sorting its characters to produce its key, and then looking the key up in the dictionary:

wordDict at: 'gameses' sorted
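
Note that at: signals an error if the key is missing, which will happen for any scrambled input that is not an anagram of a word in the list. One possible way to handle that, sketched here on a tiny stand-in dictionary (demo and lookup are illustrative names), is to use at:ifAbsent: and fall back to an empty array:

```smalltalk
| demo lookup |
demo := Dictionary new.
demo at: 'message' sorted put: (OrderedCollection with: 'message').
"answer the matching words, or an empty array when there are none"
lookup := [ :scrambled | demo at: scrambled sorted ifAbsent: [ #() ] ].
{ (lookup value: 'gameses') asArray. (lookup value: 'zzz') asArray }
```

The first lookup should find 'message'; the second should return an empty array instead of raising an error.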
  

An exercise

How would you find the largest set of words all with the same letters?

Hint. To find the longest word in the dictionary, you can evaluate the following:

wordDict at: (wordDict keys sort: [:a :b | a size > b size ] ) first
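
The two-argument block in the hint is a sort block: it answers true when its first argument should come before the second. Here is a small sketch of the same block on a made-up array, using sorted: (which answers a sorted copy) rather than sort: so that the literal array is not modified:

```smalltalk
| byLengthDescending |
byLengthDescending := #('pear' 'fig' 'banana') sorted: [ :a :b | a size > b size ].
{ byLengthDescending. byLengthDescending first }
```

The longest string, 'banana', should come first, followed by 'pear' and 'fig'.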
  

Next: working with classes and methods

Working with snippets is fine, but ultimately working with Smalltalk means designing your own classes. To see how live programming differs from programming with conventional languages, see: A gentle introduction to classes and methods in Smalltalk, where we will turn this code into a simple class-based solution.