Understanding Pharo built-in data types

TL;DR

We provide an overview of the built-in data types for which Pharo provides dedicated syntax, namely booleans, numbers, characters, strings, symbols, arrays and blocks.

Pseudo variables

Smalltalk has just six reserved keywords, each of which represents a particular object value.

nil represents the default value of an uninitialized variable, and is the unique instance of the class UndefinedObject. You can test whether a variable is nil by sending it #isNil:

nil isNil
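
As a small sketch of the "uninitialized variable" case (the temporary name v is only illustrative), a variable that has been declared but never assigned already holds nil:

```smalltalk
| v |
v isNil. "true: v was declared but never assigned"
v class = UndefinedObject. "true: the class of nil"
v == nil "true: there is only one nil object"
```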
  

true and false are booleans, described next.

self and super are used in methods, and both represent the receiver. The difference is the way in which methods are looked up when they are sent messages. For details, see Understanding self and super.

thisContext represents the reification of the run-time stack. For details, see Understanding reflection.

Booleans

In Smalltalk, everything is an object, even booleans.

true and false are the unique instances, respectively, of the classes True and False, both subclasses of Boolean.

Boolean consists mostly of abstract methods that are implemented differently by each of its subclasses, and generic methods that are the same for both.

Compare how true and false each respond to the & message:

true & false
  
false & true
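
To make the difference concrete, here is a hedged sketch. The two comments paraphrase what the implementations in True and False amount to in a standard Pharo image (check the actual source in your own image), and the expressions below them exercise both:

```smalltalk
"True >> & aBoolean    answers the argument:  ^ aBoolean"
"False >> & aBoolean   answers the receiver:  ^ self"
(true & true) = true.
(true & false) = false.
(false & true) = false. "False never needs to inspect its argument"
(true & false) class = False
```

Neither implementation needs a conditional: the choice is made by method lookup on the receiver's class.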
  

Numbers

Numbers are also organized as a hierarchy, with Number as the abstract root.

Smalltalk provides dedicated syntax for several kinds of numbers. Integers can be expressed either in plain decimal or radix syntax. Here we see decimal as well as binary and hexadecimal notations:

42 = 2r101010
  
42 = 16r2a
  

The largest value a SmallInteger can hold is:

SmallInteger maxVal
  

However, if you increment this value, Pharo automatically converts it to a LargeInteger:

SmallInteger maxVal + 1
  

Decrementing this value will, of course, result in a SmallInteger again.
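
A minimal sketch of that round trip (the variable name big is just for illustration):

```smalltalk
big := SmallInteger maxVal + 1.
big class = SmallInteger. "false: the value no longer fits in a SmallInteger"
(big - 1) class = SmallInteger. "true: the result is normalized back"
(big - 1) = SmallInteger maxVal "true"
```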

Smalltalk can happily deal with very large integers:

1000 factorial asString size
  

Floats can be expressed either using plain floating point or scientific notation:

12345.678 = 1.2345678e4
  

There is no dedicated syntax for fractions. Instead, the factory method / is used as a convenient way to create them:

3 / 4 = (Fraction numerator: 3 denominator: 4)
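
A few more expressions, offered as a sketch, show that the result of / is an exact value rather than a rounded float:

```smalltalk
(3 / 4) class = Fraction. "true"
(6 / 8) = (3 / 4). "true: fractions are kept in lowest terms"
((1 / 3) + (2 / 3)) = 1. "true: exact arithmetic, no rounding"
(4 / 2) class = SmallInteger "true: a whole result is answered as an integer"
```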
  

Characters, Strings and Symbols

Characters are prefixed by a dollar sign ($), while strings are delimited by single quotes ('):

$h = 'hello' first
  

Untypeable characters can be created by sending dedicated instance creation messages to the Character class:

'hi', Character space asString, 'there'
  

Note that we can concatenate strings by sending the binary #, message, while characters can be converted to strings by sending the #asString message.

We can compare the values of two objects with the #= message, and their identity with the #== message. Two strings can have the same value but be two separate objects:

'smalltalk' = ('small' , 'talk') "we have the same value"
  
('smalltalk' == ('small' , 'talk')) not "but different identity"
  

Symbols are similar to strings, but start with a hash (#). They have the added property that they are globally unique.

#smalltalk == ('small' , 'talk') asSymbol "there is only one of us"
  

Literal and dynamic arrays

Arrays are a very basic form of SequenceableCollection, which is part of the Collection hierarchy.

If you need a list of objects, you will normally work with a growable OrderedCollection, but fixed-size Arrays are also convenient, as Pharo provides dedicated syntax for both literal and dynamic arrays.

Literal arrays are computed at parse time. They are expressed as a hash (#) followed by a list of literal values enclosed in parentheses:

#( 1 2 'hello' #($a $b) 3 / 4)
  

Note that, although we can have nested literal arrays, no expressions are dynamically evaluated, so 3, /, and 4 are treated as three separate literals.

#( 1 2 'hello' #($a $b) 3 / 4) size = 7
  

Dynamic arrays are evaluated at run time. They consist of a sequence of expressions delimited by curly braces and separated by periods:

{ 1 . 2 . 'hello' . #($a $b) . 3 / 4 }
  
{ 1 . 2 . 'hello' . #($a $b) . 3 / 4 } size = 5
  

It is common to use array syntax to build a list, and then send it #asOrderedCollection to convert it:

#( 1 1 2 3 5 8) asOrderedCollection
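
The point of the conversion is that the result can grow. A short sketch (the variable name fib is illustrative):

```smalltalk
fib := #( 1 1 2 3 5 8 ) asOrderedCollection.
fib add: 13. "an OrderedCollection can grow; a fixed-size Array cannot"
fib size = 7. "true"
fib last = 13 "true"
```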
  

Blocks

Blocks are Smalltalk's anonymous functions, or lambdas. They are expressed as a sequence of statements enclosed in square brackets.

A block is evaluated by sending it the message #value:

[ 3 + 4 ] value = 7
  

Blocks may also declare a number of formal parameters, as well as local temporary variables:

sum := [ :x :y | "two arguments"
			| z | "a local temporary variable"
			z := x + y.
			z "the value of the last expression is returned"
			].
sum value: 3 value: 4
  

To evaluate a block with one argument, send it #value:. To evaluate one with two arguments, send it #value:value:, and so on.
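
As a brief sketch of this rule (the block names double and add are illustrative), the evaluation message must match the number of declared parameters, which a block can also report itself:

```smalltalk
double := [ :n | n * 2 ].
(double value: 21) = 42. "one argument, so #value:"
add := [ :a :b | a + b ].
(add value: 3 value: 4) = 7. "two arguments, so #value:value:"
add numArgs = 2 "a block knows how many arguments it expects"
```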

A block is a closure, i.e., it may capture any variables in its lexical environment. Even if the block is passed out to other environments, it continues to hold references to any of the captured variables:

x := 1.
inc := [ x := x + 1 ]. "capture x from the environment"
x assert: x = 1. "so far no side effect"
inc value. "update the captured x"
x assert: x = 2 "now we see the change"
  

Caveat: Although the return operator (^) may be used within a block, its effect is not to return from the block, but to return from the enclosing method.
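
As a sketch of this caveat, consider a hypothetical method firstEvenIn: (the name, and whichever class you install it in, are assumptions made purely for illustration). The ^ inside the inner block ends the whole method as soon as an even element is found:

```smalltalk
firstEvenIn: aCollection
	"Answer the first even element of aCollection, or nil if there is none."
	aCollection do: [ :each |
		each even ifTrue: [ ^ each ] ]. "returns from firstEvenIn:, not just from the block"
	^ nil "reached only if no block evaluation returned"
```

Once the method is installed in some class, sending firstEvenIn: with the argument #(1 3 4 5 6) to an instance would answer 4, and the remaining elements would never be visited.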