Iceberg model and inner workings
Details about the inner workings of Iceberg, to understand how it commits, merges, and handles conflicts.
Basic Iceberg Model
Iceberg contains multiple IceRepository
Object subclass: #IceRepository
instanceVariableNames: 'name workingCopy index commitsInPackageCache'
classVariableNames: 'Registry RepositoryClass'
package: 'Iceberg-Core'
. They are stored inside a registry.
icebergRepository := IceRepository registry
detect: [ :each | each name = 'gtoolkit' ].
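The lookup above raises an error when no repository with that name is registered. A minimal defensive sketch, assuming the registry answers a standard collection (so that detect:ifNone: is available) and reusing the 'gtoolkit' name from the example:

```smalltalk
"Look up a repository by name; answer nil instead of raising an error when it is missing."
icebergRepository := IceRepository registry
	detect: [ :each | each name = 'gtoolkit' ]
	ifNone: [ nil ].

"List the names of all registered repositories."
IceRepository registry collect: [ :each | each name ]
```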
A repository has an IceWorkingCopy
Object subclass: #IceWorkingCopy
instanceVariableNames: 'repository packages referenceCommit shouldIgnoreNotifications project properties'
classVariableNames: ''
package: 'Iceberg-Core'
that manages the status of all code loaded from the repository. It keeps track of what is loaded into the image (packages, commits, etc.).
workingCopy := icebergRepository workingCopy
The working copy maintains a list of IcePackage
Object subclass: #IcePackage
instanceVariableNames: 'package repository isDirty'
classVariableNames: ''
package: 'Iceberg-Core'
from the repo.
A working copy can also be in several states. The state is computed using IceWorkingCopy>>#workingCopyState
workingCopyState
"The working copy can be in different states depending on the repository and the package. It is the working copy state reponsibility to decide wether we can commit, if we are on a merge, and so on... The working copy state can be obtained through the message #workingCopyState.
workingCopy workingCopyState.
The working copy state is calculated every time that it is called. This is because the state of the repository can be modified from outside the system (e.g., the command line or another tool). In any case, calculating the working copy state is fast enough to be executed on-line even for big repositories such as Pharo's. The working copy state is calculated from the status of each of its packages. It was decided like this because it may happen that somebody downloads a package from different commits. If this situation changes in the future, this is a good point for simplification."
"This method obtains the head commit once and sends it as argument as an optimization.
This is because asking for the head commit is expensive.
Check the commits of #packageState"
referenceCommit isCollection
ifTrue: [ ^ IceInMergeWorkingCopy repository: repository ].
referenceCommit isUnknownCommit
ifTrue: [ ^ IceUnknownVersionWorkingCopy repository: repository ].
referenceCommit isNoCommit
ifTrue: [ ^ IceEmptyWorkingCopy repository: repository ].
^ IceAttachedSingleVersionWorkingCopy repository: repository
.
workingCopy workingCopyState
States are subclasses of IceWorkingCopyState
Object subclass: #IceWorkingCopyState
instanceVariableNames: 'repository'
classVariableNames: ''
package: 'Iceberg-WorkingCopy'
IceInMergeWorkingCopy
IceWorkingCopyState subclass: #IceInMergeWorkingCopy
instanceVariableNames: ''
classVariableNames: ''
package: 'Iceberg-WorkingCopy'
: indicates that a merge is in progress.
IceUnknownVersionWorkingCopy
IceWorkingCopyState subclass: #IceUnknownVersionWorkingCopy
instanceVariableNames: ''
classVariableNames: ''
package: 'Iceberg-WorkingCopy'
: indicates that the reference commit is an unknown commit.
IceEmptyWorkingCopy
IceWorkingCopyState subclass: #IceEmptyWorkingCopy
instanceVariableNames: ''
classVariableNames: ''
package: 'Iceberg-WorkingCopy'
: indicates that the reference commit is an IceNoCommit
IceCommitish subclass: #IceNoCommit
instanceVariableNames: ''
classVariableNames: ''
package: 'Iceberg-Core'
IceAttachedSingleVersionWorkingCopy
IceWorkingCopyState subclass: #IceAttachedSingleVersionWorkingCopy
instanceVariableNames: ''
classVariableNames: ''
package: 'Iceberg-WorkingCopy'
: indicates that the reference commit is a valid commit.
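For exploration in a playground, the current state can be inspected by looking at the class of the answered state object. This is only a sketch: it assumes the workingCopy variable defined earlier, and testing the class with isKindOf: is a shortcut for illustration rather than the way Iceberg itself dispatches on states.

```smalltalk
| state |
state := workingCopy workingCopyState.
"Answer a short description when a merge is in progress, otherwise the name of the state class."
(state isKindOf: IceInMergeWorkingCopy)
	ifTrue: [ 'merge in progress' ]
	ifFalse: [ state class name ]
```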
Overall a repository can be in several states:
- Unknown commit: The head commit of the repository is IceUnknownCommit
IceCommitish subclass: #IceUnknownCommit
instanceVariableNames: 'id datetime'
classVariableNames: ''
package: 'Iceberg-Core'
. This can be a repository created as a placeholder. A fetch could be required to load the actual repository.
- Detached Working Copy: The head commit in the repository is not the same as the head commit inside the image.
- Detached HEAD: The head of the repository is a commit instead of a branch. When cloning a repository, if the latest version is cloned, head is set to a branch. If a particular commit is cloned, then the head points to that commit, and Iceberg considers the repository detached.
- No Project Found: The setting that configures the location of the code and its format is missing.
- Not loaded: No code is loaded.
- Uncommitted changes with outgoing or incoming commits.
Creating diffs
Changes are committed based on IceDiff
Object subclass: #IceDiff
instanceVariableNames: 'tree source target writerClass mergedTree'
classVariableNames: ''
package: 'Iceberg-Changes'
. A diff contains the changes between two Iceberg commitish objects. This can be a diff between two commits, or a diff between the current working copy and a commit.
workingCopy diffToReferenceCommit
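For the rest of the demo we define the source and target commitish. The snippets below are alternative choices: the working copy against its reference commit, the working copy against the first ancestor of the reference commit, or the reference commit against its first ancestor.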
sourceCommitish := workingCopy.
targetCommitish := workingCopy referenceCommit
sourceCommitish := workingCopy.
targetCommitish := workingCopy referenceCommit ancestors first
sourceCommitish := workingCopy referenceCommit.
targetCommitish := workingCopy referenceCommit ancestors first
sourceCommitish diffTo: targetCommitish
Computing a diff involves the following steps (from the class comment):
- Ask the repository for the list of changed files/packages between the two versions. These are obtained, for example, from the Monticello dirty flags and the list of modified files provided by Git.
- The first step in computing a diff is to detect the type of changes between the source and the destination. Each one is a high-level change, an instance of a subclass of IceChange
Object subclass: #IceChange
instanceVariableNames: ''
classVariableNames: ''
package: 'Iceberg-Changes'
, indicating the type of entity that changed (package), and where the change occurred (git or image):
- IceImageChange
IceChange subclass: #IceImageChange
instanceVariableNames: 'package'
classVariableNames: ''
package: 'Iceberg-Changes'
: a change coming from the image (in contrast to a change coming from git)
- IceGitChange
IceChange subclass: #IceGitChange
instanceVariableNames: 'filePathString'
classVariableNames: ''
package: 'Iceberg-Libgit-Changes'
: a change coming from git (in contrast to a change coming from the image)
- IceProjectChange
IceChange subclass: #IceProjectChange
instanceVariableNames: ''
classVariableNames: ''
package: 'Iceberg-Project'
: the fact that the project changed
- IceCypressPropertiesChange
IceChange subclass: #IceCypressPropertiesChange
instanceVariableNames: ''
classVariableNames: ''
package: 'Iceberg-Changes'
changes := sourceCommitish changesTo: targetCommitish
- Based on these high-level changes, the diff calculates two trees of IceDefinition
Object subclass: #IceDefinition
instanceVariableNames: 'name'
classVariableNames: ''
package: 'Iceberg-Changes'
. Those trees are represented as compositions of IceNode
IceAbstractNode subclass: #IceNode
instanceVariableNames: 'parent childrenDictionary value'
classVariableNames: ''
package: 'Iceberg-Changes'
. These definitions are the logical entities at the level of the code model.
Diff between the working copy and a commit
When committing, a diff is made between the working copy and the reference commit. The same mechanism could be used to get a diff between the working copy and another commit.
Diff between two commits
One type of diff is between two commits. In this case Iceberg performs the diff at the file level.
Every file that changes is modeled as an IceGitChange
. The importer IceChangeImporter
Object subclass: #IceChangeImporter
instanceVariableNames: 'parentNode diff version'
classVariableNames: ''
package: 'Iceberg-Changes'
has a dedicated method, IceChangeImporter>>#visitGitChange:
visitGitChange: anIceGitChange
| importer |
importer := IceGitChangeImporter new
path: anIceGitChange path;
diff: diff;
version: version;
yourself.
importer importOn: parentNode.
that uses a dedicated IceGitChangeImporter
Object subclass: #IceGitChangeImporter
instanceVariableNames: 'path diff version'
classVariableNames: ''
package: 'Iceberg-Libgit-Changes'
to create nodes with the appropriate definitions.
The IceGitChangeImporter
creates IceDirectoryDefinition
IceFileSystemDefinition subclass: #IceDirectoryDefinition
instanceVariableNames: ''
classVariableNames: ''
package: 'Iceberg-Changes'
and IceFileDefinition
IceFileSystemDefinition subclass: #IceFileDefinition
instanceVariableNames: 'contents'
classVariableNames: ''
package: 'Iceberg-Changes'
for all normal files, until it encounters a folder that contains a package. In that case it creates an MCSnapshot
Object subclass: #MCSnapshot
instanceVariableNames: 'definitions classDefinitionCache'
classVariableNames: ''
package: 'Monticello-Base'
from the version of that package inside the commit and uses an IceMCPackageImporter
Object subclass: #IceMCPackageImporter
instanceVariableNames: 'version package'
classVariableNames: ''
package: 'Iceberg-Changes'
to create Iceberg definitions for the content of that package. These definitions are created using IceMCDefinitionImporter
Object subclass: #IceMCDefinitionImporter
instanceVariableNames: 'packageNode snapshot'
classVariableNames: ''
package: 'Iceberg-Changes'
.
To determine the list of IceGitChange
between two versions, Iceberg gets the tree of files in each commit and computes the diff between them using LibGit.
icebergRepository
changedFilesBetween: sourceCommitish and: targetCommitish.
fromTree := (LGitCommit
of: icebergRepository repositoryHandle
fromHexString: sourceCommitish id) tree.
toTree := (LGitCommit
of: icebergRepository repositoryHandle
fromHexString: targetCommitish id) tree.
gitTreeDiff := fromTree diffTo: toTree.
gitTreeDiff files collect: [ :each | IceGitChange on: each ]
Steps for performing the diff
diff := IceDiff new
sourceVersion: sourceCommitish;
targetVersion: targetCommitish;
yourself
leftTree := IceNode value: IceRootDefinition new.
changes do: [ :aChange |
aChange accept: (IceChangeImporter new
version: sourceCommitish;
diff: diff;
parentNode: leftTree;
yourself) ].
leftTree
rightTree := IceNode value: IceRootDefinition new.
changes do: [ :change |
change accept: (IceChangeImporter new
version: targetCommitish;
diff: diff;
parentNode: rightTree;
yourself) ].
rightTree
- Then, the two trees are diff'd (IceDiff>>#diff:with:
diff: leftTree with: rightTree
^ (self mergedTreeOf: leftTree with: rightTree)
select: [ :operation | operation hasChanges ]
), and a tree of differences is obtained. This tree is also a composition of IceNode
s, but contains IceOperation
Object subclass: #IceOperation
instanceVariableNames: 'definition'
classVariableNames: ''
package: 'Iceberg-Changes'
objects instead (additions, deletions and modifications).
mergedTree := diff mergedTreeOf: leftTree with: rightTree.
tree := mergedTree select: [ :operation | operation hasChanges ].
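The manual steps above can be compared with the convenience constructor IceDiff class>>#from:to:, which this page uses in the section on committing. The sketch below assumes that the constructor builds the same filtered tree and that the result is available through the #tree accessor (the accessor used by IceGitIndex>>#updateDiskWorkingCopy:); both points are assumptions, not something stated in the class comment.

```smalltalk
"Build the diff in one step and fetch its tree of IceOperation nodes."
fullDiff := IceDiff from: sourceCommitish to: targetCommitish.
fullDiff tree
```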
Committing changes
The main entry point for performing a commit is IceWorkingCopy>>#commitChanges:withMessage:force:
commitChanges: aDiff withMessage: message force: forcing
"Creates a commit with the given changes using the comment given as argument.
The forcing parameter allows to create an empty commit. This is used by the merge.
NOTICE that commits can only be done if the following is true:
- HEAD is a branch
- the working copy reference commit is the same commit as #headCommit"
| newCommit |
self validateCanCommit.
self repository index
updateDiskWorkingCopy: aDiff;
updateIndex: aDiff.
(forcing not and: [repository index isEmpty])
ifTrue: [ IceNothingToCommit signal ].
newCommit := self repository
commitIndexWithMessage: message
andParents: (self workingCopyState referenceCommits reject: [ :each | each isNoCommit ]).
^ newCommit
. This:
- writes changes to the in-image git index. Code is written to the index only when committing, not when the user is typing the code or saving the image.
- performs a commit with the changes in the index.
This both writes the actual code changes to the on-disk git index and performs the commit.
fullDiff := IceDiff
from: sourceCommitish
to: targetCommitish
To perform a commit, code changes are written by Iceberg directly to the in-image git index. In the image this is an instance of IceGitIndex
IceIndex subclass: #IceGitIndex
instanceVariableNames: 'modifiedFilePaths'
classVariableNames: ''
package: 'Iceberg-Libgit-Core'
. This maintains a list of modified file paths and can write them to the git index.
IceGitIndex>>#updateDiskWorkingCopy:
updateDiskWorkingCopy: anIceDiff
anIceDiff tree
accept:
(IceGitWorkingCopyUpdateVisitor new
repository: repository;
index: self;
diff: anIceDiff)
uses an IceGitWorkingCopyUpdateVisitor
IceTreeVisitor subclass: #IceGitWorkingCopyUpdateVisitor
instanceVariableNames: 'repository diff index'
classVariableNames: ''
package: 'Iceberg-Libgit-Commit'
to write the code changes to disk, without changing the in-image index.
icebergRepository index updateDiskWorkingCopy: fullDiff
IceIndex>>#updateIndex:
updateIndex: anIceDiff
anIceDiff tree
accept: (IceIndexUpdateVisitor new
index: self;
diff: anIceDiff).
adds the changed locations to the in-image git index.
icebergRepository index updateIndex: fullDiff
After changes are written to disk and the in-image git index is updated, a commit can be done. This happens in the method IceRepository>>#commitIndexWithMessage:andParents:
commitIndexWithMessage: message andParents: parentCommitishList
"Low level.
Commit what is saved in the index"
| newCommit |
newCommit := index commitWithMessage: message andParents: parentCommitishList.
index := self newIndex.
self workingCopy referenceCommit: newCommit.
self workingCopy refreshDirtyPackages.
^ newCommit
newCommit := icebergRepository
commitIndexWithMessage: 'Example commit'
andParents: (workingCopy workingCopyState
referenceCommits reject: [ :each | each isNoCommit ]).
First, changes are written to disk using IceGitIndex>>#addToGitIndex
addToGitIndex
repository addFilesToIndex: modifiedFilePaths.
Second, a new empty index is created and installed in the repository.
Third, the state of the working copy and of all packages is updated based on the new commit.
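Putting the pieces together, a commit from a playground could look like the sketch below. It only uses messages shown in this section; the commit message text is made up, and evaluating the snippet really creates a commit in the repository. IceNothingToCommit is signaled when the index is empty and forcing is false; other errors, for example from #validateCanCommit, are deliberately not handled here.

```smalltalk
"Commit everything that differs between the working copy and its reference commit."
| diffToCommit |
diffToCommit := workingCopy diffToReferenceCommit.
[ workingCopy
	commitChanges: diffToCommit
	withMessage: 'Example commit from playground'
	force: false ]
	on: IceNothingToCommit
	do: [ :error | error return: nil ]
```

The expression answers the new commit, or nil when there was nothing to commit.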
Merging
The merge between two versions is implemented by IceMerge
Object subclass: #IceMerge
instanceVariableNames: 'mergeTree repository mergeCommit imageCommit changesToWorkingCopyTree'
classVariableNames: ''
package: 'Iceberg-Changes'
. This computes a merge tree with the changes that should be applied during the merge. The tree contains as nodes IceNode
objects whose values are instances of subclasses of IceOperationMerge
Object subclass: #IceOperationMerge
instanceVariableNames: 'chosen'
classVariableNames: ''
package: 'Iceberg-Changes'
. There are only two such types of operations:
- IceConflictingOperation
IceOperationMerge subclass: #IceConflictingOperation
instanceVariableNames: 'leftOperation rightOperation'
classVariableNames: ''
package: 'Iceberg-Changes'
: a conflict between two operations that can be resolved by using IceConflictingOperation>>#selectLeft
selectLeft
chosen := leftOperation
and IceConflictingOperation>>#selectRight
selectRight
chosen := rightOperation
.
- IceNonConflictingOperation
IceOperationMerge subclass: #IceNonConflictingOperation
instanceVariableNames: 'operation'
classVariableNames: ''
package: 'Iceberg-Changes'
: a pair of operations that do not conflict and can be resolved automatically. The user can still override the automatic choice using #selectLeft and #selectRight.
otherBranch := icebergRepository branchNamed: 'release'.
mergeAction := IceMerge new
repository: icebergRepository;
mergeCommit: otherBranch commit;
yourself
The commit in case of a merge uses the same logic as a normal user commit: the method IceWorkingCopy>>#commitChanges:withMessage:force:, shown earlier in the section on committing changes, is used again. The difference is that the first parameter is now an instance of IceMerge
instead of an IceDiff
. The commit logic can visit both normal IceOperation
and IceOperationMerge
(IceGitWorkingCopyUpdateVisitor
and IceIndexUpdateVisitor
IceTreeVisitor subclass: #IceIndexUpdateVisitor
instanceVariableNames: 'diff index'
classVariableNames: ''
package: 'Iceberg-Libgit-Commit'
)
Keeping the model up to date
Iceberg registers to system announcers in IceSystemEventListener>>#registerSystemAnnouncements
registerSystemAnnouncements
self unregisterSystemAnnouncements.
SystemAnnouncer uniqueInstance weak
when: ClassAnnouncement send: #handleClassChange: to: self;
when: MethodAnnouncement send: #handleMethodChange: to: self;
when: ClassTagAnnouncement send: #handlePackageChange: to: self;
when: MCVersionLoaderStopped send: #handleVersionLoaded: to: self.
and triggers an IceRepositoryModified
IceRepositoryAnnouncement subclass: #IceRepositoryModified
instanceVariableNames: ''
classVariableNames: ''
package: 'Iceberg-Announcements'
event for the Iceberg repository that should be updated.
To detect which repository should be updated, Iceberg traverses the repositories looking for one that contains the changed package.
If the package is loaded into the image, it is marked as dirty in IceWorkingCopy>>#notifyPackageModified:
notifyPackageModified: aString
<gtPharoPatch: #Pharo>
self flag: #pharoTodo. "we cannot use #includesPackageNamed: as is because it can happen
that a package is present in a commit but not in image yet?"
self shouldIgnoreNotifications ifTrue: [ ^ false ].
(self includesInWorkingCopyPackageNamed: aString) ifTrue: [
| package |
package := self packageNamed: aString.
package isDirty ifFalse: [ package beDirty ].
^ true ].
^ false
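The dirty flag can be observed from a playground with the small sketch below. It uses only messages that appear in the method above and the workingCopy variable defined earlier; the package name is a placeholder and has to be replaced with a package that is loaded from the repository.

```smalltalk
| packageName |
packageName := 'MyPackage'. "hypothetical package name"
"Answer whether the package is currently marked dirty; answer false when it is not in the working copy."
(workingCopy includesInWorkingCopyPackageNamed: packageName)
	ifTrue: [ (workingCopy packageNamed: packageName) isDirty ]
	ifFalse: [ false ]
```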