Iceberg model and inner workings

Details about the inner workings of Iceberg, to understand how it commits, merges, and handles conflicts.

Iceberg contains multiple IceRepository instances (Object << #IceRepository slots: { #name . #workingCopy . #index . #commitsInPackageCache }; sharedVariables: { #Registry . #RepositoryClass }; tag: 'Core'; package: 'Iceberg'). They are stored inside a registry.

icebergRepository := IceRepository  registry 
  detect: [ :each | each name = 'gtoolkit' ].
  

A repository has an IceWorkingCopy (Object << #IceWorkingCopy slots: { #repository . #packages . #referenceCommit . #shouldIgnoreNotifications . #project . #properties }; tag: 'Core'; package: 'Iceberg') that manages the status of all code loaded from the repository. It keeps track of what is loaded into the image (packages, commits, etc.).

workingCopy := icebergRepository workingCopy
  

The working copy maintains a list of IcePackage objects (Object << #IcePackage slots: { #package . #isDirty . #workingCopy }; tag: 'Core'; package: 'Iceberg') from the repository.

workingCopy packages
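
Because every IcePackage carries an isDirty flag (see its slots above), the packages with changes that are not yet committed can be found by filtering on that flag. The following is a minimal sketch, assuming an image with Iceberg loaded and a repository registered under the name 'gtoolkit' as in the earlier lookup; the variable dirtyPackages is introduced here only for illustration.

```smalltalk
"Same lookup as earlier in this page; assumes a repository named 'gtoolkit' is registered."
icebergRepository := IceRepository registry
	detect: [ :each | each name = 'gtoolkit' ].
workingCopy := icebergRepository workingCopy.

"Keep only the packages whose isDirty flag is set."
dirtyPackages := workingCopy packages select: [ :each | each isDirty ].
```

Inspecting dirtyPackages shows which packages would contribute changes to the next commit.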
  

A working copy can also be in several states. The state is computed using IceWorkingCopy>>#workingCopyState:

workingCopyState
	"The working copy can be in different states depending on the repository and the package.
	It is the working copy state reponsibility to decide wether we can commit, if we are on a merge, and so on...
	The working copy state can be obtained through the message #workingCopyState.
		workingCopy workingCopyState.
	The working copy state is calculated every time that it is called. This is because the state of the repository can be modified from outside the system (e.g., the command line or another tool). In any case, calculating the working copy state is fast enough to be executed on-line even for big repositories such as Pharo's.
	The working copy state is calculated from the status of each of its packages. It was decided like this because it may happen that somebody downloads a package from different commits. If this situation changes in the future, this is a good point for simplification."

	"This method obtains the head commit once and sends it as argument as an optimization. This is because asking for the head commit is expensive. Check the commits of #packageState"

	referenceCommit isCollection
		ifTrue: [ ^ IceInMergeWorkingCopy repository: repository ].
	referenceCommit isUnknownCommit
		ifTrue: [ ^ IceUnknownVersionWorkingCopy repository: repository ].
	referenceCommit isNoCommit
		ifTrue: [ ^ IceEmptyWorkingCopy repository: repository ].
	^ IceAttachedSingleVersionWorkingCopy repository: repository

workingCopy workingCopyState
  

States are subclasses of IceWorkingCopyState (Object << #IceWorkingCopyState slots: { #repository }; tag: 'WorkingCopy'; package: 'Iceberg'):

- IceInMergeWorkingCopy (IceWorkingCopyState << #IceInMergeWorkingCopy slots: {}; tag: 'WorkingCopy'; package: 'Iceberg'): indicates that a merge is in progress.

- IceUnknownVersionWorkingCopy (IceWorkingCopyState << #IceUnknownVersionWorkingCopy slots: {}; tag: 'WorkingCopy'; package: 'Iceberg'): indicates that the reference commit is an unknown commit.

- IceEmptyWorkingCopy (IceWorkingCopyState << #IceEmptyWorkingCopy slots: {}; tag: 'WorkingCopy'; package: 'Iceberg'): indicates that the reference commit is an IceNoCommit (IceCommitish << #IceNoCommit slots: {}; tag: 'Core'; package: 'Iceberg').

- IceAttachedSingleVersionWorkingCopy (IceWorkingCopyState << #IceAttachedSingleVersionWorkingCopy slots: {}; tag: 'WorkingCopy'; package: 'Iceberg'): indicates that the reference commit is a valid commit.
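
The mapping from reference commit to state class can be retraced by hand. The sketch below mirrors the checks from the #workingCopyState source quoted above, in the same order, and compares the result with what the working copy actually answers. It assumes an image with Iceberg loaded and a repository named 'gtoolkit'; expectedStateClass is a name introduced only for this illustration.

```smalltalk
"Assumes a repository named 'gtoolkit' is registered, as in the earlier lookup."
workingCopy := (IceRepository registry
	detect: [ :each | each name = 'gtoolkit' ]) workingCopy.
referenceCommit := workingCopy referenceCommit.

"Apply the same tests as #workingCopyState, first match wins."
expectedStateClass := referenceCommit isCollection
	ifTrue: [ IceInMergeWorkingCopy ]
	ifFalse: [ referenceCommit isUnknownCommit
		ifTrue: [ IceUnknownVersionWorkingCopy ]
		ifFalse: [ referenceCommit isNoCommit
			ifTrue: [ IceEmptyWorkingCopy ]
			ifFalse: [ IceAttachedSingleVersionWorkingCopy ] ] ].

"Compare with the state computed by Iceberg itself."
workingCopy workingCopyState class = expectedStateClass
```

If the method in the image matches the source shown here, the last expression should answer true; a different answer would suggest that the image carries a patched version of #workingCopyState.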

Overall a repository can be in several states:

- Unknown commit: The head commit of the repository is an IceUnknownCommit (IceCommitish << #IceUnknownCommit slots: { #id . #datetime }; tag: 'Core'; package: 'Iceberg'). This can be a repository created as a placeholder. A fetch could be required to load the actual repository.

- Detached Working Copy: The head commit in the repository is not the same as the head commit inside the image

- Detached HEAD: The head of the repository is a commit instead of a branch. When cloning a repository, if the latest version is cloned, head is set to a branch. If a particular commit is cloned, then the head points to that commit, and Iceberg considers the repository detached.

- No Project Found: Missing setting for configuring the location of the code and its format

- Not loaded: No code loaded

- Uncommitted changes with outgoing or incoming commits.

Changes are committed based on an IceDiff (Object << #IceDiff slots: { #tree . #source . #target . #writerClass . #mergedTree }; tag: 'Changes'; package: 'Iceberg'). A diff contains the changes between two Iceberg commitish objects. This can be a diff between two commits, or a diff between the current working copy and a commit.

workingCopy diffToReferenceCommit
  

For the rest of the demo we define the source and target commitish, using one of the following alternatives:

sourceCommitish := workingCopy.
targetCommitish := workingCopy referenceCommit 
  
sourceCommitish := workingCopy.
targetCommitish := workingCopy referenceCommit ancestors first
  
sourceCommitish := workingCopy referenceCommit.
targetCommitish := workingCopy referenceCommit ancestors first
  
sourceCommitish diffTo: targetCommitish
  

To compute a diff (from the class comment):

- The repository is asked for the list of changed files/packages between the two versions. These are obtained, for example, from the Monticello dirty flags and the list of modified files provided by Git.

- The first step in computing a diff is to detect the type of changes between the source and the destination. Each one is a high-level change, a subclass of IceChange (Object << #IceChange slots: {}; tag: 'Changes'; package: 'Iceberg'), indicating the type of entity that changed (package) and where the change occurred (git or image):

- IceImageChange (IceChange << #IceImageChange slots: { #package }; tag: 'Changes'; package: 'Iceberg'): a change coming from the image (in contrast to a change coming from git)

- IceGitChange (IceChange << #IceGitChange slots: { #filePathString }; tag: 'Changes'; package: 'Iceberg-Libgit'): a change coming from git (in contrast to a change coming from the image)

- IceProjectChange (IceChange << #IceProjectChange slots: {}; tag: 'Project'; package: 'Iceberg'): the fact that the project changed

- IceCypressPropertiesChange (IceChange << #IceCypressPropertiesChange slots: {}; tag: 'Changes'; package: 'Iceberg')

changes := sourceCommitish changesTo: targetCommitish
  

- Based on these high-level changes, the diff calculates two trees of IceDefinition objects (Object << #IceDefinition slots: { #name }; tag: 'Changes'; package: 'Iceberg'). Those trees are represented as compositions of IceNode objects (IceAbstractNode << #IceNode slots: { #parent . #childrenDictionary . #value }; tag: 'Changes'; package: 'Iceberg'). These definitions are the logical entities at the level of the code model.

When committing, a diff is made between the working copy and the reference commit. The same mechanism could be used to get a diff between the working copy and another commit.

One type of diff is between two commits. In this case Iceberg performs the diff at the file level.

Every file that changes is modeled as an IceGitChange. The importer IceChangeImporter (Object << #IceChangeImporter slots: { #parentNode . #diff . #version }; tag: 'Changes'; package: 'Iceberg') has a dedicated method, #visitGitChange:, that uses an IceGitChangeImporter (Object << #IceGitChangeImporter slots: { #path . #diff . #version }; tag: 'Changes'; package: 'Iceberg-Libgit') to create nodes with the appropriate definitions:

visitGitChange: anIceGitChange
	| importer |
	importer := IceGitChangeImporter new
		path: anIceGitChange path;
		diff: diff;
		version: version;
		yourself.
	importer importOn: parentNode.

The IceGitChangeImporter creates an IceDirectoryDefinition (IceFileSystemDefinition << #IceDirectoryDefinition slots: {}; tag: 'Changes'; package: 'Iceberg') or an IceFileDefinition (IceFileSystemDefinition << #IceFileDefinition slots: { #contents }; tag: 'Changes'; package: 'Iceberg') for all normal files, until it encounters a folder that contains a package. In that case it creates an MCSnapshot (Object << #MCSnapshot slots: { #definitions . #classDefinitionCache }; tag: 'Base'; package: 'Monticello') from the version of that package inside the commit and uses an IceMCPackageImporter (Object << #IceMCPackageImporter slots: { #version . #package }; tag: 'Changes'; package: 'Iceberg') to create Iceberg definitions for the content of that package. These definitions are created using IceMCDefinitionImporter (Object << #IceMCDefinitionImporter slots: { #packageNode . #snapshot }; tag: 'Changes'; package: 'Iceberg').

To determine the list of IceGitChange objects between two versions, Iceberg gets the tree of files in each commit and the diff between them using LibGit.

icebergRepository  
	changedFilesBetween: sourceCommitish and: targetCommitish.
  
fromTree := (LGitCommit 
	of: icebergRepository repositoryHandle 
	fromHexString: sourceCommitish id) tree.
  
toTree := (LGitCommit 
	of: icebergRepository repositoryHandle 
	fromHexString: targetCommitish id) tree.
  
gitTreeDiff := fromTree diffTo: toTree.
  
gitTreeDiff files collect: [ :each | IceGitChange on: each ]
  
diff := IceDiff new
	sourceVersion: sourceCommitish;
	targetVersion: targetCommitish;
	yourself
  
leftTree := IceNode value: IceRootDefinition new.
changes do: [ :aChange | 
	aChange accept: (IceChangeImporter new
		version: sourceCommitish;
		diff: diff;
		parentNode: leftTree;
		yourself) ].
leftTree
  
rightTree := IceNode value: IceRootDefinition new.
changes do: [ :change | 
	change accept: (IceChangeImporter new
		version: targetCommitish;
		diff: diff;
		parentNode: rightTree;
		yourself) ].
rightTree
  

- Then, the two trees are diff'd with IceDiff>>#diff:with:, and a tree of differences is obtained. This tree is also a composition of IceNode objects, but contains IceOperation objects (Object << #IceOperation slots: { #definition }; tag: 'Changes'; package: 'Iceberg') instead (additions, deletions and modifications).

diff: leftTree with: rightTree
	^ (self mergedTreeOf: leftTree with: rightTree)
		select: [ :operation | operation hasChanges ]

mergedTree := diff mergedTreeOf: leftTree with: rightTree.
  
tree := mergedTree select: [ :operation | operation hasChanges ].
  

The main entry point for performing a commit is IceWorkingCopy>>#commitChanges:withMessage:force:.

commitChanges: aDiff withMessage: message force: forcing
	"Creates a commit with the given changes using the comment given as argument.
	The forcing parameter allows to create an empty commit. This is used by the merge.
	NOTICE that commits can only be done if the following is true:
	- HEAD is a branch
	- the working copy reference commit is the same commit as #headCommit"

	| newCommit |
	self validateCanCommit.
	self repository index
		updateDiskWorkingCopy: aDiff;
		updateIndex: aDiff.
	(forcing not and: [ repository index isEmpty ])
		ifTrue: [ IceNothingToCommit signal ].
	newCommit := self repository
		commitIndexWithMessage: message
		andParents: (self workingCopyState referenceCommits
			reject: [ :each | each isNoCommit ]).
	self referenceCommit: newCommit.
	self refreshDirtyPackages.
	^ newCommit

This method:

- writes changes to the in-image git index. Code is written to the index only when committing, not when the user is typing the code or saving the image.

- performs a commit with the changes in the index

This both writes the actual code changes to the on-disk git index and does the commit.

fullDiff := IceDiff 
	from: sourceCommitish
	to: targetCommitish
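
To see the entry point in action, it can be called with the diff between the working copy and its reference commit. This is only a sketch under stated assumptions: an image with Iceberg loaded, a repository named 'gtoolkit', and a working copy that passes validateCanCommit. Evaluating it creates a real commit when there are changes, so it is best tried on a scratch repository. The commit message is a placeholder.

```smalltalk
"CAUTION: this creates a real commit in the repository when changes exist."
workingCopy := (IceRepository registry
	detect: [ :each | each name = 'gtoolkit' ]) workingCopy.

"With force: false, an empty index makes the method signal IceNothingToCommit.
That case is handled here by answering nil instead of a commit."
newCommit := [ workingCopy
		commitChanges: workingCopy diffToReferenceCommit
		withMessage: 'Example commit'
		force: false ]
	on: IceNothingToCommit
	do: [ :error | nil ].
```

When a commit is created, the method source above shows that it also becomes the new reference commit of the working copy. Errors raised by validateCanCommit are deliberately not handled in this sketch.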
  

To perform a commit, code changes are written by Iceberg directly to the in-image git index. In the image this is an instance of IceGitIndex (IceIndex << #IceGitIndex slots: { #modifiedFilePaths }; tag: 'Core'; package: 'Iceberg-Libgit'). It maintains a list of modified file paths and can write them to the git index.

icebergRepository index
  

IceGitIndex>>#updateDiskWorkingCopy: uses an IceGitWorkingCopyUpdateVisitor (IceTreeVisitor << #IceGitWorkingCopyUpdateVisitor slots: { #repository . #diff . #index }; tag: 'Commit'; package: 'Iceberg-Libgit') to write the code changes to disk, without changing the in-image index:

updateDiskWorkingCopy: anIceDiff
	anIceDiff tree
		accept: (IceGitWorkingCopyUpdateVisitor new
			repository: repository;
			index: self;
			diff: anIceDiff)

icebergRepository index updateDiskWorkingCopy: fullDiff
  

IceIndex>>#updateIndex: adds the changed locations to the in-image git index:

updateIndex: anIceDiff
	anIceDiff tree
		accept: (IceIndexUpdateVisitor new
			index: self;
			diff: anIceDiff)

icebergRepository index updateIndex: fullDiff
  

After the changes are written to disk and the in-image git index is updated, a commit can be done. The method responsible is IceRepository>>#commitIndexWithMessage:andParents: and its source is:

commitIndexWithMessage: message andParents: parentCommitishList
	"Low level. Commit what is saved in the index"

	| newCommit |
	newCommit := index commitWithMessage: message andParents: parentCommitishList.
	index := self newIndex.
	^ newCommit

newCommit := icebergRepository
	commitIndexWithMessage: 'Example commit'
	andParents: (workingCopy workingCopyState 
		referenceCommits reject: [ :each | each isNoCommit ]).
  

First, changes are written to disk using IceGitIndex>>#addToGitIndex:

addToGitIndex
	repository addFilesToIndex: modifiedFilePaths

Second, a new empty index is created and installed in the repository.

Third, the state of the working copy and of all packages is updated based on the new commit.

The merge between two versions is implemented by IceMerge (Object << #IceMerge slots: { #mergeTree . #repository . #mergeCommit . #imageCommit . #changesToWorkingCopyTree }; tag: 'Changes'; package: 'Iceberg'). It computes a merge tree with the changes that should be applied during the merge. The tree contains IceNode objects as nodes, whose values are instances of subclasses of IceOperationMerge (Object << #IceOperationMerge slots: { #chosen }; tag: 'Changes'; package: 'Iceberg'). There are only two such types of operations:

- IceConflictingOperation (IceOperationMerge << #IceConflictingOperation slots: { #leftOperation . #rightOperation }; tag: 'Changes'; package: 'Iceberg'): a conflict between two operations that can be solved by using #selectLeft (chosen := leftOperation) or #selectRight (chosen := rightOperation).

- IceNonConflictingOperation (IceOperationMerge << #IceNonConflictingOperation slots: { #operation }; tag: 'Changes'; package: 'Iceberg'): a non-conflict between two operations that can be solved automatically. The user can still override the automatic choice using #selectLeft and #selectRight.

otherBranch := icebergRepository branchNamed: 'release'.
  
mergeAction := IceMerge new
	repository: icebergRepository;
	mergeCommit: otherBranch commit;
	yourself
  

The commit in case of a merge uses the same logic as a normal user commit: the method IceWorkingCopy>>#commitChanges:withMessage:force: is used here as well. The difference is that the first parameter is now an instance of IceMerge instead of an IceDiff. The visitors behind the commit logic, IceGitWorkingCopyUpdateVisitor and IceIndexUpdateVisitor (IceTreeVisitor << #IceIndexUpdateVisitor slots: { #diff . #index }; tag: 'Commit'; package: 'Iceberg-Libgit'), can visit both normal IceOperation objects and IceOperationMerge objects.

Iceberg registers to system announcers in IceSystemEventListener>>#registerSystemAnnouncements and triggers an IceRepositoryModified event (IceRepositoryAnnouncement << #IceRepositoryModified slots: {}; tag: 'Announcements'; package: 'Iceberg') for the Iceberg repository that should be updated:

registerSystemAnnouncements
	self unregisterSystemAnnouncements.
	SystemAnnouncer uniqueInstance weak
		when: ClassAnnouncement send: #handleClassChange: to: self;
		when: MethodAnnouncement send: #handleMethodChange: to: self;
		when: ClassTagAnnouncement send: #handlePackageChange: to: self;
		when: MCVersionLoaderStopped send: #handleVersionLoaded: to: self.

To detect which repository should be updated, Iceberg traverses the repositories looking for one that contains the changed package.
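
The traversal can be approximated from a playground. The sketch below is not Iceberg's own lookup code; it only illustrates the idea, using the registry and the #includesInWorkingCopyPackageNamed: test that appears in the working copy's notification code. The package name is a hypothetical placeholder, and changedPackageName and owningRepository are names introduced for this illustration.

```smalltalk
"Hypothetical package name, used only for illustration."
changedPackageName := 'GToolkit-Coder'.

"Answer the first registered repository whose working copy includes the package,
or nil when no repository claims it."
owningRepository := IceRepository registry
	detect: [ :each |
		each workingCopy includesInWorkingCopyPackageNamed: changedPackageName ]
	ifNone: [ nil ].
```

A nil result means the changed package is not managed by any repository known to Iceberg, in which case no repository needs to be updated.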

If the package is loaded into the image, it is marked as dirty in IceWorkingCopy>>#notifyPackageModified:.

notifyPackageModified: aString
	<gtPharoPatch: #Pharo>
	self flag: #pharoTodo.
	"we cannot use #includesPackageNamed: as is because it can happen that a package is present in a commit but not in image yet?"
	self shouldIgnoreNotifications ifTrue: [ ^ false ].
	(self includesInWorkingCopyPackageNamed: aString)
		ifTrue: [ | package |
			package := self packageNamed: aString.
			package isDirty ifFalse: [ package beDirty ].
			^ true ].
	^ false
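
The boolean answered by #notifyPackageModified: tells the caller whether the working copy took responsibility for the package. A side-effect-free way to observe this is to send it a name that does not belong to the working copy. This is a sketch, assuming an image with Iceberg loaded and a repository named 'gtoolkit'; the package name is made up and assumed not to exist in that repository, and wasHandled is a name introduced for this illustration.

```smalltalk
workingCopy := (IceRepository registry
	detect: [ :each | each name = 'gtoolkit' ]) workingCopy.

"The name below is a placeholder that is assumed to match no package of the
working copy, so no package is marked dirty and the method answers false."
wasHandled := workingCopy
	notifyPackageModified: 'Hypothetical-Package-Not-In-Repository'.
```

Sending the same message with the name of a package that is part of the working copy answers true and marks that package as dirty, unless shouldIgnoreNotifications is set.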