GtRrRunnerLogTimingAnalyzer Developer Guide

Overview

GtRrRunnerLogTimingAnalyzer Object << #GtRrRunnerLogTimingAnalyzer slots: { #filename . #signals . #timingStats . #currentRun }; tag: 'JobStats'; package: 'RemoteRunner' reads remote runner log files and extracts timing statistics by processing beacon signals. It records measurements from each log into GtRrRunnerLogTimingRun Object << #GtRrRunnerLogTimingRun slots: { #sourceFile . #collectedAt . #pointsByName }; tag: 'JobStats'; package: 'RemoteRunner' objects and aggregates them in GtRrRunnerLogTimingStats Object << #GtRrRunnerLogTimingStats slots: { #stats . #timingDataset }; tag: 'JobStats'; package: 'RemoteRunner' for analysis and persistence.

Architecture

- GtRrRunnerLogTimingStat Object << #GtRrRunnerLogTimingStat slots: { #name . #average . #standardDeviation . #coefficientOfVariation . #minimum . #maximum . #dataPointCount . #varianceSum }; tag: 'JobStats'; package: 'RemoteRunner' - Holds individual statistical data points and computes statistics (average, standard deviation, coefficient of variation) - GtRrRunnerLogTimingRun Object << #GtRrRunnerLogTimingRun slots: { #sourceFile . #collectedAt . #pointsByName }; tag: 'JobStats'; package: 'RemoteRunner' - Represents timing data for a single log file; stores a timestamp and points by statistic name - GtRrRunnerLogTimingDataset Object << #GtRrRunnerLogTimingDataset slots: { #runs }; tag: 'JobStats'; package: 'RemoteRunner' - Aggregates runs and supports merging/append of timing datasets - GtRrRunnerLogTimingStats Object << #GtRrRunnerLogTimingStats slots: { #stats . #timingDataset }; tag: 'JobStats'; package: 'RemoteRunner' - Wraps a dataset and exposes stats; can be persisted to/loaded from STON files - GtBeaconJsonLogReader Object << #GtBeaconJsonLogReader slots: { #filename . #signals . #strictMode }; tag: 'Analysis'; package: 'Beacon-Core-GT' - Reads log files and deserializes beacon signals - BeaconSignal Announcement << #BeaconSignal slots: { #timestamp . #processId . #properties }; tag: 'Signals'; package: 'Beacon-Core' - Base class for all signals; provides timestamp for timing calculations

Key Instance Variables

- filename - The log file to analyze (a FileReference) - signals - Cached collection of signals loaded from the log file (lazily loaded) - timingStats - The GtRrRunnerLogTimingStats Object << #GtRrRunnerLogTimingStats slots: { #stats . #timingDataset }; tag: 'JobStats'; package: 'RemoteRunner' instance being populated - currentRun - The GtRrRunnerLogTimingRun Object << #GtRrRunnerLogTimingRun slots: { #sourceFile . #collectedAt . #pointsByName }; tag: 'JobStats'; package: 'RemoteRunner' for the current file; statistic points are recorded on it

Analysis Flow

GtRrRunnerLogTimingAnalyzer>>#analyze analyze "Run all timing analysis and update the timing dataset. Answer the timing stats." self currentRun: (GtRrRunnerLogTimingRun forFile: self filename). self updateFirstTaskQueuedTimeStat. self updateAverageMessageQueueTimeStat. self updateAverageTaskTimeStat. self updateFirstWorkerRegistrationTimeStat. self currentRun collectedAt: self firstJobStartupSignal timestamp. self timingStats addRun: self currentRun. self timingStats computeAll. ^ self timingStats creates a run for the file, records statistics by calling update methods, stamps the run’s collectedAt using the first job startup signal, adds the run to the timing dataset, then computes all statistics. This method assumes a job startup signal exists; missing startup signals will raise an error when setting collectedAt.

How to Add a New Statistic

Adding a new statistic follows a consistent pattern with four steps:

Step 1: Create Signal Finder Method(s)

Search for existing signal finder methods first to avoid duplicating logic. If you need a new method, place it in the private - signals protocol. For examples of the expected structure, see GtRrRunnerLogTimingAnalyzer>>#firstJobStartupSignal firstJobStartupSignal "Find the first job startup signal in the log" ^ self signals detect: [ :each | each isKindOf: GtRrJobStartupSignal ] ifNone: [ nil ] and GtRrRunnerLogTimingAnalyzer>>#lastTaskCompletedSignal lastTaskCompletedSignal "Find the last task completed signal in the log" ^ self signals reverseDo: [ :each | ({ GtRrTaskCompletedAnnouncement. GtRrNextTaskCompletedSignal } anySatisfy: [ :signalClass | each isKindOf: signalClass ]) ifTrue: [ ^ each ] ]. ^ nil .

Step 2: Create Calculation Method

Create a method that calculates the statistic value and place it in the private - statistics protocol. Return nil if the required signals are not found. For examples of expected structure and error handling, see GtRrRunnerLogTimingAnalyzer>>#calculateFirstTaskQueuedTime calculateFirstTaskQueuedTime "Calculate the time from job startup to first task queued. Answer the duration in milliseconds, or nil if signals not found." | startSignal queuedSignal duration | startSignal := self firstJobStartupSignal. queuedSignal := self firstTaskQueuedSignal. (startSignal isNil or: [ queuedSignal isNil ]) ifTrue: [ ^ nil ]. duration := (queuedSignal timestamp - startSignal timestamp) asNanoSeconds / 1000000.0. ^ duration and GtRrRunnerLogTimingAnalyzer>>#calculateAverageTaskTime calculateAverageTaskTime "Calculate the average task time. This is the time from first task started signal until last task completed signal, divided by the number of tasks completed. Answer the duration, or nil if signals not found." | firstSignal lastSignal taskCount totalDuration | firstSignal := self firstTaskStartedSignal. lastSignal := self lastTaskCompletedSignal. taskCount := self numberOfCompletedTasks. (firstSignal isNil or: [ lastSignal isNil or: [ taskCount = 0 ] ]) ifTrue: [ ^ nil ]. totalDuration := (lastSignal timestamp - firstSignal timestamp) asNanoSeconds / 1000000.0. ^ totalDuration / taskCount .

Step 3: Create Update Method

Create a method that records the calculated value using GtRrRunnerLogTimingAnalyzer>>#recordPoint:named: recordPoint: aNumber named: aStatName self currentRun ifNil: [ ^ self ]. self currentRun addPoint: aNumber forStatisticNamed: aStatName and place it in the statistics protocol. For examples of expected structure and error handling, see GtRrRunnerLogTimingAnalyzer>>#updateFirstTaskQueuedTimeStat updateFirstTaskQueuedTimeStat "Calculate and add the first task queued time statistic" | statName value | statName := 'First task queued time'. value := self calculateFirstTaskQueuedTime. value ifNil: [ ^ self ]. self recordPoint: value named: statName and GtRrRunnerLogTimingAnalyzer>>#updateAverageTaskTimeStat updateAverageTaskTimeStat "Calculate and add the average task time statistic" | statName value | statName := 'Average task time'. value := self calculateAverageTaskTime. value ifNil: [ ^ self ]. self recordPoint: value named: statName .

Step 4: Register in analyze Method

Register the new update method in GtRrRunnerLogTimingAnalyzer>>#analyze analyze "Run all timing analysis and update the timing dataset. Answer the timing stats." self currentRun: (GtRrRunnerLogTimingRun forFile: self filename). self updateFirstTaskQueuedTimeStat. self updateAverageMessageQueueTimeStat. self updateAverageTaskTimeStat. self updateFirstWorkerRegistrationTimeStat. self currentRun collectedAt: self firstJobStartupSignal timestamp. self timingStats addRun: self currentRun. self timingStats computeAll. ^ self timingStats alongside the existing updates.

Current Statistics

- First task queued time: job startup → first queued task - Average message queue time: first queued task → last queued task, divided by queued message count - Average task time: first task started → last task completed, divided by completed task count - First worker registration: job startup → first worker registration

Common Signal Classes

When adding statistics, you will likely work with signals that are subclasses of BeaconSignal Announcement << #BeaconSignal slots: { #timestamp . #processId . #properties }; tag: 'Signals'; package: 'Beacon-Core' . Key properties available on all signals: - timestamp - A DateAndTime when the signal was created - processId - The process ID that created the signal Signals currently used by the analyzer include: - GtRrJobStartupSignal GtRrJobStartupAndShutdownSignal << #GtRrJobStartupSignal slots: {}; tag: 'Signals'; package: 'RemoteRunner' (job start) - GtRrMqQueueTaskStartingSignal GtRrMqQueueTaskSignal << #GtRrMqQueueTaskStartingSignal slots: {}; tag: 'Signals'; package: 'RemoteRunner' (task queued) - GtRrMqQueueTaskCompletedSignal GtRrMqQueueTaskSignal << #GtRrMqQueueTaskCompletedSignal slots: {}; tag: 'Signals'; package: 'RemoteRunner' (queue task completed; retry filtered) - GtRrTaskCompletedAnnouncement GtRrTaskFinishedAnnouncement << #GtRrTaskCompletedAnnouncement slots: {}; tag: 'Announcements'; package: 'RemoteRunner' and GtRrNextTaskCompletedSignal GtRrNextTaskSignal << #GtRrNextTaskCompletedSignal slots: {}; tag: 'Signals'; package: 'RemoteRunner' (task completed) - GtRrWorkerRegisteredAnnouncement GtRrWorkerAnnouncement << #GtRrWorkerRegisteredAnnouncement slots: { #workerCount }; tag: 'Announcements'; package: 'RemoteRunner' (worker registration) To find available signal classes for RemoteRunner, search for subclasses of relevant base signal classes or search for classes with names starting with GtRr.

Testing The Statistics

GtRrRunnerLogTimingAnalyzerExamples>>#analyzeAllSampleLogs analyzeAllSampleLogs <gtExample> | memoryRoot logFiles analyzer timingStats | "Get the memory file system with extracted log files" memoryRoot := self class sampleLogsInMemoryFileSystem. logFiles := memoryRoot children select: [ :each | each extension = 'log' ]. "Analyze all log files" analyzer := GtRrRunnerLogTimingAnalyzer analyzeFiles: logFiles. timingStats := analyzer timingStats. "Verify that timing stats were collected" self assert: timingStats notNil. "Verify other statistics exist and have data points" self verifyStatistic: 'First task queued time' in: timingStats expected: { 3. 37.0. 0.0. 59.0. 26.3. }. self verifyStatistic: 'First worker registration' in: timingStats expected: { 3. 62748.0. 61297.0. 64472.0. 1310.5. }. self verifyStatistic: 'Average message queue time' in: timingStats expected: { 3. 10613.0. 63.0. 16367.0. 7470.2. }. self verifyStatistic: 'Average task time' in: timingStats expected: { 3. 626463.3. 137193.0. 975752.0. 356362.6. }. ^ analyzer demonstrates loading logs from a file and verifying that they are correct.

Best Practices

1. Naming: Use descriptive, human-readable names for statistics (e.g., 'First task queued time' not 'ftqt') 2. Null safety: Always check for nil signals before calculating; return nil from calculation methods if data is unavailable 3. Units: Document the units in method comments; prefer milliseconds for consistency 4. Protocol organization: - private - signals for signal finder methods - private - statistics for calculation methods - statistics for update methods - api for public entry points 5. Reusability: If multiple statistics need the same signal, create a single finder method and reuse it