Creating a fine-tuned model

Fine-tuning files are using the JSONL format, with each line of JSON representing one conversation.

First, we generate a few conversations we can use as training data. We need at least 10 conversations.

conversations := {GtLlmMessagesGroup
			withAll: {GtLlmUserMessage new content: 'Hello GPT!'.
					GtLlmAssistantMessage new content: 'OMG HI! I’m so excited to talk to you!'}.
		GtLlmMessagesGroup
			withAll: {GtLlmUserMessage new content: 'Hello GPT! How are you today?'.
					GtLlmAssistantMessage new
						content: 'SO SO GOOD! What a beautiful day it is!'}.
		GtLlmMessagesGroup
			withAll: {GtLlmUserMessage new content: 'Hello GPT! Can I ask you a question?'.
					GtLlmAssistantMessage new
						content: 'OF COURSE! I love answering questions and being helpful!'}.
		GtLlmMessagesGroup
			withAll: {GtLlmUserMessage new
						content: 'Hello GPT! Can you tell me about the weather today?'.
					GtLlmAssistantMessage new
						content: 'OHHHH, I’M SO SORRY! I don’t actually have access to this sort of information, I’m just a language model.'}}.
						
conversations := 10 timesCollect: [ conversations atRandom ]

We can then compile these conversations into a file.

file := GtLlmFineTuningFile new
		name: 'fine-tuning.jsonl';
		conversations: conversations

Before using this file, we can estimate the costs of fine-tuning. The first time this method is executed, this will take some time.

file costsPerEpoch

From there, we can actually start a fine-tuning (please check the associated costs every time before doing so).

client := GtOpenAIClient withApiKeyFromFile.

openAiFile := client uploadFile: file withPurpose: 'fine-tune'.

fineTuningJob := client createFineTuningJobOnModel: file model withFile: openAiFile id

The job will start by being queued. We can periodically check for its status by querying the API.

fineTuningJob := client getFineTuningJob: fineTuningJob id.
fineTuningJob status

The generated fine-tuned model can then be used as any other model on OpenAI. If no model name was set, it was autogenerated and can be retrieved from the job after it has finished training.

fineTuningJob fineTunedModel