Generating embeddings using Ollama models

We can use the Ollama client to generate embeddings.

Using initialization code similar to that in Working with the Ollama API client, we first create a model:

model := GtLModelFactory new ollama_generate: 'nomic-embed-text-v2-moe'.
  

We can now use this model to generate an embedding for a piece of text.

command := model newCreateEmbeddingsCommand.
command context input: 'A text to embed'.
embeddings := command perform.
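
As a quick sanity check, the result can be inspected. This is only a sketch: it assumes the result answers its vectors through #embeddings, as the similarity example further down this page does, and that each vector is an ordinary collection of numbers.

vectors := embeddings embeddings.
vectors size. "number of vectors returned, one per input text"
vectors first size. "dimensionality of the first vector"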
  

We may also generate multiple embeddings at once.

command := model newCreateEmbeddingsCommand.
documents := {
  'Our refund policy lasts 30 days.'.
  'You can change your password from account settings.'.
  'Shipping usually takes 5 to 7 business days.'. }.
command context input: documents.
embeddings := command perform.
  

We can then, for instance, apply distance functions to them using GtLlmEmbeddingsUtilities>>#distancesFromEmbeddings:to:, which delegates to the default metric:

distancesFromEmbeddings: listOfEmbeddings to: anEmbedding
	^ self distancesFromEmbeddings: listOfEmbeddings to: anEmbedding usingMetric: self defaultMetric

For example, to find the document closest to a query:

command context input: 'How long does delivery take?'.
queryEmbeddings := command perform.
queryLlmEmbedding := GtLlmEmbedding new
	input: command context input;
	embedding: queryEmbeddings embeddings first.

llmEmbeddings := (1 to: embeddings input size) collect: [ :i |
	GtLlmEmbedding new
		input: (embeddings input at: i);
		embedding: (embeddings embeddings at: i) ].
distances := GtLlmEmbeddingsUtilities
		distancesFromEmbeddings: llmEmbeddings
		to: queryLlmEmbedding.
sorted := distances sorted: #distance ascending.
sorted first embedding input.
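
To look at more than the single best match, the sorted results can be paired with their distances. This is a minimal sketch that relies only on the accessors already used in the snippet above (#distance, #embedding, #input); the cutoff of two results is arbitrary.

topMatches := (sorted first: 2) collect: [ :each |
	each embedding input -> each distance ].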
  

The default distance metric is cosine. To explore the available metrics, look at the distance metrics view on the GtLlmEmbeddingsUtilities class (package 'Gt4Llm').
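
For reference, cosine distance is commonly defined as one minus the cosine similarity of two vectors. The following is a minimal sketch in plain Pharo, independent of Gt4Llm. The vectors and variable names are made up for illustration, and the library's own implementation may differ, for instance in how it handles zero vectors.

a := #(1.0 0.0 1.0).
b := #(0.0 1.0 1.0).
dot := 0.
a with: b do: [ :x :y | dot := dot + (x * y) ]. "dot product of a and b"
normA := (a inject: 0 into: [ :sum :x | sum + (x * x) ]) sqrt. "Euclidean norm of a"
normB := (b inject: 0 into: [ :sum :y | sum + (y * y) ]) sqrt. "Euclidean norm of b"
cosineDistance := 1 - (dot / (normA * normB)). "approximately 0.5 for these vectors"

A smaller distance means the two vectors point in more similar directions, which is why the example above sorts in ascending order to find the best match.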

For an OpenAI-compliant option for generating embeddings, see Embeddings in OpenAI and Generating embeddings. For an in-image embedding registry that can act as an ad-hoc vector database, see Using the embedding registry.