LLM Retrieval Augmented Generation

GT includes a proof-of-concept implementation of a Vector DB that can be used to index the GT Book and provide RAG-enhanced queries to an LLM.

Limitations

The Vector DB is stored on disk in a simple STON file (around 400MB serialised) and held in memory during use.

Generating the embeddings and querying the LLM both assume that Ollama is installed with the default embedding and reasoning models (see below).

The context length of the running model needs to be significantly larger than the default explanation length (listed below). Note that the running context length in Ollama is not the same as the maximum context length of the underlying model. Once the model has been loaded into Ollama, the running context length can be inspected with:

provider := GtLConnection new
	providerClass: GtLOllamaProvider;
	modelName: 'qwen3.5:9b';
	buildBareProvider.
provider runningModels.

In other words, this is not intended to be a production-ready implementation.

Generate the Vector DB

This will regenerate the entire DB from scratch and will likely take 15 minutes or more, depending on the hardware and the models chosen (see below).

[ GtLGtBookExperimentalRag cleanUp.
	gtRag := GtLExperimentalRag gtBook.
	gtRag vectorDbFile ensureDelete.
	"Requesting the DB has the side effect of generating it when the file doesn't exist"
	gtRag lepiterRagDb.
] forkAt: Processor userBackgroundPriority - 1.

Update the Vector DB

This will re-index any pages that have been modified since the DB file was last written, using the file's modification timestamp as the cut-off.

gtRag := GtLExperimentalRag gtBook.
gtRag updateVectorDb.
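
The selection rule behind an incremental update can be illustrated with plain Pharo objects. This is only a sketch of the idea, not the code used by updateVectorDb: the page titles, edit times and DB timestamp below are made-up stand-ins, and in practice the cut-off would come from the DB file on disk.

```smalltalk
| dbTimestamp pageEditTimes stalePages |

"Hypothetical cut-off: when the Vector DB file was last written."
dbTimestamp := DateAndTime year: 2025 month: 1 day: 10 hour: 12 minute: 0 second: 0.

"Hypothetical pages, mapped to the time of their last edit."
pageEditTimes := Dictionary new.
pageEditTimes
	at: 'Page A' put: (DateAndTime year: 2025 month: 1 day: 9 hour: 8 minute: 30 second: 0);
	at: 'Page B' put: (DateAndTime year: 2025 month: 1 day: 10 hour: 12 minute: 30 second: 0);
	at: 'Page C' put: (DateAndTime year: 2025 month: 1 day: 11 hour: 9 minute: 0 second: 0).

"Only pages edited after the cut-off need to be re-indexed."
stalePages := (pageEditTimes keys
	select: [ :title | (pageEditTimes at: title) > dbTimestamp ]) sorted.

stalePages "=> #('Page B' 'Page C')"
```

Pages edited before the cut-off keep their existing embeddings, which is why an update is much cheaper than regenerating the whole DB.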

Sample query using the Vector DB

"Ensure that the Vector DB has been loaded first, as it can be quite slow to load"
GtLExperimentalRag gtBook lepiterRagDb.

provider := GtLConnection new
	providerClass: GtLOllamaProvider;
	modelName: 'qwen3.5:9b';
	buildBareProvider.
chat := GtLChat new provider: provider.
chat
	queryMarkdown: 'How do I change the colour of text?'
	withRag: GtLExperimentalRag gtBook.
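
For readers unfamiliar with RAG, the general retrieval step can be sketched in plain Pharo. The sketch assumes the usual approach of ranking stored chunks by cosine similarity to the question's embedding and prepending the best matches to the prompt; it does not claim to reproduce what queryMarkdown:withRag: does internally. The chunk texts, the three-dimensional vectors and the prompt layout are all invented for illustration.

```smalltalk
| cosine chunks queryVector ranked topChunks prompt |

"Cosine similarity between two equal-length numeric vectors."
cosine := [ :a :b | | dot normA normB |
	dot := (a with: b collect: [ :x :y | x * y ])
		inject: 0 into: [ :sum :each | sum + each ].
	normA := (a inject: 0 into: [ :sum :x | sum + (x * x) ]) sqrt.
	normB := (b inject: 0 into: [ :sum :x | sum + (x * x) ]) sqrt.
	dot / (normA * normB) ].

"Toy index: chunk text -> made-up embedding (real embeddings are far longer)."
chunks := {
	'Chunk about text styling' -> #(0.8 0.2 0.1).
	'Chunk about file handling' -> #(0.0 0.9 0.3).
	'Chunk about debugging' -> #(0.1 0.1 0.9) }.

"Made-up embedding of the user's question."
queryVector := #(0.9 0.1 0.0).

"Rank chunks from most to least similar and keep the best two."
ranked := chunks sorted: [ :left :right |
	(cosine value: left value value: queryVector)
		>= (cosine value: right value value: queryVector) ].
topChunks := (ranked first: 2) collect: [ :each | each key ].

"Prepend the retrieved text to the question before it is sent to the model."
prompt := String streamContents: [ :stream |
	stream nextPutAll: 'Context:'; cr.
	topChunks do: [ :text |
		stream nextPutAll: '- '; nextPutAll: text; cr ].
	stream nextPutAll: 'Question: How do I change the colour of text?' ].
prompt
```

With these toy vectors the styling chunk ranks first, so it ends up at the top of the context handed to the model.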

Indexing Strategy

The indexing strategy is described in the class comment of GtLLepiterRagDb (package 'Gt4Llm-VectorDb', tag 'Support').

The default embedding parameters are: