Context window
The context window is the maximum amount of text (measured in tokens) a language model can process at once — typically 128k to 1M tokens with current models.
Also known as: Kontext-Fenster (German)
In detail
The window includes the system prompt, conversation history, retrieved RAG documents, and the current request. A bigger window supplies more context for better answers, but also increases cost and latency.
Rule of thumb: 1 token ≈ 0.75 English words. A 128k window holds roughly a 200-page book.
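The rule of thumb above can be sketched in a few lines. This is a rough heuristic only: real token counts vary by tokenizer and language, so use the model's own tokenizer for actual budgets. The window size and prompt are illustrative values, not from any specific model.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: 1 token is about 0.75 English words."""
    words = len(text.split())
    return round(words / 0.75)

# Hypothetical budget check against a 128k-token window.
WINDOW = 128_000
prompt = "Summarize the attached report in three bullet points."
used = estimate_tokens(prompt)
remaining = WINDOW - used  # tokens left for history, RAG documents, and output
```

Everything competing for the window (system prompt, history, retrieved documents, the request, and the model's reply) must fit inside `WINDOW`, which is why long conversations or large RAG payloads eventually force truncation.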
Related terms
- Fine-tuning: Fine-tuning is the additional training of an already-trained language model on your own data — it changes the model itself, unlike RAG.
- LLM: A Large Language Model is a neural network trained on billions of texts that understands, generates, and transforms language. GPT-4, Claude, and Mistral are examples.
- AI agent: An AI agent is a program built on a language model that completes tasks on its own: it understands a request, plans steps, calls tools, and responds with a result instead of just text.