Embedding
An embedding is a numeric representation of text (or an image) in a high-dimensional space where similar content sits close together — the foundation of semantic search and RAG.
Also known as: vector representation
In detail
Embeddings translate text into vectors (e.g. 1,536 numbers for OpenAI's text-embedding-3-small). The vectors are chosen so that the cosine similarity between two vectors reflects how similar the original texts are in meaning.
That lets a computer quickly find the most topically relevant document chunks even when the wording doesn't match: 'How do I cancel?' also finds texts about contract termination or unsubscribing.
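The similarity measure mentioned above can be shown in a few lines. This is a minimal sketch using hand-made 3-dimensional toy vectors (real embedding models produce hundreds or thousands of dimensions, and you would get the vectors from an embedding API, not write them by hand):

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector lengths:
    # 1.0 = same direction (very similar), values near 0 = unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (hypothetical values, for illustration only):
cancel = [0.9, 0.1, 0.2]     # "How do I cancel?"
terminate = [0.8, 0.2, 0.3]  # "contract termination"
weather = [0.1, 0.9, 0.8]    # "tomorrow's weather"

print(cosine_similarity(cancel, terminate))  # high: related meaning
print(cosine_similarity(cancel, weather))    # low: different topic
```

The point of the toy values: the cancellation and termination vectors point in roughly the same direction, so their cosine similarity is high even though the words differ.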
Example
We index your 12,000 service documents once as embeddings (~30 min). For every new request we retrieve the 5 most similar chunks in milliseconds.
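The index-once, retrieve-per-request flow described above can be sketched as a brute-force nearest-neighbor search. Everything here is a hypothetical stand-in: the chunk texts and vectors are invented, and a production system would use a vector database rather than sorting the whole index on every query:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, index, k=5):
    # index: list of (chunk_text, embedding) pairs, embedded once up front.
    # Rank every chunk by similarity to the query and keep the k best.
    ranked = sorted(index, key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy index (hypothetical chunks with made-up 3-dimensional vectors):
index = [
    ("How to cancel your contract", [0.9, 0.1, 0.1]),
    ("Unsubscribing from the newsletter", [0.8, 0.2, 0.2]),
    ("Opening hours of our offices", [0.1, 0.9, 0.3]),
]
query = [0.85, 0.15, 0.1]  # pretend embedding of "How do I cancel?"
print(top_k(query, index, k=2))
```

Sorting the full index is fine for a demo; at 12,000 chunks a vector database with an approximate-nearest-neighbor index keeps the per-request lookup in the millisecond range.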
Related terms
- Multi-agent: an architecture where several specialized AI agents collaborate — a router decides who takes over, and they hand off tasks cleanly.
- RAG: Retrieval-Augmented Generation, a technique where the language model retrieves relevant documents from your knowledge base before each answer and uses them as the basis for its response — so the agent stays current and answers with sources.
- Tool use: (or function calling) a language model's ability to call external functions with structured arguments — e.g. an API, a database query, or a skill.