Embedding
Quick Answer
An embedding is a numerical vector representation of text, images or other data that captures semantic meaning. Items with similar meaning produce similar vectors, which makes embeddings the backbone of semantic search, recommendations and RAG systems.
In Depth
What an embedding really means
An embedding typically has a few hundred to a few thousand dimensions. The cosine similarity or dot product of two embeddings gives a useful measure of how related the underlying items are, largely independent of their surface wording. When the vectors are normalised to unit length, the two measures produce the same ranking.
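As a minimal sketch of the similarity measure described above, the following computes cosine similarity using only the Python standard library. The three 4-dimensional vectors and their names are invented for illustration; real embeddings come from a model and have far more dimensions.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b, in the range [-1, 1]."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / (norm_a * norm_b)


# Toy vectors (assumed values, not output from a real embedding model).
refund_policy = [0.9, 0.1, 0.0, 0.2]
money_back = [0.8, 0.2, 0.1, 0.3]
delivery_times = [0.1, 0.9, 0.7, 0.0]

sim_related = cosine_similarity(refund_policy, money_back)
sim_unrelated = cosine_similarity(refund_policy, delivery_times)
print(f"related:   {sim_related:.3f}")
print(f"unrelated: {sim_unrelated:.3f}")
```

The related pair scores much closer to 1 than the unrelated pair, which is the property semantic search relies on.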
Embeddings are produced by dedicated models, which may be much smaller and cheaper than full LLMs. Choosing the right embedding model and keeping embeddings refreshed as content changes are important operational considerations.
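One possible way to keep embeddings refreshed is to store a hash of each document's text next to its vector and re-embed only when the hash changes. The sketch below assumes this approach; `fake_embed`, `refresh_embeddings` and the in-memory `index` dictionary are hypothetical names, and `fake_embed` is a placeholder for a call to whichever embedding model is actually in use.

```python
import hashlib


def content_hash(text):
    """Stable fingerprint of the document text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def fake_embed(text):
    # Placeholder only: a real system would call an embedding model here.
    return [float(len(text)), float(text.count(" "))]


def refresh_embeddings(documents, index, embed=fake_embed):
    """Re-embed new or changed documents and drop deleted ones.

    documents: {doc_id: text}
    index:     {doc_id: {"hash": str, "vector": list}} (updated in place)
    Returns the list of doc ids that were (re-)embedded.
    """
    updated = []
    for doc_id, text in documents.items():
        digest = content_hash(text)
        entry = index.get(doc_id)
        if entry is None or entry["hash"] != digest:
            index[doc_id] = {"hash": digest, "vector": embed(text)}
            updated.append(doc_id)
    # Remove vectors whose source document no longer exists.
    for doc_id in list(index):
        if doc_id not in documents:
            del index[doc_id]
    return updated


index = {}
docs = {"returns": "Returns accepted within 30 days", "delivery": "Next-day delivery"}
first_run = refresh_embeddings(docs, index)   # embeds both documents
second_run = refresh_embeddings(docs, index)  # nothing changed, nothing re-embedded
```

Unchanged documents are skipped on the second run, so the cost of a refresh scales with how much content actually changed. Switching to a different embedding model would still require re-embedding everything, because vectors from different models are not comparable.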
Why It Matters
Business relevance for UK organisations
Embeddings power semantic search across knowledge bases, duplicate detection in CRM data, clustering of support tickets, and 'more like this' product recommendations on UK e-commerce sites.
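A 'more like this' feature can be sketched as ranking every other item by cosine similarity to the one being viewed. The catalogue below uses invented 3-dimensional vectors and hypothetical product ids; a production system would use model-generated embeddings and, at scale, a vector database rather than a brute-force loop.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def more_like_this(item_id, catalogue, k=2):
    """Return the k items most similar to item_id as (id, score) pairs."""
    query = catalogue[item_id]
    scored = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in catalogue.items()
        if other_id != item_id  # never recommend the item itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy catalogue (assumed values, for illustration only).
catalogue = {
    "waterproof-jacket": [0.9, 0.1, 0.1],
    "rain-coat": [0.8, 0.2, 0.1],
    "hiking-boots": [0.3, 0.9, 0.1],
    "desk-lamp": [0.0, 0.1, 0.9],
}

recommendations = more_like_this("waterproof-jacket", catalogue, k=2)
for product_id, score in recommendations:
    print(f"{product_id}: {score:.3f}")
```

The same ranking loop covers semantic search (embed the query text instead of looking up an item) and duplicate detection (flag pairs whose score exceeds a chosen threshold).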
Real-world example
How this shows up in practice
A Leeds e-commerce brand replaced its keyword search with embedding-based semantic search and saw a 22% uplift in search-to-purchase conversion.
Related Terms
Continue exploring
Vector Database
A vector database is a specialist datastore designed to index and query high-dimensional vectors, typically embeddings, at scale. It enables fast similarity search — retrieving the most semantically related items for a given query vector — across millions or billions of items.
RAG (Retrieval Augmented Generation)
Retrieval Augmented Generation (RAG) is an architecture that combines a language model with an external knowledge source. Before generating an answer, the system retrieves relevant documents and feeds them to the model as context, which reduces hallucinations and helps keep answers current.
Large Language Model (LLM)
A Large Language Model (LLM) is a type of neural network trained on vast quantities of text to understand and generate human language. LLMs power chatbots, copilots, content generators and many modern AI features across consumer and business software.
Clustering
Clustering is an unsupervised learning technique that groups similar items together without any predefined labels. It is useful for discovering structure — customer segments, usage patterns, anomaly groups — that humans have not yet categorised.