Technical | AI Glossary

Vector Database

Quick Answer

A vector database is a specialist datastore designed to index and query high-dimensional vectors, typically embeddings, at scale. It enables fast similarity search, that is, retrieving the items whose vectors lie closest to a given query vector and are therefore the most semantically related, across millions or billions of items.

In Depth

What a vector database really means

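Vector databases use approximate nearest-neighbour (ANN) indexes such as HNSW (a layered proximity graph) or IVF (an inverted file built over clustered vectors) to return relevant results in milliseconds. These indexes examine only a fraction of the stored vectors for each query, trading a small amount of recall for a large gain in speed. Vector databases are a core component of modern RAG systems, recommender engines, and similarity-based deduplication pipelines.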
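To make the IVF idea concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not how any particular product implements its index: the function names (`build_ivf`, `ivf_search`, `exact_search`), the random test vectors, and the parameter values are all hypothetical, and real systems use optimised native code. The vectors are clustered with a few rounds of k-means, each vector is filed under its nearest centroid, and a query scans only the `n_probe` closest buckets rather than every vector.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, v):
    """Index of the centroid closest to v."""
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i], v))

def build_ivf(vectors, n_lists, n_iters=10, seed=0):
    """Run a few k-means rounds, then bucket vector ids by nearest centroid."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, n_lists)]
    for _ in range(n_iters):
        groups = [[] for _ in range(n_lists)]
        for v in vectors:
            groups[nearest(centroids, v)].append(v)
        for i, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    lists = [[] for _ in range(n_lists)]
    for idx, v in enumerate(vectors):
        lists[nearest(centroids, v)].append(idx)
    return centroids, lists

def ivf_search(query, vectors, centroids, lists, k=3, n_probe=2):
    """Approximate search: scan only the n_probe buckets nearest the query."""
    order = sorted(range(len(centroids)), key=lambda i: sq_dist(centroids[i], query))
    candidates = [idx for i in order[:n_probe] for idx in lists[i]]
    return sorted(candidates, key=lambda idx: sq_dist(vectors[idx], query))[:k]

def exact_search(query, vectors, k=3):
    """Brute-force baseline: compare the query against every stored vector."""
    return sorted(range(len(vectors)), key=lambda idx: sq_dist(vectors[idx], query))[:k]

# Hypothetical data: 500 random 8-dimensional vectors standing in for embeddings.
rng = random.Random(42)
vectors = [[rng.random() for _ in range(8)] for _ in range(500)]
centroids, lists = build_ivf(vectors, n_lists=10)

query = vectors[0]
approx = ivf_search(query, vectors, centroids, lists, k=3, n_probe=2)
exact = exact_search(query, vectors, k=3)
```

Raising `n_probe` makes the approximate result converge on the exact one at the cost of scanning more vectors; with `n_probe` equal to the number of buckets the two searches examine the same candidates.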

Popular managed and self-hosted options each have different trade-offs in latency, cost, filtering capabilities and operational complexity. Many traditional relational and document databases now offer vector-search extensions as well.

Why It Matters

Business relevance for UK organisations

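UK organisations building internal AI assistants, smart search over documents, or personalised product discovery typically need vector search somewhere in their stack, whether from a dedicated vector database or from a vector-search extension to a database they already run.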

Real-world example

How this shows up in practice

A Bristol media company indexed every article it had published over 20 years into a vector database, enabling editors to find related past coverage in under 200 ms.

Related Terms

Continue exploring

Technical

Embedding

An embedding is a numerical vector representation of text, images or other data that captures semantic meaning. Items with similar meaning produce similar vectors, which makes embeddings the backbone of semantic search, recommendations and RAG systems.
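One common way to measure how similar two embeddings are is cosine similarity. The short sketch below uses hand-made three-dimensional vectors purely for illustration; the values and the `toy_embeddings` name are invented, whereas real embeddings come from a trained model and usually have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Invented toy vectors, not the output of any real embedding model.
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "invoice": [0.1, 0.2, 0.95],
}

sim_related = cosine_similarity(toy_embeddings["cat"], toy_embeddings["kitten"])
sim_unrelated = cosine_similarity(toy_embeddings["cat"], toy_embeddings["invoice"])
```

With these toy values, `sim_related` comes out close to 1 and `sim_unrelated` is much lower, which is the property a similarity search relies on.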

Technical

RAG (Retrieval Augmented Generation)

Retrieval Augmented Generation (RAG) is an architecture that combines a language model with an external knowledge source. Before generating an answer, the system retrieves relevant documents and feeds them to the model as context, which grounds the answer in source material, can substantially reduce hallucinations, and keeps answers current without retraining the model.
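The retrieve-then-generate flow can be sketched in a few lines. This is a simplified illustration, not a production design: the sample documents are made up, the `retrieve` and `build_prompt` helpers are hypothetical, and simple word overlap stands in for the embedding-based similarity search a real system would run against a vector database. The final call to a language model is deliberately left out.

```python
import re

def tokenize(text):
    """Lower-case the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, k=2):
    """Rank documents by word overlap with the question.

    Stand-in for vector search: a real RAG system would embed the
    question and query a vector index instead.
    """
    q_words = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, context_docs):
    """Assemble the retrieved documents and the question into one prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )

# Made-up sample knowledge base for illustration.
documents = [
    "Annual leave requests must be submitted two weeks in advance.",
    "The office is closed on UK bank holidays.",
    "Expenses are reimbursed within 30 days of submission.",
]

question = "How far in advance must annual leave requests be submitted?"
top_docs = retrieve(question, documents, k=1)
prompt = build_prompt(question, top_docs)
# `prompt` would now be sent to a language model; that step is omitted here.
```

Because the model only sees what `retrieve` returns, the quality of the retrieval step largely determines the quality of the final answer.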

Technical

Large Language Model (LLM)

A Large Language Model (LLM) is a type of neural network trained on vast quantities of text to understand and generate human language. LLMs power chatbots, copilots, content generators and many modern AI features across consumer and business software.

Basics

Inference

Inference is the phase in which a trained model is used to produce predictions or outputs on new data. While training happens once (or periodically), inference happens every time a user interacts with an AI system, which often makes it the dominant ongoing cost in production deployments.