AI Glossary

Technical

Token

Quick Answer

A token is the basic unit of text that a language model processes. Tokens are usually subword chunks — roughly four characters or three-quarters of a word in English — and both the size of the model's context window and its pricing are typically measured in tokens.

In Depth

What a token really means

Tokenisation splits input text into a sequence of tokens using a fixed vocabulary, which is learned from a text corpus when the model's tokeniser is built. Because each model family has its own vocabulary, the same sentence can produce a different number of tokens in different models, which affects both cost and latency.
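To make the vocabulary effect concrete, here is a minimal sketch in Python. It uses a simplified greedy longest-match split, not the algorithm any particular model uses, and both vocabularies (VOCAB_A, VOCAB_B) are hypothetical stand-ins for two different models.

```python
def tokenise(text: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match split of text into pieces found in vocab.

    Any character not covered by a vocabulary entry becomes its own token.
    This is a toy scheme for illustration, not a production tokeniser.
    """
    tokens = []
    i = 0
    max_len = max(len(piece) for piece in vocab)
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Two hypothetical vocabularies standing in for two different models.
VOCAB_A = {"token", "isation", " ", "cost", "s"}
VOCAB_B = {"tok", "en", "is", "ation", " ", "co", "st", "s"}

text = "tokenisation costs"
tokens_a = tokenise(text, VOCAB_A)
tokens_b = tokenise(text, VOCAB_B)

print(len(tokens_a), tokens_a)  # 5 ['token', 'isation', ' ', 'cost', 's']
print(len(tokens_b), tokens_b)  # 8 ['tok', 'en', 'is', 'ation', ' ', 'co', 'st', 's']
```

The same 18-character string costs five tokens under one vocabulary and eight under the other, which is why token counts should be measured with the tokeniser of the model actually being used.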

Understanding tokens is essential for building efficient prompts, estimating API bills, and staying within context-window limits. Long prompts, verbose system messages and uncompressed retrieved context all drive up token usage.
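A rough budget check can be sketched as follows. It assumes the four-characters-per-token rule of thumb from the Quick Answer, which is only an approximation for English text; the function names, the sample prompt parts and the context-window sizes are all hypothetical. For exact counts, use the tokeniser supplied for the model in question.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb for English; real counts vary by model


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count; not a real tokeniser."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_context(parts: list[str], context_window: int,
                 reserved_for_output: int = 0) -> bool:
    """Check whether the estimated prompt leaves room for the model's reply."""
    prompt_tokens = sum(estimate_tokens(p) for p in parts)
    return prompt_tokens + reserved_for_output <= context_window


# Hypothetical prompt parts: a verbose system message, retrieved context, a question.
system_message = "You are a helpful assistant. " * 10
retrieved_context = "x" * 4000
question = "What is a token?"

parts = [system_message, retrieved_context, question]
estimated = sum(estimate_tokens(p) for p in parts)

print(estimated)                                             # 1077
print(fits_context(parts, context_window=1024))              # False
print(fits_context(parts, context_window=2048,
                   reserved_for_output=500))                 # True
```

Most of the estimate comes from the retrieved context, so that is usually the first place to look when a prompt is over budget.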

Why It Matters

Business relevance for UK organisations

UK finance teams should track token consumption as a first-class cost metric for any LLM-based product. Small prompt optimisations — shorter system messages, compact formats, caching — can deliver large savings at scale.
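As an illustration of tracking tokens as a cost metric, the sketch below estimates a monthly bill from request volume and average token counts. Every figure in it is an assumption chosen for easy arithmetic: the per-million-token prices, the request volume and the token counts are hypothetical and do not reflect any provider's actual rates.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)


def monthly_cost(requests: int, input_tokens: int, output_tokens: int,
                 input_price: Decimal, output_price: Decimal) -> Decimal:
    """Estimated monthly spend, with prices quoted per million tokens."""
    per_request = (input_tokens * input_price
                   + output_tokens * output_price) / MILLION
    return per_request * requests


# Hypothetical prices in GBP per million tokens (not real provider rates).
INPUT_PRICE = Decimal("2.00")
OUTPUT_PRICE = Decimal("8.00")

# Hypothetical workload: one million requests a month.
before = monthly_cost(1_000_000, input_tokens=1_200, output_tokens=300,
                      input_price=INPUT_PRICE, output_price=OUTPUT_PRICE)

# Same workload after trimming 500 tokens from the system message.
after = monthly_cost(1_000_000, input_tokens=700, output_tokens=300,
                     input_price=INPUT_PRICE, output_price=OUTPUT_PRICE)

print(f"before: £{before:,.2f}")          # before: £4,800.00
print(f"after:  £{after:,.2f}")           # after:  £3,800.00
print(f"saving: £{before - after:,.2f}")  # saving: £1,000.00
```

Under these assumed numbers, removing 500 input tokens per request saves £1,000 a month, because the reduction is multiplied by every request.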

Real-world example

How this shows up in practice

A Glasgow SaaS startup cut its monthly LLM bill from £9,400 to £3,200 by compressing system prompts, caching common contexts and moving simple routing to a cheaper model.
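The size of that reduction can be worked out directly from the two figures quoted. The annual figure in this sketch assumes the monthly saving stays constant for twelve months, which the example itself does not state.

```python
before_gbp = 9_400  # monthly bill before optimisation (from the example)
after_gbp = 3_200   # monthly bill after optimisation (from the example)

monthly_saving = before_gbp - after_gbp
reduction_pct = 100 * monthly_saving / before_gbp

# Assumption: the saving holds steady for a full year.
annual_saving = 12 * monthly_saving

print(f"monthly saving: £{monthly_saving:,}")   # monthly saving: £6,200
print(f"reduction: {reduction_pct:.1f}%")       # reduction: 66.0%
print(f"annualised: £{annual_saving:,}")        # annualised: £74,400
```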

Related Terms

Continue exploring

Technical

Large Language Model (LLM)

A Large Language Model (LLM) is a type of neural network trained on vast quantities of text to understand and generate human language. LLMs power chatbots, copilots, content generators and many modern AI features across consumer and business software.

Basics

Inference

Inference is the phase in which a trained model is used to produce predictions or outputs on new data. While training happens once (or periodically), inference happens every time a user interacts with an AI system, which often makes it the dominant cost in production deployments.

Technical

Prompt Engineering

Prompt engineering is the practice of designing the text instructions given to a language model to produce reliable, accurate and appropriate outputs. Good prompts unlock significantly better performance without any change to the underlying model.

Basics

Model

A model is the trained output of a machine learning process — a collection of learned parameters that, combined with an algorithm, can turn new inputs into predictions or generated content without being explicitly programmed for each case.