AI Glossary

Technical

Temperature

Quick Answer

Temperature is a parameter that controls how random or deterministic a language model's output is. Lower temperatures produce focused, predictable outputs; higher temperatures encourage diversity and creativity but increase the risk of hallucination or off-topic drift.

In Depth

What Temperature really means

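Technically, temperature rescales the model's probability distribution over next tokens: the raw scores (logits) are divided by the temperature before being converted to probabilities. Temperatures below 1 sharpen the distribution towards the most likely tokens; temperatures above 1 flatten it, so less likely tokens are sampled more often and outputs become more varied. Temperature 0 is treated as a special case in which the model always picks the most likely next token (greedy decoding), although hosted models can still show small run-to-run differences.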
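To make the scaling concrete, here is a minimal sketch in Python using only the standard library. The function name `softmax_with_temperature` and the three example logits are illustrative assumptions, not taken from any particular model or API, and treating temperature 0 as greedy selection is one common convention rather than a universal rule.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores (logits) into probabilities, scaled by temperature."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # Dividing by zero is undefined, so this sketch treats temperature 0
        # as greedy decoding: all probability goes to the top-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate next tokens.
logits = [2.0, 1.0, 0.1]
for t in (0, 0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Running it shows the top token's share of probability shrinking as the temperature rises, while the least likely token's share grows.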

Choosing the right temperature depends on the task. Extraction, classification and summarisation typically benefit from temperature 0 or close to it. Brainstorming, creative writing and open-ended ideation benefit from higher temperatures, typically 0.7 to 1.0.
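One way a team might encode this guidance is a small lookup of per-task defaults. The sketch below is illustrative only: the task labels, the helper name `temperature_for`, and the specific value 0.8 (picked from within the 0.7 to 1.0 range above) are assumptions, not recommendations from any vendor.

```python
# Hypothetical task labels and default temperatures, following the ranges above.
DEFAULT_TEMPERATURES = {
    "extraction": 0.0,
    "classification": 0.0,
    "summarisation": 0.0,
    "brainstorming": 0.8,
    "creative_writing": 0.8,
}

def temperature_for(task, fallback=0.0):
    """Return the default temperature for a task label.

    Unknown tasks fall back to a low temperature, on the assumption that
    predictable output is the safer default.
    """
    return DEFAULT_TEMPERATURES.get(task, fallback)

print(temperature_for("classification"))
print(temperature_for("brainstorming"))
print(temperature_for("unlisted_task"))
```

Keeping the defaults in one place makes them easy to review and change, rather than scattering temperature values across individual calls.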

Why It Matters

Business relevance for UK organisations

UK teams deploying LLMs for factual or compliance-sensitive tasks should default to low temperature and document their settings as part of model governance.

Real-world example

How this shows up in practice

A Birmingham insurer standardised all compliance-related LLM calls to temperature 0 and logged them for audit, which made responses noticeably more consistent across its support team.
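A pattern like this could be sketched as a thin wrapper that fixes the temperature and records the setting on every call. Everything named here is hypothetical: `audited_call`, the `llm_audit` logger name, the record fields, and the `call_model` callable, which stands in for whatever client function a team actually uses. It is not the insurer's implementation.

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical logger name; route it to whatever audit store is in use.
audit_log = logging.getLogger("llm_audit")

# Fixed setting for compliance-related calls, as in the example above.
COMPLIANCE_TEMPERATURE = 0.0

def audited_call(call_model, prompt, purpose):
    """Call a model at the fixed compliance temperature and log the setting.

    `call_model` is an assumed callable taking (prompt, temperature=...).
    """
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "purpose": purpose,
        "temperature": COMPLIANCE_TEMPERATURE,
    }
    audit_log.info(json.dumps(record))
    return call_model(prompt, temperature=COMPLIANCE_TEMPERATURE)
```

Because the caller cannot pass a temperature, the setting cannot drift between teams, and each log record documents what was used and when.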

Related Terms

Continue exploring

Technical

Large Language Model (LLM)

A Large Language Model (LLM) is a type of neural network trained on vast quantities of text to understand and generate human language. LLMs power chatbots, copilots, content generators and many modern AI features across consumer and business software.

Technical

Prompt Engineering

Prompt engineering is the practice of designing the text instructions given to a language model to produce reliable, accurate and appropriate outputs. Good prompts unlock significantly better performance without any change to the underlying model.

Technical

Hallucination

A hallucination is when an AI model produces output that sounds plausible but is factually incorrect, fabricated or inconsistent with its sources. Hallucinations are a fundamental property of current generative models and one of the biggest risks in enterprise deployments.

Basics

Inference

Inference is the phase in which a trained model is used to produce predictions or outputs on new data. While training happens once (or periodically), inference happens every time a user interacts with an AI system, making it the dominant cost in most production deployments.