Temperature
Quick Answer
Temperature is a parameter that controls how random or deterministic a language model's output is. Lower temperatures produce focused, predictable outputs; higher temperatures encourage diversity and creativity but increase the risk of hallucination or off-topic drift.
In Depth
What Temperature really means
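Technically, temperature rescales the model's probability distribution over next tokens: the raw scores (logits) are divided by the temperature before being converted to probabilities. At temperature 0, decoding becomes greedy and the model always picks the most likely next token; as the temperature rises, the distribution flattens and less likely tokens become more probable, producing more varied outputs.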
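The scaling described above can be sketched in a few lines of Python. This is a minimal illustration rather than any particular model's implementation: the function name `softmax_with_temperature` and the example logits are assumptions made for demonstration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # Limit case: all probability goes to the highest-scoring token (greedy decoding).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens (illustrative values only).
logits = [2.0, 1.0, 0.1]
for t in (0, 0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Running the loop shows the top token's share of the probability shrinking as the temperature increases, which is why higher settings produce more varied text.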
Choosing the right temperature depends on the task. Extraction, classification and summarisation typically benefit from temperature 0 or close to it. Brainstorming, creative writing and open-ended ideation benefit from higher temperatures, typically 0.7 to 1.0.
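One way to make these choices explicit is a small lookup of per-task defaults. The sketch below is an assumption for illustration: the task names, the `DEFAULT_TEMPERATURES` table and the `temperature_for` helper are hypothetical, and the values simply follow the ranges mentioned above.

```python
# Hypothetical per-task defaults based on the ranges mentioned above;
# tune them for your own model and use case.
DEFAULT_TEMPERATURES = {
    "extraction": 0.0,
    "classification": 0.0,
    "summarisation": 0.0,
    "brainstorming": 0.9,
    "creative_writing": 0.9,
}

def temperature_for(task, fallback=0.0):
    """Return the default temperature for a task, or a conservative fallback."""
    return DEFAULT_TEMPERATURES.get(task, fallback)

print(temperature_for("classification"))
print(temperature_for("brainstorming"))
print(temperature_for("unlisted_task"))
```

Falling back to 0 for unlisted tasks is a deliberately cautious choice here: an unexpected task gets predictable output unless someone opts in to a higher setting.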
Why It Matters
Business relevance for UK organisations
UK teams deploying LLMs for factual or compliance-sensitive tasks should default to low temperature and document their settings as part of model governance.
Real-world example
How this shows up in practice
A Birmingham insurer standardised all compliance-related LLM calls to temperature 0 and logged each call for audit, which made responses markedly more consistent across its support team.
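A pattern like this could be sketched as a thin wrapper that fixes the temperature and records every call. Everything named here is hypothetical: `call_model` stands in for whatever client function a team actually uses, `fake_model` is a placeholder so the example runs, and the record fields are one possible choice rather than a prescribed audit format.

```python
import json
from datetime import datetime, timezone

COMPLIANCE_TEMPERATURE = 0.0  # fixed setting for compliance-related calls

def audited_call(call_model, prompt, audit_log):
    """Call the model at the fixed temperature and append a JSON audit record."""
    response = call_model(prompt, temperature=COMPLIANCE_TEMPERATURE)
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "temperature": COMPLIANCE_TEMPERATURE,
        "prompt": prompt,
        "response": response,
    }
    audit_log.append(json.dumps(record))
    return response

def fake_model(prompt, temperature):
    """Placeholder for a real LLM client; it only echoes the prompt."""
    return f"echo: {prompt}"

log = []
audited_call(fake_model, "Summarise the policy excess clause.", log)
print(log[0])
```

Keeping the temperature in a single constant means the audit record and the actual call cannot drift apart.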
Related Terms
Continue exploring
Large Language Model (LLM)
A Large Language Model (LLM) is a type of neural network trained on vast quantities of text to understand and generate human language. LLMs power chatbots, copilots, content generators and many modern AI features across consumer and business software.
Prompt Engineering
Prompt engineering is the practice of designing the text instructions given to a language model to produce reliable, accurate and appropriate outputs. Good prompts unlock significantly better performance without any change to the underlying model.
Hallucination
A hallucination is when an AI model produces output that sounds plausible but is factually incorrect, fabricated or inconsistent with its sources. Hallucinations are a fundamental property of current generative models and the single biggest risk in enterprise deployments.
Inference
Inference is the phase in which a trained model is used to produce predictions or outputs on new data. While training happens once (or periodically), inference happens every time a user interacts with an AI system, making it the dominant cost in most production deployments.