
Transformer

Quick Answer

The transformer is a neural network architecture introduced in 2017 that uses a mechanism called self-attention to process all positions of a sequence in parallel, rather than one step at a time. It is the foundational architecture behind nearly all modern large language models and many leading vision and audio models.

In Depth

What Transformer really means

Self-attention allows a transformer to weigh the importance of every token in the input when producing each token of the output, capturing long-range dependencies more effectively than older recurrent architectures.
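To make the weighting idea concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It rests on simplifying assumptions: the queries, keys and values are the token vectors themselves (a real transformer first applies learned projection matrices), there is a single attention head, and there is no masking or positional encoding. The names softmax, self_attention and tokens are illustrative and do not come from any particular library.

```python
import math


def softmax(scores):
    # Subtract the maximum first so the exponentials stay numerically stable.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention over a list of token vectors.

    Assumption for illustration: queries, keys and values are the inputs
    themselves, with no learned projections.
    """
    d = len(x[0])
    outputs, weights = [], []
    for q in x:
        # Score this token against every token in the input, itself included.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)  # attention weights for this token, summing to 1
        # The output is the weighted average of all token vectors.
        out = [sum(w[j] * x[j][i] for j in range(len(x))) for i in range(d)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Three toy "tokens", each a 2-dimensional vector.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Each row of weights shows how strongly one token attends to every other token. No row depends on any other row, which is why the computation can run in parallel across the whole sequence instead of step by step as in a recurrent network.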

Transformers scale remarkably well: larger models trained on more data with more compute have, so far, continued to produce measurably better results, such as lower prediction error and stronger benchmark performance. This scaling property is the engine behind the generative AI boom.

Why It Matters

Business relevance for UK organisations

For most UK businesses, the practical implication of transformers is simply that capable language and vision models are now available as commodity services. The underlying architecture rarely needs to be understood in detail, but knowing what it enables leads to better procurement and strategy decisions.

Real-world example

How this shows up in practice

A Newcastle-based publisher uses a transformer-based model to summarise long-form articles into social-media snippets, cutting editorial time by 60%.