Clustering
Quick Answer
Clustering is an unsupervised learning technique that groups similar items together without any predefined labels. It is useful for discovering structure, such as customer segments, usage patterns or anomaly groups, that humans have not yet categorised.
In Depth
What Clustering really means
Common algorithms include k-means, hierarchical clustering, DBSCAN and HDBSCAN. Each makes different assumptions about cluster shape and density, so the right choice depends on the data and the business question being asked.
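To make the idea of an algorithm's assumptions concrete, here is a minimal sketch of k-means (Lloyd's algorithm) using only the Python standard library. The function name `kmeans`, the sample points and the farthest-point initialisation are illustrative assumptions, not part of any particular library. K-means assumes roughly round, similarly sized clusters, which is why it handles the two compact groups below but would struggle with elongated or uneven-density shapes, where DBSCAN or HDBSCAN may suit better.

```python
from math import dist


def kmeans(points, k, max_iterations=100):
    """Minimal k-means (Lloyd's algorithm). Returns (centroids, labels)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # Assumption: deterministic farthest-point initialisation, chosen here
    # so the example is reproducible. Real libraries usually randomise this.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )

    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:  # converged: nothing moved
            break
        centroids = updated
    return centroids, labels


# Hypothetical data: two compact, well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Note that k must be chosen up front, which is itself an assumption about the data; density-based methods such as DBSCAN infer the number of clusters instead.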
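Clustering outputs must be validated. Statistical metrics help: the silhouette score, for example, compares how close each item sits to its own cluster versus the nearest neighbouring one. Ultimately, though, a cluster is only useful if a human stakeholder can name it and use it to make a decision.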
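As a rough illustration of statistical validation, the sketch below computes the mean silhouette score in plain Python. For each item it compares a, the mean distance to the other members of its own cluster, with b, the mean distance to the nearest other cluster, and scores the item as (b - a) / max(a, b). The function name `silhouette_score`, the sample points and both labelings are assumptions made for the example; Euclidean distance is assumed throughout.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette over all points; ranges from -1 (poor) to 1 (good)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's distance to itself is 0, so divide by len - 1).
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: mean distance to the closest other cluster.
        b = min(
            sum(dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        widest = max(a, b)
        scores.append((b - a) / widest if widest > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical data: the same points scored under two candidate labelings.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
sensible = silhouette_score(points, [0, 0, 0, 1, 1, 1])
arbitrary = silhouette_score(points, [0, 1, 0, 1, 0, 1])
print(f"sensible={sensible:.2f} arbitrary={arbitrary:.2f}")
```

A higher score only says the groups are geometrically tidy. Whether a stakeholder can name and act on them still has to be checked by a person.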
Why It Matters
Business relevance for UK organisations
UK marketers cluster customers into segments; operations teams cluster incidents to find root causes; security teams cluster events to detect coordinated threats.
Real-world example
How this shows up in practice
A London media group clustered 1.2m newsletter subscribers into seven behavioural segments, unlocking tailored content strategies that lifted open rates by 19%.
Related Terms
Continue exploring
Unsupervised Learning
Unsupervised learning is a machine learning approach where the model learns patterns and structure from unlabelled data. Rather than predicting a known target, it uncovers groupings, anomalies or compressed representations hidden in the data.
Embedding (Technical)
An embedding is a numerical vector representation of text, images or other data that captures semantic meaning. Items with similar meaning produce similar vectors, which makes embeddings the backbone of semantic search, recommendations and RAG systems.
Customer Intelligence (Business)
Customer intelligence is the practice of combining data from every customer touchpoint and applying analytics and AI to produce a clearer picture of who customers are, what they want, and how they are likely to behave next.
Machine Learning (Basics)
Machine Learning (ML) is a subfield of AI in which systems learn patterns from historical data rather than following explicitly programmed rules, enabling them to make predictions or decisions on new, unseen data as conditions evolve.