How do Recommendation Algorithms Work?
How Netflix knows what you want to watch, Spotify builds your playlists, and Amazon predicts what you'll buy — before you even know yourself.
You open Netflix. Within seconds, it shows you a grid of movies and shows — and somehow, most of them look interesting to you. That's not luck.
Netflix has said its recommendation system drives about 80% of what people watch. Spotify's Discover Weekly has generated over a billion playlists. A widely cited McKinsey estimate credits recommendations with roughly 35% of what people buy on Amazon.
Recommendation algorithms predict what you'll like based on what you and others have done before.
The two fundamental approaches
Most recommendation systems are built on one (or both) of two ideas:
1. Collaborative filtering: "People like you liked this"
The logic: if you and I both loved The Office, Breaking Bad, and Better Call Saul — and I also loved Succession — there's a good chance you'll like Succession too.
You don't need to know why we have similar taste. You just need to notice that our patterns match.
┌─────────────────────────────────────────────────────────────┐
│                                                             │
│  COLLABORATIVE FILTERING                                    │
│                                                             │
│  You watched:   ✅ Breaking Bad  ✅ The Office  ✅ Fargo    │
│  User #48291:   ✅ Breaking Bad  ✅ The Office  ✅ Fargo    │
│                                                             │
│  User #48291 also loved: Succession, The Bear               │
│                                                             │
│  → Recommendation: "You might like Succession" 🎬           │
│                                                             │
└─────────────────────────────────────────────────────────────┘
Strengths: Discovers surprising recommendations. You might get recommended a cooking show because thriller fans weirdly love cooking shows — the algorithm doesn't need to understand why.
Weakness: The cold start problem. New users have no history, and new items have no ratings. The system can't recommend what nobody's tried yet.
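To make the "patterns match" idea concrete, here is a minimal sketch of user-based collaborative filtering. It assumes a toy dictionary of watch histories and uses Jaccard overlap as the similarity measure; the user IDs, the `histories` data, and the `recommend` helper are all made up for illustration, and production systems use far more data and more sophisticated models.

```python
from collections import Counter

# Hypothetical watch histories: user id -> set of titles they loved.
histories = {
    "you": {"Breaking Bad", "The Office", "Fargo"},
    "user_48291": {"Breaking Bad", "The Office", "Fargo", "Succession", "The Bear"},
    "user_10007": {"The Office", "Parks and Recreation"},
}


def jaccard(a: set, b: set) -> float:
    """Overlap between two sets: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user: str, histories: dict, top_n: int = 3) -> list:
    """Score unseen titles by the similarity of the users who loved them."""
    seen = histories[user]
    scores = Counter()
    for other, titles in histories.items():
        if other == user:
            continue
        similarity = jaccard(seen, titles)
        if similarity == 0:
            continue  # no shared taste, so this user tells us nothing
        for title in titles - seen:
            scores[title] += similarity
    # Sort by score (descending), then by title so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]


print(recommend("you", histories))
# ['Succession', 'The Bear', 'Parks and Recreation']
```

Note that the code never looks at genres or descriptions, only at who watched what. It also shows the cold start problem directly: a user with an empty history has zero overlap with everyone, so `recommend` returns an empty list.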
2. Content-based filtering: "This is similar to what you liked"
Instead of looking at other people, this approach looks at the items themselves. If you liked an action movie with a female lead set in space, it recommends other action movies with female leads set in space.
┌─────────────────────────────────────────────────────────────┐
│                                                             │
│  CONTENT-BASED FILTERING                                    │
│                                                             │
│  You liked: "Arrival"                                       │
│                                                             │
│  Arrival features:                                          │
│    Genre: Sci-fi ✓          Tone: Thoughtful ✓              │
│    Theme: Communication ✓   Pace: Slow burn ✓               │
│                                                             │
│  Similar movies:                                            │
│    Interstellar (sci-fi, thoughtful, slow burn)             │
│    Annihilation (sci-fi, thoughtful, mysterious)            │
│    Contact (sci-fi, communication, slow burn)               │
│                                                             │
└─────────────────────────────────────────────────────────────┘
Strengths: Works for new items immediately. Doesn't need other people's data. Transparent — you can explain why something was recommended.
Weakness: Rarely surprises you. If you only watch sci-fi, you mostly get recommended more sci-fi. It creates a filter bubble.
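Content-based filtering can be sketched in a few lines if each item is reduced to a set of tags. The example below is an illustration under that assumption: the `catalog` tags are hand-written, and cosine similarity over binary tag vectors is just one reasonable choice of similarity measure.

```python
import math

# Hypothetical hand-assigned tags; a real catalog would have far richer features.
catalog = {
    "Arrival": {"sci-fi", "thoughtful", "communication", "slow burn"},
    "Interstellar": {"sci-fi", "thoughtful", "slow burn"},
    "Annihilation": {"sci-fi", "thoughtful", "mysterious"},
    "Contact": {"sci-fi", "communication", "slow burn"},
    "Transformers": {"action", "sci-fi", "fast-paced"},
    "Paddington": {"family", "comedy"},
}


def cosine(a: set, b: set) -> float:
    """Cosine similarity between two tag sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def similar_items(liked: str, catalog: dict, top_n: int = 3) -> list:
    """Rank every other item by tag similarity to the liked item."""
    target = catalog[liked]
    scored = [
        (cosine(target, tags), title)
        for title, tags in catalog.items()
        if title != liked
    ]
    # Highest similarity first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for score, title in scored[:top_n] if score > 0]


print(similar_items("Arrival", catalog))
# ['Contact', 'Interstellar', 'Annihilation']
```

No other users appear anywhere in this code, which is why a brand-new item can be recommended as soon as it has tags. The filter bubble is visible too: an item that shares no tags with what you liked scores zero and never shows up.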
How the big players do it
Netflix: The taste algorithm
Netflix doesn't just track what you watch. It tracks:
- What you watch completely vs. abandon
- When you pause, rewind, or fast-forward
- What time of day you watch
- What you watch after what
- How long you browse before picking something
- Which thumbnail made you click
Netflix tags its catalog with thousands of micro-genres like "Suspenseful Sci-Fi featuring a Strong Female Lead" or "Quirky British Comedies with Dry Humor," and groups viewers with similar habits into "taste communities."
Netflix's personalized thumbnails: Netflix shows different thumbnails for the same movie to different users. If you watch a lot of romantic films, you see a thumbnail emphasizing the romance angle. Action fan? You see the explosion scene. Same movie, different marketing — personalized by algorithm.
Spotify: Audio DNA + social signals
Spotify combines three approaches:
- Collaborative filtering: Users who listen to Band A and Band B also listen to Band C.
- Audio analysis: Actual analysis of the music — tempo, key, loudness, energy, danceability. Songs that sound similar get grouped.
- NLP on the internet: Spotify crawls blogs, reviews, and articles about music. If a blog post mentions two artists together, the algorithm notes the connection.
Your Discover Weekly playlist is all three working together.
Amazon: "Frequently bought together"
Amazon's system is almost ruthlessly effective:
- Item-to-item collaborative filtering: "People who bought X also bought Y"
- Purchase sequence prediction: "People who bought a camera usually buy a memory card within 2 weeks"
- Session-based recommendations: "You looked at 3 different laptop bags — here are 5 more"
Amazon famously holds a patent on "anticipatory shipping": moving products to warehouses near you before you buy them. A patent doesn't prove the idea is in production, but it shows how much confidence the company places in its predictions.
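The "people who bought X also bought Y" idea boils down to counting which items show up in the same orders. Here is a rough sketch of that counting step, not Amazon's actual implementation; the `orders` data and function names are invented for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical order history: each order is the set of items bought together.
orders = [
    {"camera", "memory card", "camera bag"},
    {"camera", "memory card"},
    {"camera", "tripod"},
    {"laptop", "laptop bag"},
    {"memory card", "card reader"},
]


def build_co_purchases(orders: list) -> dict:
    """Count, for every item, how often each other item shares an order with it."""
    co_counts = defaultdict(Counter)
    for order in orders:
        for a, b in combinations(sorted(order), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return co_counts


def also_bought(item: str, co_counts: dict, top_n: int = 3) -> list:
    """Items most often bought with `item`, ties broken alphabetically."""
    ranked = sorted(co_counts[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:top_n]]


co_counts = build_co_purchases(orders)
print(also_bought("camera", co_counts))
# ['memory card', 'camera bag', 'tripod']
```

Because the table is built from items rather than users, it can be computed ahead of time and looked up instantly when someone views a product. Raw counts favor bestsellers, so a real system would normalize them, for example by each item's overall popularity.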
The math behind it
At the heart of modern recommendation systems: embeddings.
Every user and every item gets converted into a vector — a list of numbers representing their characteristics. Users and items that are "compatible" end up close together in this mathematical space.
┌─────────────────────────────────────────────────────────────┐
│                                                             │
│  EMBEDDING SPACE (simplified 2D)                            │
│                                                             │
│  Artsy ▲                                                    │
│        │   • You    • Wes Anderson films                    │
│        │            • A24 movies                            │
│        │        • Indie dramas                              │
│        │                                                    │
│        │                                                    │
│        │                      • Marvel movies               │
│        │                          • Michael Bay             │
│        └───────────────────────────────────── Mainstream ►  │
│                                                             │
│  You're close to indie/A24 → recommend those                │
│                                                             │
└─────────────────────────────────────────────────────────────┘
The recommendation becomes a nearest-neighbor search: find the items closest to the user in embedding space.
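Here is what that search looks like in miniature. The coordinates below are invented to mirror the two axes in the diagram (mainstream, artsy); real embeddings are learned from data and typically have tens or hundreds of dimensions.

```python
import math

# Made-up 2D embeddings: (mainstream, artsy), mirroring the diagram's axes.
item_embeddings = {
    "Wes Anderson films": (0.35, 0.90),
    "A24 movies": (0.30, 0.80),
    "Indie dramas": (0.25, 0.70),
    "Marvel movies": (0.75, 0.20),
    "Michael Bay": (0.85, 0.10),
}
user_embedding = (0.15, 0.90)


def nearest_items(user: tuple, items: dict, k: int = 3) -> list:
    """Exact k-nearest-neighbor search by Euclidean distance."""
    by_distance = sorted(items, key=lambda name: math.dist(user, items[name]))
    return by_distance[:k]


print(nearest_items(user_embedding, item_embeddings))
# ['A24 movies', 'Wes Anderson films', 'Indie dramas']
```

This version compares the user against every item, which is fine for five items and hopeless for millions. Large systems generally switch to approximate nearest-neighbor indexes, and many use dot product or cosine similarity instead of Euclidean distance.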
The dark patterns
Recommendation algorithms aren't just helpful. They're designed to maximize engagement, and that creates problems:
Filter bubbles. You only see content that matches your existing preferences. Your worldview narrows. Politically, this can be polarizing.
Addictive loops. YouTube's recommendation algorithm has been widely criticized for steering users toward increasingly extreme content, because extreme content tends to get more engagement.
Popularity bias. Popular items get recommended more, which makes them more popular, which gets them recommended more. New creators struggle to break through.
Manipulation. Knowing what you're likely to click means knowing how to manipulate you. The line between "helpful recommendation" and "manipulation" is blurry.
The YouTube rabbit hole: Around 2019, researchers and journalists reported that YouTube's algorithm could take a user from a mainstream news video to conspiracy content in just a few clicks. Each recommendation was slightly more extreme than the last, optimizing for watch time at each step. Later studies have disputed how strong this effect is.
YouTube has since changed its algorithm, but the episode illustrates the risk: optimizing for engagement isn't the same as optimizing for user wellbeing.
Building a recommendation system
If you're building one, here's the modern approach:
Stage 1: Candidate generation. From millions of items, narrow down to hundreds of potential recommendations using fast, approximate methods.
Stage 2: Ranking. Use a more sophisticated model to rank those hundreds. Consider context: time of day, device, recent activity, user history.
Stage 3: Re-ranking. Apply business rules. Ensure diversity (don't show 10 horror movies in a row). Remove content the user has already seen. Apply freshness boosts.
Stage 4: Serve and learn. Show the recommendations. Track what the user clicks, watches, skips. Feed that data back into the model.
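The first three stages can be sketched as three small functions chained together. This is a toy under stated assumptions: `popularity` stands in for a fast retrieval signal, `affinity` stands in for a ranking model's score, and the diversity rule is a simple per-genre cap. Stage 4 (logging and retraining) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    title: str
    genre: str
    popularity: float  # stand-in for a cheap retrieval signal
    affinity: float    # stand-in for a ranking model's predicted score


def generate_candidates(catalog: list, limit: int) -> list:
    """Stage 1: cheap filter, here just the most popular items."""
    return sorted(catalog, key=lambda item: -item.popularity)[:limit]


def rank(candidates: list) -> list:
    """Stage 2: order candidates by the (pretend) model score."""
    return sorted(candidates, key=lambda item: -item.affinity)


def rerank(ranked: list, seen: set, max_per_genre: int, k: int) -> list:
    """Stage 3: drop already-seen items and cap how many share a genre."""
    per_genre = {}
    result = []
    for item in ranked:
        if item.title in seen:
            continue
        if per_genre.get(item.genre, 0) >= max_per_genre:
            continue
        per_genre[item.genre] = per_genre.get(item.genre, 0) + 1
        result.append(item)
        if len(result) == k:
            break
    return result


def recommend(catalog: list, seen: set, k: int = 3) -> list:
    candidates = generate_candidates(catalog, limit=5)
    final = rerank(rank(candidates), seen, max_per_genre=2, k=k)
    return [item.title for item in final]


catalog = [
    Item("Hereditary", "horror", 0.90, 0.95),
    Item("The Conjuring", "horror", 0.80, 0.90),
    Item("It Follows", "horror", 0.70, 0.85),
    Item("Get Out", "horror", 0.95, 0.99),
    Item("Arrival", "sci-fi", 0.60, 0.60),
    Item("Obscure Short", "drama", 0.10, 0.97),
]

print(recommend(catalog, seen={"Get Out"}))
# ['Hereditary', 'The Conjuring', 'Arrival']
```

Two things are worth noticing. The genre cap pushes out a third horror title in favor of "Arrival" even though it scored lower. And "Obscure Short" never reaches the ranker despite its high affinity, because the cheap first stage filtered it out: speed in stage 1 is bought with some missed items.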
Evaluation: How do you know if recommendations are good?
This is harder than it seems. Common metrics:
- Click-through rate: Did they click? (But clicking isn't the same as enjoying)
- Completion rate: Did they finish the movie? (Better signal)
- Diversity: Are recommendations varied enough?
- Serendipity: Are there pleasant surprises?
- User satisfaction: Survey data (the gold standard, but expensive)
Netflix learned this the hard way: optimizing purely for predicted ratings didn't work. Users said they liked documentaries, but they watched trashy reality TV. The algorithm had to learn to read behavior, not stated preferences.
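Several of the metrics listed above are simple to compute once you log what was shown and what happened next. The sketch below assumes a hypothetical impression log; the field names, the 90% completion cutoff, and the genre-based diversity measure are all illustrative choices rather than industry standards.

```python
from itertools import combinations

# Hypothetical impression log: one dict per recommendation shown.
impressions = [
    {"title": "Arrival", "genre": "sci-fi", "clicked": True, "watched_fraction": 0.95},
    {"title": "Contact", "genre": "sci-fi", "clicked": True, "watched_fraction": 0.20},
    {"title": "Paddington", "genre": "family", "clicked": False, "watched_fraction": 0.0},
    {"title": "Fargo", "genre": "crime", "clicked": False, "watched_fraction": 0.0},
]


def click_through_rate(log: list) -> float:
    """Share of shown recommendations that were clicked."""
    if not log:
        return 0.0
    return sum(row["clicked"] for row in log) / len(log)


def completion_rate(log: list, threshold: float = 0.9) -> float:
    """Share of clicked items watched past `threshold` (an assumed cutoff)."""
    clicked = [row for row in log if row["clicked"]]
    if not clicked:
        return 0.0
    return sum(row["watched_fraction"] >= threshold for row in clicked) / len(clicked)


def genre_diversity(log: list) -> float:
    """Fraction of item pairs whose genres differ (a crude diversity proxy)."""
    pairs = list(combinations(log, 2))
    if not pairs:
        return 0.0
    return sum(a["genre"] != b["genre"] for a, b in pairs) / len(pairs)


print(click_through_rate(impressions))  # 0.5
print(completion_rate(impressions))     # 0.5
print(round(genre_diversity(impressions), 2))  # 0.83
```

The gap between the first two numbers is the point: "Contact" counts as a success for click-through rate, but the viewer gave up 20% of the way in.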
The future
Recommendation systems are getting smarter:
- Multimodal understanding: Analyzing the actual content of videos, music, and images — not just metadata
- Conversational recommendations: "I want something like Inception but less confusing" → the system understands nuance
- Explainable recommendations: "We're recommending this because you watched 3 time-travel movies last week"
- Ethical constraints: Algorithms designed to avoid rabbit holes, promote diverse viewpoints, and respect user autonomy
The bottom line: Recommendation algorithms are the invisible curators of your digital life. They decide what you watch, listen to, buy, and read. They're powered by the same embedding and collaborative filtering techniques that drive much of modern AI. Understanding them matters — because they understand you better than you might think.
Recommendations rely heavily on understanding similarity. The math behind it: What are Embeddings?