Scaling Personalized Movie Recommendations: Lessons from the Swipe Stack
Personalized recommendations are at the heart of modern entertainment apps. But as your user base and content library grow, so do the challenges of serving fast, relevant, and scalable recommendations. In this post, we’ll walk through the evolution of a movie swiping API, highlight common pitfalls, and share best practices for scaling your backend—using real-world code and SQL.
The Problem: Excluding Already-Swiped Movies
Suppose you’re building a Tinder-like movie app. Each user can swipe left, right, or skip on movies. You want to:
- Show each user only movies they haven’t swiped on.
- Support filters (genre, year, etc.) and pagination.
- Keep the API fast, even as your database grows.
A naive approach might look like this:
// 1. Fetch all swiped movie IDs for the user
const { data: userSwipes } = await supabase.from("rec_swipes").select("tmdb_id").eq("user_id", user.id)
const swipedTmdbIds = userSwipes?.map((swipe) => swipe.tmdb_id) || []
// 2. Query movies, excluding those IDs
let moviesQuery = supabase.from("movies").select("*", { count: "exact" }).order("popularity", { ascending: false })
if (swipedTmdbIds.length > 0) {
moviesQuery = moviesQuery.not("tmdb_id", "in", `(${swipedTmdbIds.join(",")})`)
}
// ...add filters, pagination, etc.
This works for small datasets, but as your users and movies grow, it can become a bottleneck.
Why This Doesn’t Scale
- Large Exclusion Lists: If a user has swiped on thousands of movies, the
NOT INclause grows, making queries slower. - Full Table Scans: Even with pagination, the database must scan and filter the entire
moviestable before returning a page. - Unnecessary Columns: Using
select("*")fetches all columns, increasing bandwidth and memory usage.
How Big Players (like TMDB) Scale
- Precomputed Candidate Sets: Instead of filtering the whole table on every request, they periodically generate a pool of candidate movies for each user.
- Efficient Indexing: Indexes on filter columns (
tmdb_id,popularity, etc.) make lookups fast. - SQL Subqueries: Use
NOT EXISTSorLEFT JOINto exclude swiped movies directly in the database, not in application code. - Selective Columns: Only fetch the fields needed for the UI.
A More Scalable Approach
1. Only Select Needed Columns
let moviesQuery = supabase
.from("movies")
.select("tmdb_id, title, poster_path, popularity, genres, release_date", { count: "exact" })
.order("popularity", { ascending: false })
2. Use SQL NOT EXISTS (or JOIN) for Exclusion
SELECT tmdb_id, title, poster_path, popularity, genres, release_date
FROM movies m
WHERE NOT EXISTS (
SELECT 1 FROM rec_swipes s
WHERE s.user_id = :user_id AND s.tmdb_id = m.tmdb_id
)
ORDER BY m.popularity DESC
LIMIT :limit OFFSET :offset
This lets the database do the heavy lifting, and scales much better than passing a giant exclusion list from your app.
3. Indexing
Make sure you have indexes on rec_swipes.user_id, rec_swipes.tmdb_id, and any filter columns you use frequently.
4. Precompute When Needed
For very large datasets, consider precomputing a set of candidate movies for each user (e.g., nightly jobs, materialized views, or ML models). Serve from this pool and refresh it periodically.
When to Optimize
- Start simple: For most MVPs and early-stage products, the naive approach is fine.
- Monitor performance: As your user base and data grow, watch for slow queries and high database load.
- Refactor as needed: Move to more advanced SQL and precomputation when you see real bottlenecks.
Conclusion
Scaling personalized recommendations is a journey. Start with clear requirements, measure real-world performance, and evolve your backend as your product grows. By understanding the tradeoffs and using the right SQL patterns, you’ll be ready for the next wave of users—without breaking the bank or your app.
Further Reading:
