Skip to main content

Scaling Personalized Movie Recommendations: Lessons from the Swipe Stack

· 3 min read
Mike
DAGGH Lead Developer

Personalized recommendations are at the heart of modern entertainment apps. But as your user base and content library grow, so do the challenges of serving fast, relevant, and scalable recommendations. In this post, we’ll walk through the evolution of a movie swiping API, highlight common pitfalls, and share best practices for scaling your backend—using real-world code and SQL.

The Problem: Excluding Already-Swiped Movies

Suppose you’re building a Tinder-like movie app. Each user can swipe left, right, or skip on movies. You want to:

  • Show each user only movies they haven’t swiped on.
  • Support filters (genre, year, etc.) and pagination.
  • Keep the API fast, even as your database grows.

A naive approach might look like this:

// 1. Fetch all swiped movie IDs for the user
const { data: userSwipes } = await supabase.from("rec_swipes").select("tmdb_id").eq("user_id", user.id)
const swipedTmdbIds = userSwipes?.map((swipe) => swipe.tmdb_id) || []

// 2. Query movies, excluding those IDs
let moviesQuery = supabase.from("movies").select("*", { count: "exact" }).order("popularity", { ascending: false })
if (swipedTmdbIds.length > 0) {
moviesQuery = moviesQuery.not("tmdb_id", "in", `(${swipedTmdbIds.join(",")})`)
}
// ...add filters, pagination, etc.

This works for small datasets, but as your users and movies grow, it can become a bottleneck.

Why This Doesn’t Scale

  • Large Exclusion Lists: If a user has swiped on thousands of movies, the NOT IN clause grows, making queries slower.
  • Full Table Scans: Even with pagination, the database must scan and filter the entire movies table before returning a page.
  • Unnecessary Columns: Using select("*") fetches all columns, increasing bandwidth and memory usage.

How Big Players (like TMDB) Scale

  • Precomputed Candidate Sets: Instead of filtering the whole table on every request, they periodically generate a pool of candidate movies for each user.
  • Efficient Indexing: Indexes on filter columns (tmdb_id, popularity, etc.) make lookups fast.
  • SQL Subqueries: Use NOT EXISTS or LEFT JOIN to exclude swiped movies directly in the database, not in application code.
  • Selective Columns: Only fetch the fields needed for the UI.

A More Scalable Approach

1. Only Select Needed Columns

let moviesQuery = supabase
.from("movies")
.select("tmdb_id, title, poster_path, popularity, genres, release_date", { count: "exact" })
.order("popularity", { ascending: false })

2. Use SQL NOT EXISTS (or JOIN) for Exclusion

SELECT tmdb_id, title, poster_path, popularity, genres, release_date
FROM movies m
WHERE NOT EXISTS (
SELECT 1 FROM rec_swipes s
WHERE s.user_id = :user_id AND s.tmdb_id = m.tmdb_id
)
ORDER BY m.popularity DESC
LIMIT :limit OFFSET :offset

This lets the database do the heavy lifting, and scales much better than passing a giant exclusion list from your app.

3. Indexing

Make sure you have indexes on rec_swipes.user_id, rec_swipes.tmdb_id, and any filter columns you use frequently.

4. Precompute When Needed

For very large datasets, consider precomputing a set of candidate movies for each user (e.g., nightly jobs, materialized views, or ML models). Serve from this pool and refresh it periodically.

When to Optimize

  • Start simple: For most MVPs and early-stage products, the naive approach is fine.
  • Monitor performance: As your user base and data grow, watch for slow queries and high database load.
  • Refactor as needed: Move to more advanced SQL and precomputation when you see real bottlenecks.

Conclusion

Scaling personalized recommendations is a journey. Start with clear requirements, measure real-world performance, and evolve your backend as your product grows. By understanding the tradeoffs and using the right SQL patterns, you’ll be ready for the next wave of users—without breaking the bank or your app.


Further Reading: