
Recommender Systems

This series is a deep dive into building production-scale recommender systems on graph databases, covering architecture design, data engineering, and multiple algorithmic approaches.

The complete implementation, including matrix factorisation variants and graph integration patterns, lives in the Steam Recommender Systems repository (link to be added once the repository is public).

Articles in this Series

Building a Multi-Modal Recommender System

Architecture Overview

Complete system architecture for serving 15+ different recommendation algorithms through a unified API. Covers microservices design, Docker orchestration, and the decision framework for choosing between content-based filtering, collaborative filtering, FastRP embeddings, and deep learning approaches.

Key Topics: System architecture, microservices, algorithm comparison, production deployment
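
One way to picture a unified API over many algorithms is a registry that maps an algorithm name to a callable with a shared signature. The sketch below is illustrative only: `register`, `recommend`, and the toy `most_popular` recommender are hypothetical names, not taken from the series' codebase.

```python
from typing import Callable, Dict, List

# Shared signature: (user_id, k) -> list of item ids. Hypothetical, for illustration.
Recommender = Callable[[str, int], List[str]]

_REGISTRY: Dict[str, Recommender] = {}


def register(name: str) -> Callable[[Recommender], Recommender]:
    """Decorator that adds a recommender to the registry under `name`."""
    def wrap(fn: Recommender) -> Recommender:
        _REGISTRY[name] = fn
        return fn
    return wrap


@register("most_popular")
def most_popular(user_id: str, k: int) -> List[str]:
    # Toy stand-in: ignores the user and returns a fixed ranking.
    ranking = ["game_a", "game_b", "game_c"]
    return ranking[:k]


def recommend(algorithm: str, user_id: str, k: int = 10) -> List[str]:
    """Single entry point that dispatches to the requested algorithm."""
    try:
        fn = _REGISTRY[algorithm]
    except KeyError:
        raise ValueError(f"unknown algorithm: {algorithm!r}") from None
    return fn(user_id, k)
```

With this shape, an API layer only needs to call `recommend(name, user_id, k)`, and adding an algorithm means registering one more function.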

Graph Database Design for Recommender Systems

Schema Design & Performance

In-depth exploration of Neo4j schema design patterns that enable multiple recommendation algorithms to coexist. Covers performance optimisation, constraint strategies, and schema evolution patterns for production systems.

Key Topics: Database design, schema patterns, performance tuning, graph algorithms
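
As a small illustration of a constraint strategy, uniqueness constraints on node keys can be generated from a list of label/property pairs. This is a sketch that assumes Neo4j 5 constraint syntax; the `uniqueness_constraint` helper and the `User`/`Game` labels with their key properties are hypothetical and may not match the schema used in the article.

```python
def uniqueness_constraint(label: str, prop: str) -> str:
    """Build a Cypher uniqueness constraint for one label/property pair."""
    name = f"{label.lower()}_{prop}_unique"
    return (
        f"CREATE CONSTRAINT {name} IF NOT EXISTS "
        f"FOR (n:{label}) REQUIRE n.{prop} IS UNIQUE"
    )


# Hypothetical node labels and key properties.
for label, prop in [("User", "user_id"), ("Game", "app_id")]:
    print(uniqueness_constraint(label, prop))
```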

Data Import Strategies | Neo4j Driver vs. Admin Bulk Import

Data Engineering & ETL

Comprehensive comparison of transaction-based imports versus bulk CSV imports for Neo4j. Includes performance benchmarks, memory optimisation strategies, and decision frameworks for different data loading scenarios.

Key Topics: Data import, ETL pipelines, performance benchmarking, batch processing
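
To make the transaction-based side of that comparison concrete, a driver import typically sends rows in fixed-size batches, each consumed by one `UNWIND` query. The sketch below shows only the batching step in plain Python; the `batched` helper, the `Game` label, and the `app_id`/`name` properties are assumptions for illustration, and the driver call itself is left as a comment.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List

# Illustrative Cypher: one query consumes a whole batch passed as $rows.
# The label and property names are assumptions, not the series' schema.
MERGE_GAMES = """
UNWIND $rows AS row
MERGE (g:Game {app_id: row.app_id})
SET g.name = row.name
"""


def batched(rows: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield successive lists of at most `size` rows."""
    if size < 1:
        raise ValueError("size must be >= 1")
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


rows = [{"app_id": i, "name": f"game {i}"} for i in range(5)]
for chunk in batched(rows, size=2):
    # With the official Python driver this would be roughly:
    #   session.run(MERGE_GAMES, rows=chunk)
    print(len(chunk))
```

Batch size trades memory per transaction against round trips, which is the kind of knob the benchmarks in the article compare against a bulk CSV import.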

Content-Based Recommendations | From Features to Vectors

Feature Engineering & Vector Similarity

Advanced feature engineering techniques for content-based filtering using graph structures. Covers sparse vector creation, Neo4j GDS integration, and multiple recommendation strategies with real-world performance metrics.

Key Topics: Feature engineering, vector similarity, content filtering, explainable AI
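
As a rough illustration of the vector-similarity idea, the sketch below encodes each game's categorical features as a sparse vector (a dict from feature name to weight) and ranks candidates by cosine similarity. The feature names, game ids, and the helpers `cosine` and `most_similar` are hypothetical; the Neo4j GDS integration covered in the article is not shown here.

```python
import math
from typing import Dict, List, Tuple

SparseVec = Dict[str, float]


def cosine(a: SparseVec, b: SparseVec) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either has zero norm."""
    # Iterate over the smaller vector when computing the dot product.
    if len(a) > len(b):
        a, b = b, a
    dot = sum(w * b.get(f, 0.0) for f, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(
    target: str, catalog: Dict[str, SparseVec], k: int = 5
) -> List[Tuple[str, float]]:
    """Rank every other item in `catalog` by similarity to `target`."""
    scores = [
        (item, cosine(catalog[target], vec))
        for item, vec in catalog.items()
        if item != target
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:k]


# Toy catalog: binary genre/developer features (made-up data).
catalog = {
    "game_a": {"genre:rpg": 1.0, "genre:indie": 1.0, "dev:studio_x": 1.0},
    "game_b": {"genre:rpg": 1.0, "dev:studio_x": 1.0},
    "game_c": {"genre:racing": 1.0},
}
print(most_similar("game_a", catalog, k=2))
```

Because the vectors are keyed by feature name, the features two games share can be listed directly as the explanation for a recommendation.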

Core Technologies

  • Neo4j & GDS: Graph database and data science library for scalable graph algorithms
  • FastAPI: High-performance API serving layer
  • Docker: Containerised microservices architecture
  • PostgreSQL: Raw data storage and ETL source
  • PyTorch: Deep learning models and two-tower architectures
  • Python: Data processing with pandas/polars

System Capabilities

  • Multi-Algorithm Support: 15+ different recommendation approaches
  • Real-Time Serving: Sub-100ms recommendation response times
  • Massive Scale: 50M+ user-game interactions, 200K+ active users
  • Explainable Results: Clear reasoning for content-based recommendations
  • Production Ready: Docker orchestration, monitoring, and A/B testing capabilities

Dataset

All implementations use Steam gaming data featuring:

  • 2M+ unique games with rich metadata
  • Complex user interaction patterns
  • Social relationships and group memberships
  • Multi-modal features (genres, developers, player counts)

This series demonstrates how to move from recommendation research to production recommender systems that scale.