Architecture Overview
Recommender systems power everything from Netflix to Spotify. But what happens when you need to compare fundamentally different approaches—content-based filtering, collaborative filtering, graph embeddings, and deep learning—all within a single production system?
This is the story of building a multi-modal Steam game recommender that serves 15+ different algorithms through a unified API, processing millions of user-game interactions in a graph database.
The Core Challenge: One System, Multiple Paradigms
Most recommendation tutorials focus on a single approach. Real production systems need to experiment with many. Our architecture handles this by creating a modular pipeline where each recommendation paradigm can coexist:
Raw Steam Data (PostgreSQL)
↓
ETL Pipeline (to_graph service)
↓
Neo4j Graph Database
↓
Model Training (model service)
↓
FastAPI Serving Layer (api service)
↓
REST Recommendations
The beauty lies in the separation of concerns. Each service handles one responsibility, yet they compose into a system that can serve content-based recommendations at 9am and deep learning embeddings at 9:01am.
Service Architecture: Microservices
The ETL Foundation
The pg_persistor service handles raw Steam API ingestion. Nothing fancy: just reliable PostgreSQL storage for millions of user interactions, game metadata, and social connections.
The to_graph service transforms this relational data into Neo4j’s graph format. The system offers two approaches, each optimised for different scenarios.
Transaction-Based Import (Live Data)
def transfer_data_batch(self, query_pg: str, query_neo4j: str, data_type: str):
    for batch in pd.read_sql(query_pg, con=self.pg_connection, chunksize=10_000):
        records = batch.to_dict(orient="records")
        with self.graph.session() as session:
            session.run(query_neo4j, records=records)
This approach streams data directly from PostgreSQL to Neo4j in 10K record batches. It’s perfect for incremental updates and smaller datasets where you need transaction guarantees.
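The query_neo4j argument is a parameterised Cypher statement that unwinds each batch. A minimal sketch of what such a statement might look like, assuming illustrative property names rather than the project's actual schema:

# Hypothetical query_neo4j statement; property names are illustrative only
ownership_query = """
UNWIND $records AS row
MERGE (u:User {user_id: row.user_id})
MERGE (a:App {app_id: row.app_id})
MERGE (u)-[o:OWNS]->(a)
SET o.playtime = row.playtime
"""

Because session.run() receives records=records, the batch arrives in Cypher as the $records parameter, so each chunk is written in a single transaction.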
Parquet-Based Import (Bulk Data)
def write_playtimes():
    df = pl.read_parquet("Games_filtered.parquet")
    for batch in df.iter_slices(n_rows=5_000):
        records = batch.to_dicts()
        with driver.session() as session:
            session.run(playtime_query, records=records)
For initial loads of 50M+ records, the system bypasses PostgreSQL entirely. Raw data gets converted to Parquet files, then loaded using Polars for maximum throughput.
Why Two Approaches? Transaction-based import handles ongoing updates reliably but caps at ~10K records/second. Parquet import hits 50K+ records/second but requires preprocessing.
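The preprocessing is essentially a one-off conversion of the raw export into columnar Parquet. A rough sketch with Polars, assuming a CSV export and illustrative column names (only Games_filtered.parquet comes from the code above):

import polars as pl

# Hypothetical conversion step; the input file and column names are assumptions
df = (
    pl.scan_csv("steam_playtimes_export.csv")
      .filter(pl.col("playtime_forever") > 0)       # drop zero-playtime rows up front
      .select(["user_id", "app_id", "playtime_forever"])
)
df.sink_parquet("Games_filtered.parquet")           # streaming write keeps memory flat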
The Model Training Engine
The model service is where algorithms come alive. Each recommendation approach gets its own module, but they all inherit from a common Model base class:
class Model:
    def _project(self) -> Graph:
        """Create GDS graph projection"""
        # gds.graph.project returns (Graph, result); keep only the Graph handle
        G, _ = self.gds.graph.project(self.proj.graph_name,
                                      self.proj.node_projection,
                                      self.proj.relationship_projection)
        return G

    def _write_sim_to_db(self, G: Graph):
        """Write similarities back to Neo4j"""
        raise NotImplementedError

    def run(self):
        G = self._project()
        self._write_sim_to_db(G)
        self._post_clean()
This pattern means adding a new algorithm is as simple as implementing _write_sim_to_db(). The projection, cleanup, and orchestration are handled automatically.
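As an illustration, a new similarity model could look roughly like this; the class name and the relationship it writes are hypothetical, not code from the repository:

class RecGamesJaccardDemo(Model):
    """Hypothetical subclass: only the similarity-writing step is model-specific."""

    def _write_sim_to_db(self, G: Graph):
        # Persist item-item similarities as relationships the API can read later
        self.gds.nodeSimilarity.write(
            G,
            writeRelationshipType="SIMILAR_JACCARD_DEMO",
            writeProperty="score",
            topK=10,
        )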
The API Layer: 15+ Algorithms in One Factory
The API service uses a factory pattern to serve any trained model:
class ModelFactory:
    def __init__(self):
        self._model = {}

    def register_model(self, model_name: ModelName, model: Type[Recommendation]):
        self._model[model_name] = model

    def get_model(self, model_name: str) -> Recommendation:
        return self._model[model_name]()
# Registration happens at startup
model_factory.register_model("apps_content_based_knn", RecGamesContentBasedKNN)
model_factory.register_model("apps_collaborative_weighted", RecGamesCollaborativeWeighted)
model_factory.register_model("apps_fastrp_direct", RecGamesFastRP)
Now any HTTP client can switch algorithms with a single parameter:
curl "localhost/recommendations/games?user_id=123&model=apps_fastrp_direct"
curl "localhost/recommendations/games?user_id=123&model=apps_content_based_knn"
Four Paradigms, One Graph
Content-Based: Features as Graph Nodes
┌─────────┐ ┌──────────┐ ┌─────────┐
│ User │────▶│ Game │────▶│ Feature │
│ Alice │ OWNS│ Cyberpunk│ HAS │ RPG │
└─────────┘ └──────────┘ └─────────┘
│ ▲
└───────────LIKES───────────────┘
(inherited from owned games)
Most content-based systems use feature matrices. We model features as graph nodes, creating richer representations:
def _prepare_features(self):
    # Games connect to their features (genres, developers)
    self.gds.run_cypher('''
        MATCH (a:App)-[:HAS_GENRE]->(g:Genre)
        SET g:Feature
        MERGE (a)-[:FEATURE]->(g)
    ''')
    # Users inherit features from owned games
    self.gds.run_cypher('''
        MATCH (u:User)-[:OWNS]->(a:App)-[:FEATURE]->(f:Feature)
        WITH u, f, count(a) AS weight
        MERGE (u)-[:LIKES {weight: weight}]->(f)
    ''')
Now users and games exist in the same feature space. KNN similarity becomes a graph algorithm rather than a matrix operation.
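Serving then reduces to a traversal: follow a user's weighted LIKES edges to features, then to games exposing the same features, and rank by accumulated weight. A rough sketch of such a query, with deliberately simple scoring and assumed property names (user_id, name):

# Hypothetical content-based retrieval query; ranks unowned games by shared feature weight
recommendations = gds.run_cypher('''
    MATCH (u:User {user_id: $user_id})-[l:LIKES]->(f:Feature)<-[:FEATURE]-(a:App)
    WHERE NOT (u)-[:OWNS]->(a)
    RETURN a.name AS game, sum(l.weight) AS score
    ORDER BY score DESC
    LIMIT 10
''', params={"user_id": 123})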
Collaborative Filtering: Graph Algorithms at Scale
User-Item Bipartite Graph → Item-Item Similarities
┌───────┐ ┌──────┐ ┌───────┐
│ User1 │───▶│Game A│◀───│ User2 │
└───────┘ └──────┘owns└───────┘
│ ║ │
owns│ similar │owns
▼ ║ ▼
┌──────┐ ═══════╬═══════ ┌──────┐
│Game B│ ║ │Game C│
└──────┘ Jaccard └──────┘
Index
Traditional collaborative filtering computes user-user or item-item similarities in memory. Neo4j GDS can handle graphs with billions of relationships:
# Project the user-item bipartite graph (gds.graph.project returns the Graph plus a result summary)
G, _ = gds.graph.project(
    "user_item",
    ["User", "App"],
    {"PLAYED": {"orientation": "UNDIRECTED"}}
)

# Compute item-item similarities using Jaccard
results = gds.nodeSimilarity.write(
    G,
    writeRelationshipType="ITEM_SIMILAR",
    writeProperty="score",
    similarityCutoff=0.1,
    topK=20
)
The algorithm runs in parallel across the entire graph, writing similarities back as new relationships. No matrices, no memory constraints.
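At request time the API only has to walk those precomputed relationships. A sketch of what that lookup could look like, assuming the similarity was written to a score property as above:

# Hypothetical lookup over the precomputed ITEM_SIMILAR relationships
similar_games = gds.run_cypher('''
    MATCH (u:User {user_id: $user_id})-[:PLAYED]->(owned:App)-[s:ITEM_SIMILAR]->(candidate:App)
    WHERE NOT (u)-[:PLAYED]->(candidate)
    RETURN candidate.name AS game, sum(s.score) AS score
    ORDER BY score DESC
    LIMIT 10
''', params={"user_id": 123})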
FastRP: Universal Embeddings Across Entity Types
Multi-Entity Graph → Unified Embedding Space
┌──────┐ ┌──────┐ ┌───────┐ ┌────────┐
│ User │ │ Game │ │Friend │ │ Group │
└──────┘ └──────┘ └───────┘ └────────┘
│ │ │ │
└─────────┼──────────┼───────────┘
▼FastRP ▼
┌─────────────────────────────────────┐
│ 128-Dimensional Vector Space │
│ [0.2, -0.1, 0.5, ..., 0.3, -0.7] │
│ Cross-Type Similarities │
└─────────────────────────────────────┘
The breakthrough insight: users, games, groups, and friends can all be embedded in the same vector space using Fast Random Projection.
def fastrp(self, G) -> None:
    results = self.gds.fastRP.mutate(
        G,
        embeddingDimension=128,
        randomSeed=42,
        mutateProperty="embedding",
        iterationWeights=[1.0, 1.0, 1.0]  # 3 iterations
    )
This creates 128-dimensional embeddings for every node. Now you can recommend:
- Games similar to other games
- Users similar to other users
- Games similar to users (cross-type recommendations)
- Friends who like similar games
All from the same embedding space.
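Turning those embeddings into relationships is one more GDS call inside the same model class: run kNN over the mutated embedding property and persist the neighbours. A sketch of that step, with illustrative relationship and property names:

# Hypothetical follow-up step: nearest neighbours in the shared embedding space
self.gds.knn.write(
    G,
    nodeProperties=["embedding"],          # the property mutated by FastRP above
    writeRelationshipType="SIMILAR_FASTRP",
    writeProperty="score",
    topK=10,
)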
Deep Learning: Two-Tower Architecture
Two-Tower Neural Architecture
User Features Item Features
┌─────────────┐ ┌─────────────┐
│ Age: 25 │ │ Genre: RPG │
│ Country: US │ │ Price: $60 │
│ Playtime: H │ │ Rating: 9.1 │
└─────────────┘ └─────────────┘
│ │
▼ ▼
┌─────────────┐ ┌─────────────┐
│ User Tower │ │ Item Tower │
│ Neural │ │ Neural │
│ Network │ │ Network │
└─────────────┘ └─────────────┘
│ │
▼ ▼
┌─────────────┐ ┌─────────────┐
│User Embedding│ │Item Embedding│
│ [64 dims] │ │ [64 dims] │
└─────────────┘ └─────────────┘
│ │
└──────────┬──────────┘
▼
Dot Product
(Recommendation Score)
The PyTorch service handles neural approaches. The two-tower architecture learns separate embeddings for users and items:
class TwoTowerModel(pl.LightningModule):
    def __init__(self, config: ModelConfig):
        super().__init__()
        self.user_tower = FeatureLayer(config.user_features)
        self.item_tower = FeatureLayer(config.item_features)

    def forward(self, batch):
        user_emb = self.user_tower(batch['user_features'])
        item_emb = self.item_tower(batch['item_features'])
        return torch.mm(user_emb, item_emb.t())  # Dot product

    def training_step(self, batch, batch_idx):
        scores = self(batch)
        loss = F.binary_cross_entropy_with_logits(scores, batch['targets'])
        return loss
The model trains on implicit feedback (playtime > 0 = positive, else negative) and can incorporate rich features from the graph database.
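Since forward() returns a batch-by-batch score matrix, one common way to frame those implicit labels is in-batch negatives: each user's own game is the positive, every other game in the batch acts as a negative. A sketch of building such a target matrix; this is an assumption about the training setup, not the project's exact code:

import torch

def build_targets(playtimes: torch.Tensor) -> torch.Tensor:
    """Hypothetical in-batch-negatives targets for a batch of (user, game) pairs.

    playtimes: shape [batch], each user's playtime for their paired game.
    Returns:   shape [batch, batch], 1.0 on the diagonal where playtime > 0, else 0.0.
    """
    positives = (playtimes > 0).float()   # playtime > 0 = positive signal
    return torch.diag(positives)          # off-diagonal pairs serve as negatives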
Docker Orchestration: 7 Services, One Command
The entire system runs with docker-compose up:
services:
  postgres:       # Raw data storage
  neo4j:          # Graph database with GDS plugins
  pg_persistor:   # Data ingestion
  to_graph:       # ETL pipeline
  model:          # Algorithm training
  api:            # FastAPI serving
  pytorch:        # Deep learning experiments
Each service is independently scalable. Need more API throughput? Scale the API service. Training large embeddings? Scale the model service.
When to Use What: Decision Framework
Content-Based excels with:
- Cold start users (no interaction history)
- Explainable recommendations (“Because you liked RPGs”)
- Rich item metadata (genres, developers, tags)
Collaborative Filtering works best with:
- Dense interaction matrices
- Users with established preferences
- Implicit feedback signals (playtime, purchases)
FastRP Embeddings shine for:
- Cross-domain recommendations (games → friends → groups)
- Scalable similarity computation
- Multi-entity recommendation spaces
Deep Learning handles:
- Complex feature interactions
- Large-scale datasets (millions of users)
- Rich side information (user demographics, item features)
The system lets you A/B test these approaches against real user behaviour, not synthetic benchmarks.
Real-World Impact
This architecture powers recommendations for a Steam dataset with:
- 50M+ user-game interactions
- 2M+ unique games
- 200K+ active users
- 15+ different algorithms
Response times stay under 100ms because similarities are pre-computed and cached in Neo4j. The graph database becomes both the feature store and the serving layer.
What’s Next?
The modular design makes extending the system straightforward:
- Add new algorithms by implementing the Model interface
- Incorporate new data sources through the ETL pipeline
- Scale individual services based on demand
- A/B test recommendation strategies in production
Most importantly, you can compare approaches fairly—same data, same evaluation metrics, same serving infrastructure. This is how you move from recommendation research to recommendation systems.