From research prototype to production—design APIs that survive contact with reality

Building a production-ready API for serving machine learning recommendations requires careful consideration of performance, scalability, and maintainability. This step-by-step tutorial explores how the Steam recommender system implements a robust FastAPI service that serves multiple recommendation algorithms whilst maintaining high availability and operational excellence.

Step 1: FastAPI Architecture Foundation

Setting Up the Core Application

Start with a well-structured FastAPI application that establishes clear patterns from the beginning:

# app/main.py
from fastapi import FastAPI, HTTPException, Query
from fastapi.responses import JSONResponse
import logging
 
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
 
app = FastAPI(
    title="Steam Recommendation API",
    description="Graph-based recommendation system for Steam games, friends, and groups",
    version="1.0.0",
)

Key architectural decisions:

  • Centralised logging configuration
  • Descriptive API metadata for automatic documentation
  • Clear version management

The overall layout of the application, as a Mermaid diagram:

graph TD
    A[FastAPI Application] --> B[Exception Handlers]
    A --> C[Health Checks]
    A --> D[Recommendation Endpoints]
    A --> E[Data Endpoints]
    
    D --> F[Games API]
    D --> G[Friends API]
    D --> H[Groups API]
    
    E --> I[Search API]
    E --> J[Top Items API]
    E --> K[User Overview API]

Exception Handling Strategy

Implement comprehensive exception handling at the application level:

@app.exception_handler(Exception)
async def general_exception_handler(request, exc):
    """Handle unexpected exceptions."""
    # exc_info attaches the traceback to the log entry; the client only sees a generic message
    logger.error("Unexpected error: %s", exc, exc_info=exc)
    return JSONResponse(
        status_code=500,
        content={"detail": "Internal server error"}
    )

Benefits:

  • Prevents internal error details from leaking to clients
  • Ensures consistent error response format
  • Comprehensive logging for debugging

Step 2: Pydantic Interface Design

Defining Response Models

Create type-safe interfaces using Pydantic models:

# app/endpoints/interfaces.py
from typing import Any, Dict, List, Optional
from pydantic import BaseModel

class User(BaseModel):
    steamid: int
    personaname: Optional[str] = None
    props: Optional[Dict[str, Any]] = None  # Flexible property container

class Game(BaseModel):
    appid: int
    title: Optional[str] = None
    props: Optional[Dict[str, Any]] = None

class Games(BaseModel):
    games: List[Game]

# Genre and Group are referenced below; they are assumed to be defined
# in the same module in the same style as User and Game (not shown here).
class UserOverview(BaseModel):
    user: User
    top_games: Optional[List[Game]] = None
    top_genres: Optional[List[Genre]] = None
    friends: Optional[List[User]] = None
    top_groups: Optional[List[Group]] = None

Design principles:

  • Optional fields with sensible defaults
  • Flexible props field for extensibility
  • Composite models for complex responses
  • Type safety throughout the API
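
To make the first two principles concrete, here is a minimal sketch, assuming Pydantic is installed. It redefines a `Game` model with `Optional[...]` annotations for the nullable fields and shows a minimal payload being accepted and a malformed one being rejected. The title and the key inside `props` are placeholders, not values from the real dataset.

```python
from typing import Any, Dict, Optional

from pydantic import BaseModel, ValidationError


class Game(BaseModel):
    appid: int
    title: Optional[str] = None             # absent titles default to None
    props: Optional[Dict[str, Any]] = None  # free-form extras


# Only the required field is supplied; optional fields fall back to None.
minimal = Game(appid=10)

# Extra attributes travel in `props` without a schema change.
# "example_key" is a placeholder name used only for this illustration.
detailed = Game(appid=20, title="Example Game", props={"example_key": 0.5})

# A non-numeric appid fails validation instead of reaching business logic.
try:
    Game(appid="not-a-number")
    rejected = False
except ValidationError:
    rejected = True
```

The trade-off of a free-form `props` dictionary is that its contents are not validated, so anything clients depend on is better promoted to a typed field.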

Input Validation Patterns

Use FastAPI’s built-in validation:

@app.get("/v1/recommendations/games/{steamid}", response_model=Games)
def get_game_recommendations(
    steamid: str, 
    model: ModelNameApp, 
    n: int = Query(default=10, ge=1, le=100, description="Number of recommendations")
) -> Games:
    """Get game recommendations for a user."""

Validation features:

  • Path parameter validation
  • Query parameter constraints (ge, le)
  • Enum validation for model names
  • Automatic OpenAPI documentation
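
The enum behind the `model` parameter is not shown in this tutorial. The sketch below is a hypothetical reconstruction that lists only a subset of the member names appearing in the article, together with a small helper (`parse_model_name`, also hypothetical) that mimics the accept-or-reject decision FastAPI makes for an enum-typed parameter before the endpoint body runs.

```python
from enum import Enum


class ModelNameApp(str, Enum):
    """Hypothetical sketch: a subset of the model names used in this article."""

    apps_collaborative_item_based = "apps_collaborative_item_based"
    apps_mf_als_weighted = "apps_mf_als_weighted"


def parse_model_name(raw: str) -> ModelNameApp:
    """Convert a raw query-string value into an enum member or raise ValueError."""
    try:
        return ModelNameApp(raw)
    except ValueError:
        allowed = ", ".join(member.value for member in ModelNameApp)
        raise ValueError(f"Unknown model {raw!r}; expected one of: {allowed}") from None
```

In the real service this conversion happens inside FastAPI, which answers an unknown value with a 422 validation error, so the endpoint never sees an invalid model name.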

Step 3: Model Factory Pattern

Implementing the Factory

Create a flexible model factory for algorithm selection:

# app/endpoints/recommendations/model_factory.py
from typing import Type
from endpoints.recommendations.base import Recommendation
 
class ModelFactory:
    def __init__(self) -> None:
        # Maps a model name to the recommender class that implements it
        self._model = {}

    def register_model(self, model_name: ModelName, model: Type[Recommendation]) -> None:
        self._model[model_name] = model

    def get_model(self, model_name: ModelName) -> Recommendation:
        model = self._model.get(model_name)
        if model is None:
            raise ValueError(f"No model registered for {model_name}")
        return model()  # a fresh recommender instance per call
 
# Global factory instance
model_factory = ModelFactory()
 
# Register models
model_factory.register_model(
    ModelNameApp.apps_collaborative_item_based,
    RecGamesCollaborativeItemBased
)
model_factory.register_model(
    ModelNameApp.apps_mf_als_weighted,
    RecGamesMFWALS
)

Model Registration Strategy

graph LR
    A[Model Factory] --> B[Content-Based Models]
    A --> C[Collaborative Models]
    A --> D[Matrix Factorisation]
    A --> E[FastRP Models]
    
    B --> F[Simple Content]
    B --> G[KNN Features]
    
    C --> H[Item-Based]
    C --> I[User-Based]
    
    D --> J[ALS/WALS]
    D --> K[BPR/LMF]
    
    E --> L[Direct FastRP]
    E --> M[Item-Based FastRP]

Benefits:

  • Easy addition of new algorithms
  • Runtime model switching
  • Clear separation of concerns
  • Type-safe model instantiation
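
The following self-contained sketch shows the first two benefits in action. The `Recommendation` base class here is a stand-in (the real interface is not shown in this article), the two stub algorithms are hypothetical, and the returned app IDs are placeholders.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class Recommendation(ABC):
    """Stand-in for the project's base class; the real interface may differ."""

    @abstractmethod
    def get(self, steamid: str, n: int) -> List[int]:
        ...


class ModelFactory:
    def __init__(self) -> None:
        self._models: Dict[str, Type[Recommendation]] = {}

    def register_model(self, model_name: str, model: Type[Recommendation]) -> None:
        self._models[model_name] = model

    def get_model(self, model_name: str) -> Recommendation:
        model = self._models.get(model_name)
        if model is None:
            raise ValueError(f"No model registered for {model_name}")
        return model()


class AscendingStub(Recommendation):
    """Hypothetical algorithm returning placeholder app IDs 1..n."""

    def get(self, steamid: str, n: int) -> List[int]:
        return list(range(1, n + 1))


class DescendingStub(Recommendation):
    """Hypothetical algorithm returning placeholder app IDs n..1."""

    def get(self, steamid: str, n: int) -> List[int]:
        return list(range(n, 0, -1))


factory = ModelFactory()
factory.register_model("ascending", AscendingStub)    # adding an algorithm is one line
factory.register_model("descending", DescendingStub)


def recommend(model_name: str, steamid: str, n: int) -> List[int]:
    """Pick the algorithm at request time, as the endpoint does."""
    return factory.get_model(model_name).get(steamid=steamid, n=n)
```

Because the endpoint only talks to the factory and the base-class interface, registering a new algorithm does not require touching the routing code.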

Step 4: Database Connection Management

Centralised Connection Handling

Implement robust database connectivity:

# app/endpoints/base.py
import logging
from typing import Any, Dict, List, Optional

from neo4j import Driver
from steam_recsys_common import get_neo4j_connection

logger = logging.getLogger(__name__)

class NeoQuery:
    """Base class for Neo4j database operations."""
    
    def __init__(self) -> None:
        self._conn: Optional[Driver] = None  # created on first use

    @property
    def conn(self) -> Driver:
        """Return the driver, connecting lazily on first access."""
        if self._conn is None:
            self._conn = get_neo4j_connection()
        return self._conn

    def _cypher_to_dict(self, query: str, **params) -> List[Dict[str, Any]]:
        """Execute Cypher query and return results as dictionaries."""
        logger.info(f"Query params: {params}")
        with self.conn.session() as session:
            result = session.run(query, **params)
            return [record.data() for record in result]

Connection Configuration

Use environment variables for configuration:

# Environment variables
NEO4J_HOST=localhost
NEO4J_AUTH=neo4j/password

NOTE: When connecting from within the same Docker Compose network, use the name of the Neo4j service as the hostname: NEO4J_HOST=neo4j

Connection features:

  • Lazy connection initialisation
  • Connection pooling via Neo4j driver
  • Robust retry logic in steam_recsys_common
  • Environment-based configuration
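
The retry logic inside `steam_recsys_common` is not shown in this tutorial. As an illustration only, the sketch below shows one generic way such a helper could look: retry with exponential backoff. The function name `connect_with_retry` and its parameters are assumptions, and the built-in `ConnectionError` stands in for whatever exception the real connection function raises (the Neo4j driver has its own types, such as `neo4j.exceptions.ServiceUnavailable`).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Call `connect` until it succeeds, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of retries: surface the last error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")  # the loop always returns or raises
```

Combined with lazy initialisation, a helper like this lets the API start before the database is ready, which is common when both are launched together in containers.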

Step 5: Recommendation Serving Patterns

Real-Time Query Execution

Implement real-time recommendation generation:

@app.get("/v1/recommendations/games/{steamid}", response_model=Games)
def get_game_recommendations(
    steamid: str, 
    model: ModelNameApp, 
    n: int = Query(default=10, ge=1, le=100)
) -> Games:
    """Get game recommendations for a user."""
    try:
        recommender = model_factory.get_model(model_name=model)
        response = recommender.get(steamid=steamid, n=n)
        return response
    except Exception as e:
        logger.error(f"Failed to get recommendations for {steamid}: {e}")
        raise HTTPException(status_code=500, detail="Failed to generate recommendations")

Bulk Operations Support

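Handle multiple users in a single request. FastAPI matches routes in declaration order, so this route must be registered before /v1/recommendations/games/{steamid}; otherwise "bulk" is captured as a steamid: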

@app.get("/v1/recommendations/games/bulk", response_model=List[Games])
def get_bulk_game_recommendations(
    steamids: str = Query(..., description="Comma-separated Steam IDs"),
    model: ModelNameApp = ModelNameApp.apps_collaborative_item_based_unweighted,
    n: int = Query(default=10, ge=1, le=100)
) -> List[Games]:
    """Get recommendations for multiple users."""
    try:
        steamid_list = [sid.strip() for sid in steamids.split(",") if sid.strip()]
        
        # Validation
        if not steamid_list:
            raise HTTPException(status_code=400, detail="No valid Steam IDs provided")
        if len(steamid_list) > 50:  # Cap the batch size per request (not rate limiting)
            raise HTTPException(status_code=400, detail="Too many Steam IDs (max 50)")
        
        recommender = model_factory.get_model(model_name=model)
        response = recommender.get_many(steamids=steamid_list, n=n)
        return response
    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Failed to get bulk recommendations: {e}")
        raise HTTPException(status_code=500, detail="Failed to generate recommendations")
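
The parsing and validation steps in the bulk endpoint can be pulled into a pure function, which makes them easy to unit test without starting the API. The sketch below is one possible shape; the helper name `parse_steamids` is hypothetical, and the numeric check and de-duplication are additions that the endpoint above does not perform.

```python
from typing import List

MAX_BULK_USERS = 50  # same cap as the endpoint above


def parse_steamids(raw: str, max_ids: int = MAX_BULK_USERS) -> List[str]:
    """Split a comma-separated string into unique, numeric Steam IDs.

    Raises ValueError with a client-facing message when the input is unusable.
    """
    ids = [part.strip() for part in raw.split(",") if part.strip()]
    if not ids:
        raise ValueError("No valid Steam IDs provided")
    # Addition for this sketch: reject anything that is not plain ASCII digits
    bad = [sid for sid in ids if not (sid.isascii() and sid.isdigit())]
    if bad:
        raise ValueError(f"Non-numeric Steam IDs: {', '.join(bad)}")
    # Addition for this sketch: drop duplicates while keeping first-seen order
    unique = list(dict.fromkeys(ids))
    if len(unique) > max_ids:
        raise ValueError(f"Too many Steam IDs (max {max_ids})")
    return unique
```

In the endpoint, a `ValueError` from this helper would be translated into an `HTTPException` with status 400, keeping the HTTP concerns in the route and the parsing rules in plain Python.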

Step 6: Performance Optimisation

Query Template System

Use parameterised queries for performance:

// Example recommendation query template (excerpt: a final RETURN clause is needed to run it)
MATCH (n:USER {steamid:$steamid})-[p:PLAYED]->(a:APP)-[s:__SIMILAR_RELATION__]-(rec:APP)
WHERE NOT EXISTS((n)-[:PLAYED]-(rec)) AND p.playtime_forever > 0
WITH rec, sum(p.playtime_forever * s.score) / sum(p.playtime_forever) as score
ORDER BY score DESC
LIMIT $n

Optimisation strategies:

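  • Query plan caching: parameterised queries let Neo4j reuse a cached execution plan across requests
  • Parameterised values ($steamid, $n) prevent Cypher injection
  • Relationship type templating (__SIMILAR_RELATION__) for model flexibility; the placeholder is replaced in the query text, so its value must come from a fixed, trusted set
  • Efficient filtering with existence checks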
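
Because the relationship type is spliced into the query text rather than passed as a `$parameter`, it does not get the injection protection that parameters provide. A minimal sketch of guarding the substitution with an allow-list follows. The relationship type names and the `render_query` helper are hypothetical; the real names depend on the graph schema.

```python
from typing import FrozenSet

PLACEHOLDER = "__SIMILAR_RELATION__"

# Hypothetical relationship types, used only for this illustration.
ALLOWED_RELATIONS: FrozenSet[str] = frozenset({"SIMILAR_ITEM_CF", "SIMILAR_ALS"})

# Shortened template for the example; $n stays a regular query parameter.
TEMPLATE = "MATCH (a:APP)-[s:__SIMILAR_RELATION__]-(rec:APP) RETURN rec LIMIT $n"


def render_query(template: str, relation: str) -> str:
    """Substitute the relationship type, accepting only known-safe values."""
    if relation not in ALLOWED_RELATIONS:
        raise ValueError(f"Relationship type {relation!r} is not allowed")
    if PLACEHOLDER not in template:
        raise ValueError("Template has no relationship placeholder")
    return template.replace(PLACEHOLDER, relation)


query = render_query(TEMPLATE, "SIMILAR_ALS")
```

Each recommender class can then own one fixed relationship type, so user input never reaches the template step at all.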

Request/Response Flow

sequenceDiagram
    participant Client
    participant FastAPI
    participant ModelFactory
    participant Recommender
    participant Neo4j
    
    Client->>FastAPI: GET /v1/recommendations/games/{steamid}
    FastAPI->>FastAPI: Validate parameters
    FastAPI->>ModelFactory: get_model(model_name)
    ModelFactory->>Recommender: Create instance
    FastAPI->>Recommender: get(steamid, n)
    Recommender->>Neo4j: Execute Cypher query
    Neo4j->>Recommender: Return results
    Recommender->>FastAPI: Format response
    FastAPI->>Client: JSON response

Step 7: Health Monitoring and Observability

Health Check Implementation

Add a lightweight liveness endpoint. It confirms that the process is serving requests but does not check the Neo4j connection:

@app.get("/health")
async def health_check():
    """Health check endpoint."""
    return {"status": "healthy", "service": "steam-recsys-api"}
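
A liveness check like the one above can be complemented by a readiness check that also probes the database. The sketch below keeps the decision logic framework-free so it can be tested without a running Neo4j instance. The function name `readiness_report` and the response fields are assumptions; with the official Neo4j Python driver, the probe passed in could be the driver's `verify_connectivity` method.

```python
from typing import Callable, Dict, Tuple


def readiness_report(check_database: Callable[[], None]) -> Tuple[int, Dict[str, str]]:
    """Return (HTTP status, body) describing whether the database is reachable.

    `check_database` is expected to raise an exception when the probe fails.
    """
    try:
        check_database()
    except Exception:
        # Any failure means "not ready"; details belong in the logs, not the response
        return 503, {"status": "unavailable", "database": "unreachable"}
    return 200, {"status": "ready", "database": "ok"}
```

An endpoint (for example a hypothetical `/ready` route) would call this function and return the body with the given status code, letting an orchestrator stop routing traffic while the database is down.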

Logging Strategy

Configure a consistent log format for the whole application:

import logging
 
# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)
 
# Usage throughout the application
logger.info(f"Processing recommendation request for user {steamid}")
logger.error(f"Database connection failed: {e}")
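
The configuration above produces plain-text lines. If machine-parseable (structured) logs are wanted, one option is a small JSON formatter built on the standard library. This is a sketch, not the project's actual setup: the `JsonFormatter` class, the logger name, and the `steamid` extra field are assumptions chosen for the example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "logger": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record
        if hasattr(record, "steamid"):
            payload["steamid"] = record.steamid
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("steam_recsys_api")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False  # avoid duplicate lines from the root logger

logger.info(
    "Processing recommendation request",
    extra={"steamid": "76561198000000000"},
)
```

Passing the user ID through `extra` rather than interpolating it into the message keeps the message text constant, which makes log entries easier to group and filter.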

Error Handling Patterns

Implement layered error handling:

def get_game_recommendations(steamid: str, model: ModelNameApp, n: int):
    try:
        # Business logic
        recommender = model_factory.get_model(model_name=model)
        response = recommender.get(steamid=steamid, n=n)
        return response
    except ValueError as e:
        # Client error - no model is registered under the requested name
        logger.warning(f"Invalid model requested: {model}")
        raise HTTPException(status_code=400, detail=str(e))
    except ConnectionError as e:
        # Infrastructure error
        logger.error(f"Database connection failed: {e}")
        raise HTTPException(status_code=503, detail="Service temporarily unavailable")
    except Exception as e:
        # Unexpected error
        logger.error(f"Unexpected error: {e}")
        raise HTTPException(status_code=500, detail="Internal server error")

Step 8: Production Deployment Considerations

Docker Configuration

Set up production-ready containerisation:

# api/Dockerfile
FROM python:3.12-slim
 
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
 
COPY app/ ./app/
EXPOSE 8000
 
# Production server
CMD ["gunicorn", "app.main:app", "-w", "1", "-k", "uvicorn.workers.UvicornWorker", "--bind", "0.0.0.0:8000"]

Environment Configuration

Configure for different environments:

# Environment-based configuration
import os
 
class Settings:
    neo4j_host = os.getenv("NEO4J_HOST", "localhost")
    neo4j_auth = os.getenv("NEO4J_AUTH", "neo4j/password")
    log_level = os.getenv("LOG_LEVEL", "INFO")
    max_bulk_users = int(os.getenv("MAX_BULK_USERS", "50"))
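
A variation on the class above is to load the settings through a function that validates values at startup, so a malformed variable fails fast with a clear message. The sketch below uses only the standard library and the same variable names and defaults; the `load_settings` function and the frozen dataclass are assumptions, not the project's actual code.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    neo4j_host: str
    neo4j_auth: str
    log_level: str
    max_bulk_users: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, failing fast on bad values."""
    raw_max = env.get("MAX_BULK_USERS", "50")
    try:
        max_bulk_users = int(raw_max)
    except ValueError:
        raise ValueError(f"MAX_BULK_USERS must be an integer, got {raw_max!r}") from None
    if max_bulk_users < 1:
        raise ValueError("MAX_BULK_USERS must be at least 1")
    return Settings(
        neo4j_host=env.get("NEO4J_HOST", "localhost"),
        neo4j_auth=env.get("NEO4J_AUTH", "neo4j/password"),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        max_bulk_users=max_bulk_users,
    )
```

Accepting the environment as a parameter means tests can pass a plain dictionary instead of patching `os.environ`. The default credentials are convenient for local development only and should always be overridden in production.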

Step 9: API Documentation and Testing

Automatic Documentation

FastAPI generates interactive documentation from the route signatures, response models, and the metadata passed to the application:

# Accessible at /docs for Swagger UI
# Accessible at /redoc for ReDoc
app = FastAPI(
    title="Steam Recommendation API",
    description="""
    Graph-based recommendation system providing:
    - Game recommendations using multiple algorithms
    - Friend and group recommendations
    - Search and discovery endpoints
    """,
    version="1.0.0",
)

Testing Strategy

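Test the endpoints with FastAPI's TestClient. The health check test runs in isolation; the recommendation test needs a reachable Neo4j instance that holds data for the given user: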

# Example test structure
import pytest
from fastapi.testclient import TestClient
from app.main import app
 
client = TestClient(app)
 
def test_health_check():
    response = client.get("/health")
    assert response.status_code == 200
    assert response.json()["status"] == "healthy"
 
def test_game_recommendations():
    response = client.get(
        "/v1/recommendations/games/76561198000000000",
        params={"model": "apps_collaborative_item_based", "n": 5}
    )
    assert response.status_code == 200
    assert "games" in response.json()

Conclusion

Building a production-ready recommendation API requires balancing performance, maintainability, and operational excellence. The Steam recommender system demonstrates how to:

  • Structure APIs using FastAPI’s type system and validation
  • Manage complexity through factory patterns and modular design
  • Ensure reliability with comprehensive error handling and health checks
  • Maintain flexibility with pluggable recommendation algorithms
  • Support operations through logging, documentation, and monitoring