Developer-Focused Data Science Tutorials

Welcome to a collection of hands-on, developer-friendly tutorials covering recommender systems, geospatial data science, retrieval-augmented generation (RAG), and PostgreSQL full-text search. These guides focus on practical implementations using open-source tools and modern development practices.

Recommender Systems

Production-scale graph-based recommendation engines

Build sophisticated recommender systems using Neo4j, supporting multiple algorithms from content-based filtering to deep learning approaches.

Key Topics:

Multi-Modal Architecture - Unified API serving 15+ algorithms with microservices design
Graph Database Design - Schema patterns for performance and algorithm flexibility
Data Import Strategies - Transactional vs. bulk import, with a performance comparison
Content-Based Filtering - Feature engineering and vector similarity with Neo4j GDS

Perfect for: ML engineers, data scientists, and backend developers building recommendation engines at scale
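
The content-based filtering topic above comes down to comparing item feature vectors. The following is a minimal sketch in plain Python, not the tutorials' own code: it does not call Neo4j GDS, and the item names, the four genre features, and the helper names `cosine_similarity` and `most_similar` are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)

def most_similar(target, items, k=2):
    """Rank every other item by similarity to `target`, best first."""
    scores = [
        (name, cosine_similarity(items[target], vec))
        for name, vec in items.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical one-hot genre features: [action, comedy, sci-fi, romance]
ITEMS = {
    "Movie A": [1, 0, 1, 0],
    "Movie B": [1, 0, 1, 1],
    "Movie C": [0, 1, 0, 1],
}

if __name__ == "__main__":
    for name, score in most_similar("Movie A", ITEMS):
        print(f"{name}: {score:.3f}")
```

In this toy data, "Movie B" shares two genres with "Movie A" and ranks first, while "Movie C" shares none and scores zero. In a graph database the same comparison would typically run over feature vectors stored as node properties.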

Geospatial Data Science

PostGIS, Docker, GDAL, and spatial analysis workflows

Build robust geospatial data pipelines and perform advanced spatial analysis using industry-standard open-source tools.

Key Topics:

Docker-based Geospatial Stack - Set up PostGIS, GDAL, and Jupyter in minutes
DBSCAN Clustering - Density-based spatial clustering that handles irregularly shaped clusters
PostGIS DBSCAN - Run clustering directly in SQL without Python
OpenStreetMap & Overpass API - Extract custom geographic data with precision
Data Conversion with ogr2ogr - Convert between geospatial formats and import to PostGIS

Perfect for: GIS analysts, data scientists, and developers working with location-based applications
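
To make the DBSCAN topic above concrete, here is a small, dependency-free sketch of the algorithm in Python. It assumes planar coordinates and Euclidean distance; real longitude/latitude data would need a projected coordinate system or a great-circle distance instead. The sample points and the names `region_query` and `dbscan` are illustrative only, not taken from the tutorials.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also core: keep growing
    return labels

# Two dense groups and one isolated point (made-up coordinates).
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10),
          (50, 50)]

if __name__ == "__main__":
    print(dbscan(POINTS, eps=1.5, min_pts=3))
```

With `eps=1.5` and `min_pts=3`, the first four points form one cluster, the next three form a second, and the isolated point is labeled as noise. PostGIS offers the same algorithm as the window function `ST_ClusterDBSCAN`, which is what makes SQL-only clustering possible.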

RAG on a Web Domain

Chat with entire websites using open-source AI tools

Build a full-stack RAG pipeline that crawls, embeds, and enables conversational interactions with any website’s content.

Key Components:

Quick Start Guide - Complete RAG pipeline overview
Crawl4AI Implementation - Domain-aware web crawling and content extraction
n8n & Supabase Setup - Workflow automation and vector storage
Self-hosted LLM Deployment - Run your own models on DigitalOcean

Perfect for: Developers building AI-powered knowledge bases, chatbots, or content discovery systems

PostgreSQL Full Text Search

Powerful search without external dependencies

Master PostgreSQL’s built-in full-text search capabilities to implement sophisticated search functionality directly in your database.

What You’ll Learn:

FTS Fundamentals - Complete guide to PostgreSQL search features
Hands-on Tutorial - Practical examples with real-world datasets
Advanced indexing strategies using GIN
Weighted search across multiple fields
Result ranking with ts_rank and ts_rank_cd
Performance optimization techniques

Perfect for: Backend developers, database architects, and teams wanting to avoid external search infrastructure
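
The weighted search, GIN indexing, and `ts_rank` items above fit together roughly as follows. This is a hedged sketch rather than the tutorials' own code: the SQL is held in Python strings so the example runs without a database, the table and column names (`articles`, `title`, `body`, `search_vector`) and the helper `build_search` are hypothetical, and the generated column assumes PostgreSQL 12 or later.

```python
# One-time setup: a weighted tsvector column plus a GIN index on it.
# Title matches get weight 'A' and body matches 'B', so titles rank higher.
SETUP_SQL = """
ALTER TABLE articles ADD COLUMN search_vector tsvector
    GENERATED ALWAYS AS (
        setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(body, '')), 'B')
    ) STORED;
CREATE INDEX articles_search_idx ON articles USING GIN (search_vector);
"""

# Ranked search. The %(name)s placeholders follow the DB-API "pyformat"
# style, which drivers such as psycopg accept.
SEARCH_SQL = """
SELECT id, title, ts_rank(search_vector, query) AS rank
FROM articles, plainto_tsquery('english', %(q)s) AS query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT %(limit)s;
"""

def build_search(q, limit=10):
    """Return (sql, params) for a ranked full-text search."""
    if not q.strip():
        raise ValueError("search text must not be empty")
    return SEARCH_SQL, {"q": q, "limit": limit}

if __name__ == "__main__":
    sql, params = build_search("graph database", limit=5)
    print(sql.strip())
    print(params)
```

Passing the search text as a parameter, rather than formatting it into the SQL string, leaves escaping to the driver. Swapping `ts_rank` for `ts_rank_cd` would rank by cover density instead, which also takes the proximity of matching terms into account.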

Why These Tutorials?

Developer-First Approach: Every tutorial includes working code, Docker configurations, and real-world examples you can run immediately.

Open-Source Focus: No vendor lock-in. All tutorials use free, open-source tools that you can deploy anywhere.

Production-Ready: Techniques and patterns that scale from prototypes to production systems.

Modern Tooling: Docker, Git workflows, and cloud deployment strategies integrated throughout.

Getting Started

Each tutorial series is self-contained with its own setup instructions. Choose based on your current project needs:

Need spatial analysis? → Start with Geospatial Stack
Building AI applications? → Jump to RAG Quickstart
Want better search? → Begin with FTS Tutorial

All code examples, Docker configurations, and sample datasets are available in the linked GitHub repositories.

Dr. Riccardo Scott
