Databases & Data Engineering Mastery
Go beneath ActiveRecord — from SQL foundations to database internals.
A complete databases and data engineering curriculum for an experienced Rails engineer who uses PostgreSQL daily through ActiveRecord but wants to go deep beneath the ORM. Built on 13 owned database books and 1 purchased video course, reinforced with free resources from the PostgreSQL official docs, Use The Index, Luke, and pgexercises.com. The path runs from SQL foundations through database internals to production DBA expertise and data pipelines.
Databases & Data Engineering Media Track
Companion to: DATABASES_MASTERY_CURRICULUM.md
Purpose: Video lectures, YouTube channels, talks, podcasts, and tutorials paired to each module. Databases are best learned by watching experts explain what happens beneath your ActiveRecord queries -- EXPLAIN plans, B-tree walks, WAL internals, and production war stories.
Last updated: April 30, 2026
How to Use This File
- Watch alongside reading. Andy Pavlo's lectures while reading Database Internals. Markus Winand's talks while reading Use The Index, Luke.
- Break weeks. Between modules, watch a Fireship explainer or a production war story talk.
- Mood tags: (Technical), (Inspiring), (Historical), (Fun), (Deep Dive), (Production)
Module 0: SQL Foundations
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| Fireship: SQL Explained in 100 Seconds | Video | 2 min | Fun | Fastest possible SQL overview. |
| SQLBolt | Interactive | 2-4 hrs | Technical | Best interactive SQL tutorial. Do every exercise. |
| CMU Database Systems -- Intro (Andy Pavlo) | Lecture | 1.5 hrs | Technical | THE database course. Start with Lecture 1 for relational model foundations. |
| Fireship: 7 Database Paradigms | Video | 10 min | Fun | Quick tour of database types. Good mental map before going deep. |
Module 1: PostgreSQL Deep Dive
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| Postgres Official YouTube | YouTube Channel | varies | Technical | Official PGConf recordings. Primary source for Postgres talks. |
| Christophe Pettus: PostgreSQL When It's Not Your Job | Talk | 45 min | Production | THE talk for Rails devs who need Postgres skills without becoming a DBA. Watch FIRST. |
| Christophe Pettus: PostgreSQL Worst Practices | Talk | 40 min | Fun | Learn what NOT to do. Memorable and practical. |
| Citus Data Webinars | YouTube Channel | varies | Technical | PostgreSQL scaling, partitioning, and distributed Postgres talks. |
| PGConf Talks | Talks | varies | Deep Dive | Annual PostgreSQL conference recordings. Search by topic as needed. |
Module 2: Schema Design & Data Modeling
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| Database Normalization Explained (Decomplexify) | Video | 15-30 min | Technical | Visual explanation of 1NF through BCNF. Watch before designing any schema. |
| Entity Relationship Diagrams Explained | Video | 20 min | Technical | ER diagrams are the lingua franca of database design conversations. |
| Andy Pavlo: Database Design (CMU 15-445) | Lecture | 1.5 hrs | Deep Dive | Academic treatment of schema design, normalization, and denormalization tradeoffs. |
Module 3: Query Performance & Optimization
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| Markus Winand: Indexes -- The Key to Performance | Talk | 45 min | Technical | By the author of Use The Index, Luke. THE talk on indexing. Watch before reading his book. |
| EXPLAIN ANALYZE Walkthrough (PostgreSQL) | Video | 30 min | Technical | Learn to read query plans. The single most practical database skill. |
| Vlad Mihalcea YouTube | YouTube Channel | 10-30 min each | Deep Dive | JPA/Hibernate focus but database concepts (locking, isolation, indexing) are universal. |
| Markus Winand: Modern SQL | Talk | 40 min | Technical | SQL features most devs don't know exist: window functions, CTEs, lateral joins. |
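
To make the indexing and query-plan material concrete, here is a minimal sketch of how adding an index changes a plan. It uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN purely as a stand-in so the example runs without a server; PostgreSQL's EXPLAIN ANALYZE output looks different and also reports actual row counts and timings. The users table, its columns, and the index name idx_users_email are hypothetical.

```python
import sqlite3


def plan_for(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's query plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail); detail is the readable part.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

query = "SELECT id, name FROM users WHERE email = ?"
params = ("user500@example.com",)

# Without an index on email, the only option is to read every row.
plan_before = plan_for(conn, query, params)

# With the (hypothetical) index in place, the lookup can use it instead.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan_for(conn, query, params)

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan should report a full scan of users and the second a search that names idx_users_email; the exact wording varies by SQLite version. On PostgreSQL the same change typically shows up as a Seq Scan node giving way to an Index Scan or Bitmap Heap Scan, depending on the planner's estimates.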
Module 4: Database Internals
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| CMU 15-445: Database Systems (Andy Pavlo) -- Full Course | Lecture Series | 1.5 hrs each | Deep Dive | THE database internals course. Free on YouTube. Covers storage, B-trees, buffer pools, query processing, concurrency control, and recovery. Watch lectures 3-14 alongside Database Internals (Petrov). |
| Alex Petrov: Database Internals Talks | Talks | 40-60 min | Deep Dive | Database Internals author. Covers storage engines, LSM trees, B-trees. |
| Andy Pavlo: MVCC Explained | Lecture | 1.5 hrs | Deep Dive | How PostgreSQL handles concurrent transactions. Essential for understanding Rails locking. |
Module 5: NoSQL & Polyglot Persistence
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| Fireship: NoSQL Explained in 7 Minutes | Video | 7 min | Fun | Quick NoSQL overview before going deep. |
| MongoDB University | Free Course | varies | Technical | Official MongoDB courses. M001 (Basics) and M320 (Data Modeling) are excellent. |
| Redis University | Free Course | varies | Technical | Official Redis courses. RU101 (Introduction to Redis Data Structures) and RU301 (Running Redis at Scale). |
| Martin Kleppmann: NoSQL and Consistency | Talk | 45 min | Deep Dive | DDIA author explains CAP theorem realities. Cuts through NoSQL marketing. |
Module 6: Data Pipelines & Streaming
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| Martin Kleppmann: Turning the Database Inside-Out | Talk | 45 min | Inspiring | THE talk on event streaming and materialized views. Watch FIRST. Changes how you think about data flow. |
| Kafka Summit Talks | Talks | varies | Technical | Annual Kafka conference. Search for "Kafka 101" for intros, or specific topics. |
| Debezium: Change Data Capture Talks | Talks | 30-45 min | Technical | CDC from PostgreSQL to Kafka. Directly relevant to Rails event sourcing patterns. |
| Martin Kleppmann: Event Sourcing and Stream Processing | Talk | 50 min | Deep Dive | How event logs become the source of truth. |
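
The idea behind these talks, that an append-only log of events is the source of truth and current state is a view derived from it, can be sketched in a few lines. This is a toy in-memory illustration, not Kafka or Debezium code; the Event type, the event kinds, and the account_balances view are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One immutable fact appended to the log (hypothetical shape)."""
    account: str
    kind: str    # "deposited" or "withdrew"
    amount: int  # cents


def apply_event(view: dict[str, int], event: Event) -> None:
    """Update the derived view in place for a single event."""
    delta = event.amount if event.kind == "deposited" else -event.amount
    view[event.account] = view.get(event.account, 0) + delta


def rebuild(log: list[Event]) -> dict[str, int]:
    """Derive the whole view from scratch by replaying the log in order."""
    view: dict[str, int] = {}
    for event in log:
        apply_event(view, event)
    return view


log = [
    Event("alice", "deposited", 10_000),
    Event("bob", "deposited", 5_000),
    Event("alice", "withdrew", 2_500),
]

# The "materialized view": balances computed from the log, never edited directly.
account_balances = rebuild(log)

# A new event is appended to the log, then applied incrementally to the view.
new_event = Event("bob", "withdrew", 1_000)
log.append(new_event)
apply_event(account_balances, new_event)

print(account_balances)
```

Because the view is only ever a function of the log, it can be thrown away and rebuilt at any time, which is the property that makes the log, rather than the table, the authoritative record.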
Module 7: Production DBA Skills
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| PGConf: Production PostgreSQL | Talks | varies | Production | Real production Postgres war stories. Search for replication, backup, monitoring. |
| strong_migrations Gem (Andrew Kane) | Talk/Demo | 20 min | Technical | Safe database migrations in Rails. Directly applicable to your daily work. |
| Zero-Downtime Migrations in Rails | Talks | 30-40 min | Production | How to migrate schemas without taking your app down. |
| Christophe Pettus: PostgreSQL Replication | Talk | 45 min | Deep Dive | Streaming replication, logical replication, and failover explained clearly. |
Module 8: Capstone & Specialization
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| Shopify: Database Scaling Stories | Talk | 30-45 min | Inspiring | How Shopify scaled their database layer. Rails at massive scale. |
| GitHub: MySQL to Vitess Migration | Talk | 40 min | Production | Real migration story at GitHub scale. |
| Stripe: Online Migrations at Scale | Talk | 35 min | Production | How Stripe handles schema changes with zero downtime. |
Podcasts
| Podcast | Focus | Why |
|---|---|---|
| Postgres FM | PostgreSQL | Weekly PostgreSQL podcast. Hosted by Michael Christofides and Nikolay Samokhvalov. Deep Postgres topics. |
| Software Engineering Daily | Database episodes | Search for PostgreSQL, Redis, Kafka, database internals episodes. High quality interviews. |
| Data Engineering Podcast | Data engineering | Covers databases, pipelines, streaming, and data infrastructure. |
| The Changelog | Database episodes | Search for PostgreSQL, SQLite, database-related episodes. |
YouTube Channels (Subscribe)
| Channel | Focus | Why |
|---|---|---|
| CMU Database Group | Database internals | Andy Pavlo's full courses. THE academic database resource. Free. |
| Hussein Nasser | Database engineering | Deep dives on database internals, networking, and backend engineering. Excellent ACID, indexing, and replication content. |
| Fireship | Database overviews | Quick, fun database explainers. Good for initial exposure to new database types. |
| PlanetScale | MySQL, Vitess, scaling | Database scaling content from the Vitess-backed company. |
| Citus Data | PostgreSQL, distributed | PostgreSQL scaling, partitioning, and Citus extension talks. |
Watch one talk per week. Subscribe to Postgres FM. Databases reward depth over breadth -- understand one database engine deeply (PostgreSQL) before branching out.
Databases & Data Engineering Community Guide
Companion to: DATABASES_MASTERY_CURRICULUM.md
Purpose: Newsletters, blogs, forums, conferences, and open-source projects for databases and data engineering. PostgreSQL has the best documentation and one of the most helpful communities in all of software. The database world rewards going deep -- most of what you need is in the Postgres docs and mailing lists.
Last updated: April 30, 2026
Newsletters & Blogs
| Blog/Newsletter | Author/Source | Focus | Why |
|---|---|---|---|
| Postgres Weekly | Cooperpress | PostgreSQL news | THE Postgres newsletter. Subscribe. Weekly roundup of posts, tools, and releases. |
| DB Weekly | Cooperpress | All databases | Broader database newsletter. Good for staying aware of the non-Postgres world. |
| Use The Index, Luke | Markus Winand | SQL indexing | Free online book on SQL indexing and query optimization. THE indexing resource. Read cover to cover. |
| Vlad Mihalcea Blog | Vlad Mihalcea | Database performance | Deep posts on ACID, locking, isolation levels, and connection pooling. JPA-focused but concepts are universal. |
| Craig Kerstiens Blog | Craig Kerstiens | PostgreSQL for app devs | Former Citus/Heroku Postgres. Practical Postgres posts aimed at application developers, not DBAs. |
| Brandur Leach Blog | Brandur Leach | Postgres, Stripe | Beautiful writing about Postgres, transactional workflows, and building reliable systems. His ACID series is essential. |
| Haki Benita | Haki Benita | SQL, query optimization | Django-focused but the SQL and Postgres concepts transfer directly to Rails. Exceptional query optimization posts. |
| Planet PostgreSQL | Community aggregator | PostgreSQL | Aggregates Postgres blog posts from across the community. Good for discovery. |
| Alex Petrov Blog | Alex Petrov | Database internals | Database Internals book author. Deep posts on storage engines, distributed systems, and consensus. |
Forums & Communities
| Community | Platform | Focus | Why Join |
|---|---|---|---|
| r/PostgreSQL | Reddit | PostgreSQL | Active community. Performance questions, version news, tool recommendations. |
| r/Database | Reddit | All databases | Broader database discussions. Good for comparing technologies. |
| r/SQL | Reddit | SQL | SQL help and discussion. Good for tricky query problems. |
| PostgreSQL Mailing Lists | Mailing Lists | PostgreSQL | pgsql-general and pgsql-performance are gold. Core developers answer questions directly. |
| DBA Stack Exchange | Stack Exchange | All databases | High quality Q&A for database administration. Postgres and MySQL dominate. |
| PGSlack (Postgres Slack) | Slack | PostgreSQL | Active Postgres community. Good for real-time help. |
| Redis Community | Various | Redis | Discord, forums, and mailing lists for Redis. |
Conferences (Recordings Available Free)
| Conference | Focus | How to Access | Why |
|---|---|---|---|
| PGConf | PostgreSQL | YouTube (free) | THE PostgreSQL conference. Production talks, internals, new features. |
| PGDay | PostgreSQL | YouTube (free) | Regional PostgreSQL conferences (Europe, Asia, etc.). More intimate, excellent talks. |
| Percona Live | MySQL, PostgreSQL, MongoDB | YouTube (free) | Open-source database conference. Good cross-database perspective. |
| CMU Database Symposium | Database research | YouTube (free) | Academic database research. Andy Pavlo hosts. Cutting-edge database ideas. |
| DataCouncil | Data engineering | YouTube (free) | Data engineering conference. Pipelines, streaming, and infrastructure talks. |
| Kafka Summit | Kafka, streaming | YouTube (free) | Annual Kafka conference. Streaming architecture, CDC, and event-driven design. |
Open-Source Projects to Study
Study these for their architecture, patterns, and how they solve database problems.
Database Engines
| Project | What It Is | Why Study |
|---|---|---|
| PostgreSQL | The database | Read specific subsystems: src/backend/access/heap (heap storage and tuple visibility), src/backend/optimizer (query planning), src/backend/access/transam (transaction management and WAL, which underpin MVCC). Don't try to read all of it. |
| CockroachDB | Distributed SQL | Go-based distributed PostgreSQL-compatible database. Study its Raft implementation and distributed transactions. |
| TiDB | Distributed SQL | MySQL-compatible distributed database. Study its storage engine (TiKV) architecture. |
| Vitess | MySQL sharding | YouTube and GitHub use it. Study its sharding and connection pooling architecture. |
Rails Gems
| Gem | What It Does | Why Study |
|---|---|---|
| strong_migrations | Safe migrations | Prevents dangerous migrations in production. Read the source -- it's a masterclass in Rails migration safety. USE THIS IN EVERY PROJECT. |
| PgHero | Postgres dashboard | Database performance dashboard for Rails. Study how it queries pg_stat_statements and pg_stat_user_tables. |
| Scenic | Database views | Versioned database views in Rails. Study how it manages view migrations. |
| Marginalia | Query comments | Adds comments to SQL queries showing the controller/action. Invaluable for debugging slow queries in production. |
| pg_search | Full-text search | PostgreSQL full-text search for Rails. Study how it builds tsvector queries. |
Tools
| Tool | What It Does | Why Study |
|---|---|---|
| pgcli | Better psql | Auto-completion and syntax highlighting for PostgreSQL CLI. Install and use daily. |
| pgAdmin | Postgres GUI | Official PostgreSQL GUI. Useful for visual query plans and server monitoring. |
| pg_stat_statements | Query statistics | Extension shipped with Postgres. It is not active by default: add it to shared_preload_libraries and run CREATE EXTENSION. THE tool for finding slow queries. Enable in every database. |
| pgbench | Benchmarking | Built-in Postgres benchmarking tool. Use for testing configuration changes and hardware. |
| pgloader | Data migration | Migrate from MySQL, SQLite, or CSV to PostgreSQL. Useful for database migrations. |
People to Follow
Key voices in the database world. Follow their blogs, talks, and social media.
| Person | Known For | Where to Find |
|---|---|---|
| Andy Pavlo | CMU database courses, database research | YouTube (CMU Database Group), Twitter |
| Markus Winand | Use The Index Luke, Modern SQL | modern-sql.com, use-the-index-luke.com |
| Christophe Pettus | PostgreSQL operations, conference talks | PGConf talks, Twitter |
| Craig Kerstiens | PostgreSQL for developers | craigkerstiens.com |
| Brandur Leach | Postgres, reliable systems | brandur.org |
| Alex Petrov | Database Internals book | databass.dev |
| Martin Kleppmann | DDIA, stream processing | martin.kleppmann.com |
| Nikolay Samokhvalov | Postgres FM, Postgres.ai | postgres.fm |
| Andrew Kane | strong_migrations, PgHero, Blazer | GitHub |
Essential Resources
The resources everyone in the database world references.
| Resource | Type | Why |
|---|---|---|
| Use The Index, Luke | Free online book | THE SQL indexing resource. Written by Markus Winand. Covers B-tree indexes, query plans, and optimization across all databases. Read this before any performance work. |
| pgexercises.com | Interactive exercises | Free PostgreSQL exercises. Covers joins, aggregation, window functions, recursive queries. Do all of them. |
| PostgreSQL Official Docs | Documentation | The best database documentation in existence. Not an exaggeration. The chapters on indexing, MVCC, and query planning are better than most textbooks. |
| SQLZoo | Interactive tutorial | Step-by-step SQL tutorial with live exercises. Good for filling gaps in SQL knowledge. |
| Mode Analytics SQL Tutorial | Tutorial | Practical SQL tutorial with real datasets. Covers basics through window functions and performance tuning. |
| CMU 15-445 Course Page | Course materials | Lecture notes, assignments, and projects from THE database internals course. Free. |
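
As a taste of the window-function and recursive-query exercises mentioned above, here is a small sketch. It runs against SQLite through Python's built-in sqlite3 module so it needs no server (window functions require SQLite 3.25 or newer); the queries use standard SQL that should also run on PostgreSQL. The bookings table and its data are hypothetical and are not the pgexercises schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookings (member TEXT, slots INTEGER)")
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?)",
    [("ann", 3), ("bob", 1), ("ann", 2), ("cy", 4), ("bob", 2)],
)

# CTE + window function: total slots per member, ranked from highest to lowest.
ranked = conn.execute(
    """
    WITH totals AS (
        SELECT member, SUM(slots) AS total
        FROM bookings
        GROUP BY member
    )
    SELECT member, total, RANK() OVER (ORDER BY total DESC) AS rnk
    FROM totals
    ORDER BY rnk, member
    """
).fetchall()

# Recursive CTE: generate the integers 1..5 without a source table.
numbers = [
    row[0]
    for row in conn.execute(
        """
        WITH RECURSIVE counter(n) AS (
            SELECT 1
            UNION ALL
            SELECT n + 1 FROM counter WHERE n < 5
        )
        SELECT n FROM counter
        """
    )
]

print(ranked)
print(numbers)
```

The window function adds a rank to each aggregated row without collapsing the result further, and the recursive CTE keeps feeding its own output back in until the WHERE clause stops it. The same pattern extends to walking trees such as comment threads or org charts.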
PostgreSQL has the best documentation of any database. When in doubt, read the docs first. The mailing lists are where core developers answer questions directly -- use them. Subscribe to Postgres FM for weekly listening and Postgres Weekly for weekly reading.