DevOps & Infrastructure Mastery
From Docker to SRE — ship, scale, and operate production systems.
A complete DevOps and infrastructure curriculum for an experienced Rails engineer who deploys apps but has never truly understood what happens beneath the deploy command. Built on 24 owned books (22 used, 1 duplicate, 1 redundant edition skipped), KodeKloud and Simply AWS subscriptions, and reinforced with the best free documentation on the internet. The path runs from Linux fundamentals through containers, orchestration, cloud platforms, and observability to full SRE practice.
1. Linux Fundamentals & Shell Mastery
2. Networking & Security Foundations
3. Containers & Docker
4. Container Orchestration & Kubernetes
5. Infrastructure as Code
6. Cloud Platforms & Architecture
7. Observability & Monitoring
8. CI/CD, SRE & Production Operations
9. Streaming, Data Infrastructure & Advanced Topics
10. Capstone Projects
DevOps, Infrastructure & Cloud Media Track
Companion to: DEVOPS_INFRA_MASTERY_CURRICULUM.md
Purpose: Video lectures, YouTube channels, talks, podcasts, and documentaries paired to each module. DevOps is best learned by watching production war stories, conference talks, and practitioners explain why they made the decisions they made -- not just how.
Last updated: April 30, 2026
How to Use This File
- Watch alongside labs. Brendan Gregg's flame graph talk while setting up Prometheus. DHH's Kamal talk while deploying your first Rails app to a VPS.
- Break weeks. Between heavy modules (Module 3->4, Module 6->7), watch a documentary or keynote.
- Mood tags: (Technical), (Inspiring), (Historical), (Fun), (Deep Dive), (War Story)
Module 0: Linux Fundamentals & Shell Mastery
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| Revolution OS (documentary) | Documentary | 85 min | Historical | The story of Linux, GNU, and open source. Watch first -- understand the world you're entering. |
| Fireship: Linux in 100 Seconds | Video | 2 min | Fun | Fastest possible Linux overview. |
| NetworkChuck: Linux for Hackers (series) | YouTube Series | 15-30 min each | Fun | Energetic, beginner-friendly Linux tutorials. Good companion to How Linux Works. |
| Brian Kernighan: UNIX -- A History and a Memoir | Talk | 60 min | Historical | A member of the original Bell Labs UNIX group reflects on the design philosophy. Everything in DevOps traces back here. |
| Gary Explains: Linux File System | Video | 15 min | Technical | Visual explanation of the Linux filesystem hierarchy. |
Module 1: Networking & Security Foundations
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| Computerphile: How the Internet Works (series) | YouTube Series | 10-15 min each | Technical | DNS, TCP/IP, TLS explained visually. Watch before reading the networking chapters. |
| LiveOverflow: Binary Exploitation / Security | YouTube Series | 10-20 min each | Deep Dive | Security concepts explained by a practitioner. Relevant to Module 1 security foundations. |
| NetworkChuck: Networking Fundamentals | YouTube Series | 15-20 min each | Fun | Subnetting, DNS, firewalls made approachable. |
| Fireship: SSL/TLS in 100 Seconds | Video | 2 min | Fun | Quick TLS mental model before diving deeper. |
| Julia Evans: Networking Zines | Zines | varies | Fun | Visual, memorable networking concepts. Her "How DNS Works" zine is legendary. |
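The subnetting covered in these resources can be tried out with Python's standard `ipaddress` module. This is a minimal sketch; the `10.0.0.0/16` block and the host address are hypothetical values chosen only for illustration.

```python
import ipaddress

# Hypothetical private address block, chosen only for illustration.
network = ipaddress.ip_network("10.0.0.0/16")

# Split the /16 into /24 subnets and keep the first three.
subnets = list(network.subnets(new_prefix=24))[:3]


def describe(subnet):
    """Return (CIDR string, usable host count) for an IPv4 subnet."""
    # num_addresses includes the network and broadcast addresses,
    # which are conventionally not assigned to hosts.
    return str(subnet), subnet.num_addresses - 2


summary = [describe(s) for s in subnets]

# Membership test: which of the subnets contains a given host?
host = ipaddress.ip_address("10.0.1.42")
owner = next(s for s in subnets if host in s)
```

Working through a few splits like this by hand first, then checking them against the module, is a quick way to confirm the mental model from the videos.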
Module 2: Containers & Docker
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| Fireship: Docker in 100 Seconds | Video | 2 min | Fun | Fastest possible Docker overview. |
| TechWorld with Nana: Docker Tutorial for Beginners | Video | 3 hrs | Technical | THE Docker tutorial on YouTube. Watch alongside Docker Deep Dive. |
| Docker official: DockerCon talks | Talks | 30-45 min each | Technical | Official conference. Watch talks on multi-stage builds and security. |
| Jess Frazelle: Containers Are Not VMs (talk) | Talk | 30 min | Deep Dive | What containers actually are under the hood -- namespaces, cgroups, chroot. |
| Liz Rice: Containers From Scratch (talk) | Talk | 35 min | Deep Dive | Building a container runtime in Go in 30 minutes. Demystifies everything. |
Module 3: Container Orchestration & Kubernetes
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| TechWorld with Nana: Kubernetes Tutorial for Beginners | Video | 4 hrs | Technical | THE K8s tutorial on YouTube. Comprehensive, beginner-friendly. |
| Kelsey Hightower: Keynotes (various) | Talks | 20-45 min each | Inspiring | The most influential voice in Kubernetes. His "No Code" keynote is legendary. |
| KubeCon talks | Talks | 30-45 min each | Technical | Annual CNCF conference. Watch talks on operators, service mesh, and failure stories. |
| Fireship: Kubernetes in 100 Seconds | Video | 2 min | Fun | Quick K8s mental model. |
| Viktor Farcic: DevOps Toolkit K8s series | YouTube Series | 15-30 min each | Technical | Practical K8s patterns from a DevOps practitioner. |
Module 4: Infrastructure as Code
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| HashiCorp: Terraform talks (HashiConf) | Talks | 30-45 min each | Technical | Official Terraform conference talks. Watch the intro and advanced patterns talks. |
| Michael DeHaan: Ansible creator talks | Talk | 40 min | Historical | Ansible's creator explains the philosophy of agentless configuration management. |
| GitOps: Weaveworks and ArgoCD talks | Talks | 30 min each | Technical | GitOps pattern explained -- infrastructure changes via pull requests. |
| Fireship: Terraform in 100 Seconds | Video | 2 min | Fun | Quick IaC mental model. |
| Kelsey Hightower: Infrastructure as Code is Not the Answer | Talk | 25 min | Inspiring | A contrarian take that makes you think about what we are actually trying to solve. |
Module 5: Cloud Platforms & Architecture
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| AWS re:Invent Keynotes | Talks | 60-90 min | Inspiring | Werner Vogels keynotes are legendary. Cloud architecture thinking at the highest level. |
| Fireship: AWS in 100 Seconds | Video | 2 min | Fun | Quick AWS mental model. |
| Adrian Cantrill: AWS courses and talks | YouTube Series | varies | Technical | The best AWS educator. Deep, accurate, and practical. |
| Fireship: Serverless in 100 Seconds | Video | 2 min | Fun | Quick serverless mental model before diving into Lambda. |
| AWS This Is My Architecture (series) | YouTube Series | 5-10 min each | Technical | Real companies explain their AWS architectures. Short and practical. |
Module 6: Observability & Monitoring
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| Brendan Gregg: Linux Performance Tools (talk) | Talk | 45 min | Deep Dive | THE performance talk. Flame graphs, perf, tracing tools. Watch before any monitoring work. |
| Charity Majors: Observability talks | Talks | 30-40 min each | Inspiring | Honeycomb founder. She changed how the industry thinks about observability vs monitoring. |
| OpenTelemetry talks (KubeCon) | Talks | 30 min each | Technical | The future of telemetry. Traces, metrics, logs unified. |
| Brendan Gregg: BPF Performance Analysis | Talk | 50 min | Deep Dive | eBPF for production observability. Advanced but transformative. |
| Grafana: GrafanaCon talks | Talks | 30 min each | Technical | Dashboard design, alerting patterns, Loki and Tempo deep dives. |
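Flame graphs, which feature in the Brendan Gregg talks above, are built from sampled call stacks that are first collapsed into "folded" lines: semicolon-joined frames plus a sample count. The sketch below illustrates that aggregation step only. The function names and sample data are hypothetical, and this is not the code of any real profiling tool.

```python
from collections import Counter


def fold_stacks(samples):
    """Collapse sampled call stacks into 'frame;frame;frame' -> count."""
    folded = Counter()
    for stack in samples:
        # Each sample is a list of frames ordered root-first.
        folded[";".join(stack)] += 1
    return folded


def frame_widths(folded):
    """Fraction of all samples in which each frame appears at any depth."""
    total = sum(folded.values())
    seen = Counter()
    for key, count in folded.items():
        # set() avoids double counting a frame that recurses within one stack.
        for frame in set(key.split(";")):
            seen[frame] += count
    return {frame: n / total for frame, n in seen.items()}


# Hypothetical samples, as a profiler might capture them at fixed intervals.
samples = [
    ["main", "handle_request", "render"],
    ["main", "handle_request", "render"],
    ["main", "handle_request", "query_db"],
    ["main", "gc"],
]

folded = fold_stacks(samples)
widths = frame_widths(folded)
```

The width of a frame in the rendered graph corresponds to the share of samples it appears in, which is why wide frames are the first place to look.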
Module 7: CI/CD, SRE & Production Operations
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| Google SRE: Keys to SRE (Ben Treynor) | Talk | 25 min | Inspiring | The founder of Google SRE explains error budgets, SLOs, and toil elimination. |
| GitHub Universe: CI/CD talks | Talks | 30 min each | Technical | GitHub Actions best practices, reusable workflows, security. |
| DHH: Kamal -- Deploy web apps anywhere (Rails World) | Talk | 30 min | Inspiring | Rails creator explains Kamal -- deploying without Kubernetes. Directly relevant to your stack. |
| Google SRE: Incident Management (talks) | Talks | 30-40 min each | War Story | Blameless postmortems, on-call practices, incident response. |
| Fireship: CI/CD in 100 Seconds | Video | 2 min | Fun | Quick CI/CD mental model. |
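The error budgets and SLOs mentioned above come down to simple arithmetic: the budget is whatever the SLO leaves unpromised over a time window. A minimal sketch, assuming an availability SLO measured over a 30-day window; the function names and figures are illustrative and not taken from the talks.

```python
def error_budget_minutes(slo, window_days=30):
    """Minutes of allowed unavailability for an availability SLO.

    slo is a fraction, e.g. 0.999 for "three nines".
    """
    window_minutes = window_days * 24 * 60
    return (1 - slo) * window_minutes


def budget_remaining(slo, downtime_minutes, window_days=30):
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo, window_days)
    return (budget - downtime_minutes) / budget


# Hypothetical service: 99.9% SLO, 10.8 minutes of downtime so far this window.
budget = error_budget_minutes(0.999)
remaining = budget_remaining(0.999, downtime_minutes=10.8)
```

A 99.9% target over 30 days leaves about 43.2 minutes of budget. Once `remaining` approaches zero, the usual SRE response is to slow feature releases and spend the time on reliability work.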
Module 8: Streaming, Data Infrastructure & Advanced Topics
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| Martin Kleppmann: Turning the Database Inside-Out | Talk | 45 min | Deep Dive | DDIA author explains event streaming as a fundamental architecture shift. Watch FIRST. |
| Kafka Summit talks | Talks | 30-45 min each | Technical | Production Kafka patterns, exactly-once delivery, stream processing. |
| Martin Kleppmann: Event Sourcing and Stream Processing | Talk | 40 min | Deep Dive | Connects event sourcing to stream processing. |
| Fireship: Kafka in 100 Seconds | Video | 2 min | Fun | Quick Kafka mental model. |
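The core idea behind the event sourcing and stream processing talks above is that current state is a fold over an append-only log of events. The sketch below shows that idea in plain Python. The event shapes and account names are hypothetical, and no Kafka client is involved.

```python
from functools import reduce


def apply_event(balances, event):
    """Pure transition function: (state, event) -> new state."""
    kind, account, amount = event
    new_state = dict(balances)  # copy, so earlier states stay intact
    current = new_state.get(account, 0)
    if kind == "deposited":
        new_state[account] = current + amount
    elif kind == "withdrawn":
        new_state[account] = current - amount
    return new_state


# Hypothetical append-only event log; in a real system this would be a topic.
log = [
    ("deposited", "alice", 100),
    ("deposited", "bob", 50),
    ("withdrawn", "alice", 30),
]

# Current state is the fold of every event, starting from empty state.
state = reduce(apply_event, log, {})

# Replaying a prefix of the log reconstructs any earlier point in time.
state_after_two = reduce(apply_event, log[:2], {})
```

Because the log is the source of truth, a new read model can be added later by replaying the same events through a different transition function.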
Module 9: Capstone Projects
| Resource | Type | Duration | Mood | Why |
|---|---|---|---|---|
| SREcon: Production War Stories | Talks | 30-40 min each | War Story | Real incidents at scale. What broke, why, and how they fixed it. |
| Charity Majors: The Future of Ops (talk) | Talk | 35 min | Inspiring | Where DevOps/SRE is heading. Good capstone perspective. |
| KubeCon Failure Stories | Talks | 30 min each | War Story | Kubernetes production failures and lessons learned. |
| DevOpsDays: Ignite Talks | Talks | 5 min each | Fun | Fast-paced, high-energy DevOps talks. Good for capstone inspiration. |
Podcasts
| Podcast | Focus | Why |
|---|---|---|
| Ship It! (Changelog) | DevOps, infrastructure, deployment | Production deployment stories. Practical and grounded. |
| Arrested DevOps | DevOps culture, tools, practices | Long-running DevOps podcast. Good mix of culture and technical content. |
| The New Stack Podcast | Cloud native, Kubernetes, infrastructure | Covers the CNCF ecosystem. Good for staying current. |
| DevOps Paradox | DevOps, SRE, platform engineering | Viktor Farcic and Darin Pope. Opinionated and practical. |
| Software Engineering Daily | Broad engineering, strong infra coverage | Deep-dive interviews on infrastructure topics. Listen selectively. |
YouTube Channels (Subscribe)
| Channel | Focus | Why |
|---|---|---|
| TechWorld with Nana | Docker, K8s, DevOps | THE DevOps tutorial channel. Clear, comprehensive, beginner-friendly. |
| NetworkChuck | Linux, networking, security | Energetic, hands-on tutorials. Makes networking fun. |
| Fireship | Quick overviews, all tech | "In 100 Seconds" series gives you fast mental models for every tool. |
| Adrian Cantrill | AWS, cloud architecture | The most thorough AWS educator on YouTube. |
| DevOps Toolkit (Viktor Farcic) | K8s, GitOps, platform engineering | Opinionated, practical DevOps content with real demos. |
Watch one talk per week. Subscribe to Ship It! and Arrested DevOps. DevOps is best learned by hearing practitioners explain what went wrong in production -- and what they changed.
DevOps, Infrastructure & Cloud Community Guide
Companion to: DEVOPS_INFRA_MASTERY_CURRICULUM.md
Purpose: Newsletters, blogs, forums, conferences, and open-source projects for DevOps and infrastructure. The DevOps community is uniquely cross-disciplinary -- developers, ops engineers, SREs, and platform engineers all share the same spaces. Understanding the community helps you learn idiomatically and stay current in a fast-moving field.
Last updated: April 30, 2026
Newsletters & Blogs
| Blog/Newsletter | Author/Source | Focus | Why |
|---|---|---|---|
| DevOps Weekly | Gareth Rushgrove | DevOps news roundup | THE DevOps newsletter. Subscribe. Weekly curated links on tooling, culture, and practices. |
| SRE Weekly | Lex Neva | Reliability, incidents, SRE | Curated links on reliability engineering. Incident reports, postmortems, and SRE practices. |
| Last Week in AWS | Corey Quinn | AWS news, cloud costs | Hilarious and informative. Nobody covers AWS billing and announcements better. |
| Brendan Gregg's Blog | Brendan Gregg | Performance, observability, eBPF | THE performance engineering blog. Flame graphs, BPF tools, and systems analysis. |
| Charity Majors' Blog | Charity Majors | Observability, engineering management | Honeycomb founder. Changed how the industry thinks about observability. Unfiltered and sharp. |
| Julia Evans (b0rk) | Julia Evans | Linux, networking, systems | Makes complex systems topics accessible through zines and blog posts. Her DNS, networking, and Linux posts are legendary. |
| Kelsey Hightower (social/talks) | Kelsey Hightower | Kubernetes, cloud native | The most influential voice in cloud native. Follow for sharp takes on infrastructure simplicity. |
| Jessie Frazelle's Blog | Jessie Frazelle | Containers, Linux, security | Deep container internals. She famously ran her desktop applications in Docker containers. |
| Cindy Sridharan's Blog | Cindy Sridharan | Distributed systems, observability | Author of "Distributed Systems Observability." Deep, well-researched posts on monitoring and tracing. |
| Cloudflare Blog | Cloudflare | Networking, security, edge computing | Engineering blog that reads like a textbook. Every outage report and technical post is worth reading. |
| Fly.io Blog | Fly.io | Edge computing, deployment, Elixir | Practical infrastructure posts. Their "We Moved to SQLite" and deployment strategy posts are excellent. |
| HashiCorp Blog | HashiCorp | Terraform, Vault, Consul, Nomad | Official source for IaC best practices and infrastructure patterns. |
Forums & Communities
| Community | Platform | Focus | Why Join |
|---|---|---|---|
| r/devops | Reddit | DevOps practices, tools | Large, active community. Career advice, tool comparisons, war stories. |
| r/kubernetes | Reddit | Kubernetes | K8s questions, cluster design, production issues. |
| r/docker | Reddit | Docker | Container questions, Dockerfile optimization, compose patterns. |
| r/aws | Reddit | AWS | AWS architecture, cost optimization, certification study. |
| r/linux | Reddit | Linux | Linux news and discussions. Higher signal than r/linuxquestions. |
| r/sysadmin | Reddit | Systems administration | Production war stories, automation, infrastructure management. |
| CNCF Slack | Slack | Cloud native ecosystem | Official CNCF community. Channels for every CNCF project. |
| Kubernetes Slack | Slack | Kubernetes | Official K8s Slack. #kubernetes-users and #sig-* channels are active. |
| HashiCorp Discuss | Forum | Terraform, Vault, Consul | Official HashiCorp forum. Good for Terraform module questions and patterns. |
| DevOps Discord servers | Discord | DevOps | Multiple active servers. Good for real-time questions and discussions. |
Conferences (Recordings Available Free)
| Conference | Focus | How to Access | Why |
|---|---|---|---|
| KubeCon + CloudNativeCon | Kubernetes, cloud native | YouTube (free) | THE cloud native conference. Thousands of talks on K8s, service mesh, observability, and GitOps. |
| AWS re:Invent | AWS, cloud architecture | YouTube (free) | Massive AWS conference. Werner Vogels keynotes and deep-dive breakout sessions. |
| HashiConf | Terraform, Vault, infrastructure | YouTube (free) | Official HashiCorp conference. IaC patterns, secrets management, service mesh. |
| DevOpsDays | DevOps culture, practices | YouTube (free) | Community-organized, global. Mix of technical talks and open spaces. Every city has its own flavor. |
| SREcon | Site reliability engineering | YouTube (free) | USENIX SRE conference. Deep SRE practices, incident management, reliability. |
| LISA | Systems administration | YouTube (free) | Classic USENIX systems conference. Deep systems administration and engineering. |
| DockerCon | Containers, Docker | YouTube (free) | Official Docker conference. Container best practices, security, and ecosystem. |
| GitOpsCon | GitOps, ArgoCD, Flux | YouTube (free) | Focused on GitOps patterns. Growing quickly. |
| Monitorama | Monitoring, observability | YouTube (free) | THE monitoring conference. Small, focused, high signal. Charity Majors, Brendan Gregg, and community. |
Open-Source Projects to Study
Study these for their architecture, CI/CD pipelines, and operational patterns -- not just their code.
| Project | What It Is | Why Study |
|---|---|---|
| Docker/Moby | Container runtime | THE container project. Study its architecture, image layering, and network drivers. |
| Kubernetes | Container orchestration | Study its controller pattern, reconciliation loops, and API design. The operator pattern is everywhere. |
| Terraform | Infrastructure as code | Study its provider plugin architecture and state management. HCL design decisions. |
| Ansible | Configuration management | Agentless design. Study its module system and playbook patterns. |
| Prometheus | Metrics and alerting | Pull-based monitoring. Study its data model (labels, time series) and PromQL design. |
| Grafana | Visualization and dashboards | Study its plugin architecture and data source abstraction. Dashboard-as-code patterns. |
| ArgoCD | GitOps continuous delivery | Kubernetes-native GitOps. Study its sync engine and application model. |
| Traefik | Reverse proxy, load balancer | Auto-discovery of services. Study its provider pattern and middleware chain. |
| Caddy | Web server with automatic HTTPS | Automatic TLS via Let's Encrypt. Study its module system and Caddyfile design. |
| Kamal | Deploy web apps anywhere | 37signals' deployment tool, built with Rails apps in mind. Study this deeply -- it is your stack's answer to Kubernetes. |
| cert-manager | TLS certificate management for K8s | Automates certificate lifecycle in Kubernetes. Study its issuer/certificate CRD design. |
| Istio | Service mesh | Study its sidecar proxy pattern, traffic management, and mTLS. Complex but influential. |
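The controller pattern and reconciliation loops listed for Kubernetes above follow one idea: compare desired state with observed state, then emit the actions that close the gap. The sketch below illustrates that loop with plain dictionaries. It is not the Kubernetes API; the function names, action tuples, and workload names are all hypothetical.

```python
def reconcile(desired, actual):
    """Return the actions needed to move actual state toward desired state.

    Both arguments map a workload name to a replica count.
    """
    actions = []
    for name, replicas in desired.items():
        have = actual.get(name, 0)
        if have < replicas:
            actions.append(("scale_up", name, replicas - have))
        elif have > replicas:
            actions.append(("scale_down", name, have - replicas))
    # Anything running that is no longer declared gets removed.
    for name, have in actual.items():
        if name not in desired:
            actions.append(("delete", name, have))
    return actions


def apply_actions(actions, actual):
    """Simulate carrying out the actions and return the new observed state."""
    new_state = dict(actual)
    for verb, name, count in actions:
        if verb == "scale_up":
            new_state[name] = new_state.get(name, 0) + count
        elif verb == "scale_down":
            new_state[name] -= count
        elif verb == "delete":
            del new_state[name]
    return new_state


# Hypothetical declared and observed states.
desired = {"web": 3, "worker": 2}
actual = {"web": 1, "cron": 1}

actions = reconcile(desired, actual)
converged = apply_actions(actions, actual)
```

Running `reconcile` again on the converged state returns no actions. That idempotence is what lets a real controller run the same loop continuously and recover from partial failures.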
Certifications Worth Considering
Not required, but useful for credibility when transitioning from pure Rails to infrastructure roles.
| Certification | Provider | Why |
|---|---|---|
| AWS Solutions Architect Associate | AWS | The most recognized cloud cert. Validates your understanding of AWS architecture patterns. |
| CKA (Certified Kubernetes Administrator) | CNCF | Hands-on, practical exam. Proves you can operate K8s clusters, not just talk about them. |
| CKAD (Certified Kubernetes Application Developer) | CNCF | Developer-focused K8s cert. More relevant for a Rails engineer deploying apps to K8s. |
| HashiCorp Terraform Associate | HashiCorp | Validates IaC fundamentals. Good signal for infrastructure-aware application engineers. |
| Linux Foundation Certified System Administrator | Linux Foundation | Proves Linux competency. Hands-on exam. |
People to Follow
Key voices in the DevOps and infrastructure space. Follow on social media and read everything they publish.
| Person | Known For | Where to Follow |
|---|---|---|
| Kelsey Hightower | Kubernetes, cloud native advocacy | GitHub, conference talks |
| Charity Majors | Observability, engineering leadership | charity.wtf, social media |
| Brendan Gregg | Systems performance, flame graphs, eBPF | brendangregg.com |
| Julia Evans | Making systems topics accessible | jvns.ca, zines |
| Jessie Frazelle | Container internals, Linux | blog.jessfraz.com |
| Corey Quinn | AWS, cloud economics, snark | lastweekinaws.com |
| Viktor Farcic | DevOps practices, K8s, GitOps | DevOps Toolkit YouTube |
| Cindy Sridharan | Distributed systems observability | Medium |
| DHH | Rails deployment, Kamal, anti-complexity | world.hey.com/dhh |
| Liz Rice | Container security, eBPF | Conference talks, books |
Essential Talks (Watch These)
The 10 talks everyone in the DevOps and infrastructure world references.
| Talk | Speaker | Year | Focus | Why |
|---|---|---|---|---|
| No Code (keynote) | Kelsey Hightower | 2018 | Kubernetes | Legendary satirical keynote about infrastructure complexity. The punchline lands harder the more you learn. |
| Observability -- A 3-Year Retrospective | Charity Majors | 2021 | Observability | How observability differs from monitoring. Changed the industry vocabulary. |
| Linux Performance Tools (tutorial) | Brendan Gregg | 2015 | Performance | THE systems performance reference talk. Flame graphs and the USE method. |
| Deploying Rails with Kamal (Rails World) | DHH | 2023 | Deployment | Rails creator explains Kamal. Directly relevant to your stack and career. |
| The Rails Doctrine (Omakase) | DHH | Various | Philosophy | Why Rails makes opinionated choices. The Kamal philosophy extends from this. |
| Containers From Scratch | Liz Rice | 2018 | Containers | Builds a container in Go in 30 minutes. Demystifies namespaces and cgroups. |
| 10+ Deploys Per Day: Dev and Ops Cooperation at Flickr (Velocity 2009) | John Allspaw & Paul Hammond | 2009 | DevOps | THE talk that started the DevOps movement. Still relevant. |
| Turning the Database Inside-Out | Martin Kleppmann | 2014 | Streaming | DDIA author on event streaming as architecture. Fundamental shift in how to think about data. |
| 3 Practices for Effective SRE | Google SRE team | Various | SRE | Error budgets, SLOs, and toil elimination from the team that invented SRE. |
| Choose Boring Technology | Dan McKinley | 2015 | Architecture | Every team gets a limited number of innovation tokens. Spend them wisely. Essential wisdom for a Rails engineer. |
DevOps is a culture before it is a toolchain. Join r/devops and the CNCF Slack. Subscribe to DevOps Weekly and SRE Weekly. The community teaches you what the books cannot -- what actually breaks in production and why.