DEV Community

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

When AI Writes Your Code, DevOps Becomes the Last Line of Defense

1
Comments
4 min read
Introduction to System Design: A Beginner’s Guide

Comments
4 min read
5 Networking Concepts That Explain Everything: From the Cloud to Kubernetes

Comments
6 min read
Embracing AIOps: The Intelligent Evolution of DevOps in December 2025

Comments
2 min read
From 400 Alerts/Night to 8: The SRE Playbook That Saved My Team’s Sanity

Comments
3 min read
AWS SRE's First Day with GCP: 7 Surprising Differences

Comments
6 min read
Lessons in Testing, Performance, and Legacy Systems from /dev/mtl 2025

Comments
7 min read
Utility Sector Outage Prep with Load Tests

Comments
8 min read
Rightsizing Kubernetes Requests with the In-Place Vertical Pod Autoscaler

2
Comments
3 min read
AWS Security Series: AWS Access Key is Compromised. Now What? An Incident Response Playbook.

Comments
3 min read
Bash Scripting for Non-Coders

Comments
37 min read
What is performance engineering: A Gatling take

Comments
8 min read
A practical guide to observability TCO and cost reduction

6
Comments
13 min read
The Lie of the Global Average: Why Taming Complex SLIs Requires Bucketing

Comments
6 min read
How AI-Powered Observability Actually Changes Life For CIOs

Comments
5 min read
Reverse Proxy in Docker with Nginx and Automatic SSL

Comments
7 min read
The Hidden Currency of Tech Leadership: The Resilience Loop

Comments
1 min read
Building an Air-gapped Hardened Kubernetes Cluster with Kubespray

Comments
3 min read
End-to-End DevSecOps Project (Movies Finder)

Comments
2 min read
AWS Multi-Account Guardrails: A Complete Blueprint for Secure, Automated Cloud Governance

Comments
9 min read
What Engineers Can Learn From the Cloudflare Outage (November 2025)

Comments
4 min read
EKS Standard vs. EKS Auto Mode: The Evolutionary Leap in Kubernetes Operations

8
Comments
6 min read
Rightsizing Kubernetes Requests with the In-Place Vertical Pod Autoscaler

6
Comments
3 min read
Vendor Tools & Reliability — Lessons from the 2025 Cloud Outages

Comments
3 min read
USRE: Unifying DevOps, SRE, Security & Compliance for the Next Generation of SaaS

Comments
7 min read