A real story of removing console-only changes, adding drift detection, and getting Terraform back in charge.
Our worst incident of last year started with a simple question: “Why is there an EC2 instance we can't find in Terraform?”
```bash terraform plan -detailed-exitcode || echo "Drift detected" ```
Drift still happens, but on-call no longer learns about it at the worst possible moment.
Get the latest tutorials, guides, and insights on AI, DevOps, Cloud, and Infrastructure delivered directly to your inbox.
SLO-Based Monitoring for APIs. Practical guidance for reliable, scalable platform operations.
A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.
Explore more articles in this category
A real-world Terraform module version pinning guide for platform teams that want safer upgrades, clearer ownership, and fewer broken pipelines after shared module releases.
A practical Terraform state isolation guide built from a real environment-mixing incident, with patterns for safer backends, clearer ownership, and lower blast radius.
This infrastructure documentation as code guide shows how a platform team moved runbooks, ownership maps, and architecture decisions into versioned workflows that people actually trusted.