Blog

Practical articles on AI, DevOps, Cloud, Linux, and infrastructure engineering.

Tag: #monitoring
How We Stopped Terraform Drift from Surprising On-Call
11 months ago

A real story of removing console-only changes, adding drift detection, and getting Terraform back in charge.

Kiril Urbonas
Best Practices: Kubernetes Cluster Upgrade Strategy
11 months ago

Practical guidance on Kubernetes cluster upgrade strategy for reliable, scalable platform operations.

Kiril Urbonas
A Pragmatic Multi-Region Strategy for Small Teams
11 months ago

How a small team moved from single-region risk to a simple active/passive multi-region setup without doubling complexity.

Kiril Urbonas
Troubleshooting: AI Inference Cost Optimization
11 months ago

Practical guidance on AI inference cost optimization for reliable, scalable platform operations.

Kiril Urbonas
Troubleshooting: SLO-Based Monitoring for APIs
11 months ago

Practical guidance on SLO-based monitoring for APIs for reliable, scalable platform operations.

Kiril Urbonas
Troubleshooting: Secure Container Supply Chain Controls
11 months ago

Practical guidance on secure container supply chain controls for reliable, scalable platform operations.

Kiril Urbonas
Troubleshooting: Incident Response for Platform Teams
April 14, 2025

Practical guidance on incident response for platform teams, aimed at reliable, scalable platform operations.

Kiril Urbonas
Page 15 of 25 · 291 posts