Blog

Practical articles on AI, DevOps, Cloud, Linux, and infrastructure engineering.

Kubernetes Cost Optimization for Teams: FinOps Tactics That Actually Work

Cut Kubernetes spend without hurting reliability using a practical FinOps playbook for rightsizing, autoscaling guardrails, showback, and weekly waste cleanup.

Kiril Urbonas

last month

SRE Error Budgets in Practice: Shipping Fast Without Burning Reliability

A practical way to define SLOs and error budgets, connect them to release decisions, and avoid reliability debates without data.

Kiril Urbonas

last month

Platform Engineering with Backstage: Build a Useful Developer Portal

How to implement Backstage with real templates, scorecards, and golden paths so internal platform work reduces delivery friction.

Kiril Urbonas

last month

GitHub Actions for Monorepos: Fast CI Without Pipeline Chaos

A practical pattern for monorepo CI with path filters, matrix builds, caching, and deployment guards that keep feedback fast as teams scale.

Kiril Urbonas

last month

Azure DevOps Best Practices in 2026: Build Pipelines You Can Trust

A production-focused guide to Azure DevOps: standardized YAML templates, secure service connections, rollout safety, and measurable delivery reliability.

Kiril Urbonas

last month

AI Best Practices in 2026: Shipping Reliable Systems, Not Demo Magic

A practical production playbook for AI systems: evaluation gates, guardrails, observability, cost control, and reliable release management.

Kiril Urbonas

last month

AI Best Practices for Engineering Teams: From Prompt Experiments to Platform Discipline

A practical field manual for engineering teams who want AI features that survive real users, incidents, and budgets — not just demo day.

Kiril Urbonas

last month

Operational Checklist: AI Inference Cost Optimization

AI Inference Cost Optimization. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

last month

What We Learned Running Weekly Game Days on Our CI/CD Pipeline

Practical game day scenarios for CI/CD: broken rollbacks, permission issues, and slow feedback loops—and how we fixed them.

Kiril Urbonas

last month

Real-World RAG Incidents: Lessons from a Production Rollout

A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.

Kiril Urbonas

last month

How We Stopped Terraform Drift from Surprising On-Call

A real story of removing console-only changes, adding drift detection, and getting Terraform back in charge.

Kiril Urbonas

last month

Operational Checklist: SLO-Based Monitoring for APIs

SLO-Based Monitoring for APIs. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas
