devops/ness

Blog

Practical articles on AI, DevOps, Cloud, Linux, and infrastructure engineering.

Real-World RAG Incidents: Lessons from a Production Rollout
3 days ago

A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.

Kiril Urbonas

How We Stopped Terraform Drift from Surprising On-Call
4 days ago

A real story of removing console-only changes, adding drift detection, and getting Terraform back in charge.

Kiril Urbonas

Systemd Tricks We Use to Keep Services Boring
5 days ago

Concrete systemd unit patterns that reduced flakiness: restart policies, resource limits, and structured logs.

Kiril Urbonas

A Pragmatic Multi-Region Strategy for Small Teams
6 days ago

How a small team moved from single-region risk to a simple active/passive multi-region setup without doubling complexity.

Kiril Urbonas

What We Learned Running Weekly Game Days on Our CI/CD Pipeline
last week

Practical game day scenarios for CI/CD: broken rollbacks, permission issues, and slow feedback loops—and how we fixed them.

Kiril Urbonas

Ansible and Infrastructure as Code: Idempotency and Best Practices
last week

Write Ansible playbooks that are idempotent, readable, and maintainable for config management.

Kiril Urbonas

Real-World RAG Incidents: Lessons from a Production Rollout
last week

A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.

Kiril Urbonas

How We Stopped Terraform Drift from Surprising On-Call
last week

A real story of removing console-only changes, adding drift detection, and getting Terraform back in charge.

Kiril Urbonas

Systemd Tricks We Use to Keep Services Boring
last week

Concrete systemd unit patterns that reduced flakiness: restart policies, resource limits, and structured logs.

Kiril Urbonas

A Pragmatic Multi-Region Strategy for Small Teams
last week

How a small team moved from single-region risk to a simple active/passive multi-region setup without doubling complexity.

Kiril Urbonas

End-of-Week Engineering: Why Smart Tech Teams Don’t Ship Major Changes on Friday
last week

A practical risk-management framework for release timing, Friday deployment policies, progressive delivery, and how elite teams protect reliability and people.

Kiril Urbonas

Kubernetes Cost Optimization for Teams: FinOps Tactics That Actually Work
2 weeks ago

Cut Kubernetes spend without hurting reliability using a practical FinOps playbook for rightsizing, autoscaling guardrails, showback, and weekly waste cleanup.

Kiril Urbonas

Page 1 of 44 · 519 posts