Blog

Practical articles on AI, DevOps, Cloud, Linux, and infrastructure engineering.

September 14, 2024

Production Playbook: Kubernetes Secrets and External Vault Integration

Kubernetes Secrets and External Vault Integration. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

September 11, 2024

Production Playbook: Python Worker Queue Scaling Patterns

Python Worker Queue Scaling Patterns. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

September 7, 2024

Production Playbook: Model Serving Observability Stack

Model Serving Observability Stack. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

September 3, 2024

Production Playbook: RAG Retrieval Quality Evaluation

RAG Retrieval Quality Evaluation. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

August 30, 2024

Production Playbook: Prompt Versioning and Regression Testing

Prompt Versioning and Regression Testing. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

August 26, 2024

Production Playbook: LLM Gateway Design for Multi-Provider Inference

LLM Gateway Design for Multi-Provider Inference. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

August 23, 2024

Production Playbook: Kernel and Package Patch Management

Kernel and Package Patch Management. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

August 19, 2024

Production Playbook: Systemd Service Reliability Patterns

Systemd Service Reliability Patterns. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

August 15, 2024

Production Playbook: Linux Performance Baseline Methodology

Linux Performance Baseline Methodology. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

August 10, 2024

Production Playbook: Cloud Disaster Recovery Runbook Design

Cloud Disaster Recovery Runbook Design. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

August 6, 2024

Production Playbook: AWS Cost Control with Tagging and Budgets

AWS Cost Control with Tagging and Budgets. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

August 3, 2024

Production Playbook: Ansible Role Design for Large Teams

Ansible Role Design for Large Teams. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

