Fine-Tuning vs RAG vs Long-Context: A Decision Framework With Numbers
We've shipped all three patterns to production. They're not interchangeable. Here's the framework we now use to decide which approach fits a given task.
Three layers of pooling, three different jobs. We learned the hard way which to use when. Real numbers from an 8k-connection workload.
We launched Backstage in October. Six months in, 80% of services are catalogued, onboarding takes a third of the time, and we mostly know who owns what.
We deployed the same edge function on both platforms and measured for a quarter. Where each wins, where each loses, and the surprises along the way.
We started using eBPF tooling for ad-hoc production debugging six months ago. Three real incidents where it cut investigation time from hours to minutes.
We invalidate ~6% of LLM outputs before they reach a downstream system. Here's how we structure prompts and validators to catch malformed responses early.
A two-line config change to an Argo Rollouts analysis template caught a regression that would have cost ~$40k in API spend before we noticed. Here's the pattern.
We ran Pulumi in TypeScript and Terraform in HCL side by side across 60+ services. Each won different categories of work. Here's the breakdown.
We deleted every static GCP service account key in our org over six weeks. Here's the migration plan, the gotchas, and the policies we now enforce.
Three production OOM incidents that taught us how kubelet, containerd, and the kernel actually decide which process dies. With debugging commands you'll wish you had earlier.
Bills hit $3,400/mo for runner minutes. We moved to self-hosted on EKS spot. The savings were real; the surprises were too.
We ran the same RAG workload across three vector stores for a quarter each. Here's what we learned about latency, cost, and operational overhead.
Every hook on this list caught a bug or a security issue in the last twelve months. The configs are short. The savings have been considerable.