Blog

Practical articles on AI, DevOps, Cloud, Linux, and infrastructure engineering.

Real-World RAG Incidents: Lessons from a Production Rollout

A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.

Kiril Urbonas

9 months ago

Best Practices: Python Worker Queue Scaling Patterns

Python Worker Queue Scaling Patterns. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

9 months ago

Best Practices: Model Serving Observability Stack

Model Serving Observability Stack. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

9 months ago

Best Practices: RAG Retrieval Quality Evaluation

RAG Retrieval Quality Evaluation. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

9 months ago

Best Practices: Prompt Versioning and Regression Testing

Prompt Versioning and Regression Testing. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

9 months ago

Best Practices: LLM Gateway Design for Multi-Provider Inference

LLM Gateway Design for Multi-Provider Inference. Practical guidance for reliable, scalable platform operations.

Kiril Urbonas

10 months ago

Real-World RAG Incidents: Lessons from a Production Rollout

A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.

Kiril Urbonas

10 months ago

Real-World RAG Incidents: Lessons from a Production Rollout

A field report from rolling out retrieval-augmented generation in production, including cache bugs, bad embeddings, and how we fixed them.

Kiril Urbonas
