Engineering
How Kernel is built. Practical notes on RAG, retrieval, and running AI in production.
12 min read
How we cut LLM costs 87% by routing only the generation step to Claude
A practical case for per-stage model selection in production RAG.
10 min read
Why we ship Self-RAG retries — and what happened when we didn't
A measurement bug that lived in our streaming pipeline for months, what users actually saw, and how we fixed it without sacrificing latency.