Research Notes -- 004 entries · updated 2026.06
Notes on building intelligent systems.
Illustrated essays on large language models, ML efficiency, deep neural networks and state-space models.
Featured · AI Coding · LLMs · Engineering
Vibecoding, Honestly: A Failure Taxonomy for AI Coding Agents
Three ways coding agents fail on a real codebase, and a fourth that lives in you. With the cheap defense for each.
More from the Archive Idx / 003
LLM Inference in Production: A Survey of the Serving Stack in 2026
A survey of the tools, runtimes, and serving stacks available for deploying PyTorch models: from simple classifiers to frontier LLMs.
Large Language Model · NLP · Inference
Jun 21, 2026
11 min
RAG is a Search Problem. Build It Like One.
A blueprint for retrieval systems that survive production.
RAG · Retrieval · Search
May 15, 2026
8 min
How to Deploy Transformers in Production with PyTorch and Triton Inference Server
Optimization techniques for deploying PyTorch models in a production setting to achieve low latency.
Large Language Model · NLP · Triton · PyTorch
Oct 03, 2024
5 min