Archive
Writing
Tutorials, teardowns and field notes across software, firmware and the machines between them.
2026
7 posts
AI
Consumer GPUs are bandwidth-bound, not precision-bound. Here’s the exact VRAM-to-quant lookup table that maximizes tokens/sec without crossing the perceptible quality threshold.
Jun 7, 2026 · 7 min read
AI
Stop comparing local LLM engines on tokens per second. Pick the one that matches your actual bottleneck: setup friction, KV-cache scheduling, or memory bandwidth.
Jun 6, 2026 · 6 min read
AI
NVIDIA's RTX Spark packs 128GB of unified memory, but ~300 GB/s of bandwidth caps inference throughput. Here's the math on what you can actually run locally versus in the cloud.
Jun 6, 2026 · 6 min read
Firmware
A careful, repeatable workflow for replacing stock firmware — with a recovery path for every step.
May 28, 2026 · 7 min read
AI
Stop thinking of the context window as memory. Think of it as a desk — finite, and everything competes for space.
May 20, 2026 · 6 min read
Open Source
You do not need a SaaS subscription to understand your users. A small stack gets you most of the way.
May 11, 2026 · 8 min read
Frameworks
Automatic memoisation sounds like magic. Here is the unglamorous reality of what it does and does not do.
May 2, 2026 · 5 min read