Writing

Notes from the workbench

Problems I hit, and how I solved them. Mostly multi-agent systems, model verification, and quantitative evaluation. Written so the next person doesn't lose the week I lost.

Trust via Harness: getting useful work out of models that lie

Local and low-cost models fabricate at the reasoning layer. The fix isn't a better model — it's a harness the model can't talk its way past: an isolated sandbox, a deterministic honesty gate, and a real-model review.

multi-agentllmengineeringverification