Blog

Notes on building reliable AI systems, platform engineering, and software architecture — written from inside real systems, not slideware.

Two Coding Agents, One Project Brief

I run Claude Code and Codex over the same repo, and each wants its own instructions file. Maintaining two by hand means they drift, and a stale project brief is worse than none. The fix is boring, and it's a symlink.

June 16, 2026

claude-code, codex, ai-tools

How the Creator of Claude Code Actually Uses It

Boris Cherny built Claude Code, and the way he uses it isn't about clever settings. It's about running it like a small team instead of a chat window. The handful of habits behind that, and the ones I've adopted.

June 16, 2026

claude-code, ai-tools, productivity

Your AI Tools Start From Zero Every Time. They Don't Have To.

AI coding tools forget everything between sessions, so they repeat mistakes and relearn your conventions over and over. Here's the learnings loop that fixes it, and exactly how I wired it into Claude Code on this blog.

June 15, 2026

ai-tools, claude-code, knowledge-base

Putting AI in Front of a Platform: Lessons from Real Systems

What I've learned putting LLMs and agents on top of a Salesforce platform, where the data has rules you don't get to ignore and a wrong answer lands in the system the business runs on.

June 14, 2026

platform-engineering, enterprise, AI

Designing Reliable AI Agents on Top of Enterprise Platforms

An agent that can change records on a system of record is powerful and risky in equal measure. The guardrails I rely on — acting as the user, idempotent actions, a narrow toolset, human checkpoints, and real logging — to let one run safely.

June 13, 2026

agents, enterprise, platform-engineering

Building Reliable LLM Features: What Production Actually Demands

An LLM feature is easy to demo and hard to trust. These are the practical habits — validating output, measuring quality, versioning prompts, and planning for wrong answers — that I rely on to make one hold up with real users.

June 12, 2026

LLMs, reliability, production

Non-Functional Requirements for AI Systems: What Staff Engineers Should Specify

Most teams spec what an AI feature should do and skip how well it has to do it. The non-functional requirements — accuracy, latency, cost, fallback, observability, governance — that decide whether it's actually production-ready.

June 11, 2026

AI, architecture, staff-engineer

How I Evaluate LLM Output Without a Ground-Truth Dataset

You almost never have labeled data when you ship an AI feature. A practical way to measure quality anyway — a small hand-built set, plain assertions, a checked model-as-judge, and the production signals you already have.

June 1, 2026

LLMs, evals, testing
