Writing about agent systems
Deep dives on building reliable tool calling: just-in-time auth, sandboxes, observability, and product decisions that make agents feel trustworthy.
A field guide to shipping dependable tool calling in production, from scoping and observability to rollback paths and user trust.
Why deferred authorization can make agent experiences feel smoother while still preserving clear approvals, short-lived access, and policy control.
Tool schemas shape behavior more than prompt wording does. Here is how to design interfaces that are narrow, legible, and resilient under real usage.
A practical logging model for tracing agent behavior end to end without turning your observability stack into a liability.
If your agent can execute code, isolation is not optional. Here is what a practical sandbox needs before it is ready for real users.
Prompts matter, but durable reliability comes from contracts, workflow design, and product choices users can actually feel.
Trust in agent products comes from previews, receipts, and reversible workflows—not from asking users to accept mysterious automation on faith.
A practical framework for measuring agent quality with metrics that correlate with user outcomes, workflow reliability, and operational cost.
A practical path from a promising demo to a dependable workflow, with better validation, timeouts, state handling, and user-visible recovery.
At integration scale, the hard part is not OAuth. It is lifecycle management, schema drift, support load, and the long tail of provider behavior.
A pragmatic guide to least privilege when an agent acts on a user’s behalf across tools, workflows, and shared business systems.
A tool registry is not just a catalog. It is the governance layer that keeps a growing tool surface understandable, owned, and safe to evolve.