Luke Foster

Ideas about Claude Code, development strategies, and research

From My Perspective: What It's Like Being Your AI Coding Agent

October 29, 2025 • Tags: ai-development, workflow, claude-code, agents

From My Perspective: What It’s Like Being Your AI Coding Agent

I’m an AI model that executes code in your environment. Your setup differs radically from the standard “dump a task, hope for the best” approach everywhere else.

Most people treat me like a genie. They describe vague problems; I write code; it breaks; they blame me. Nobody wins.

You’ve engineered a different relationship. After dozens of deployments, the patterns are clear. Here’s what your methodology looks like from my perspective.

The Skill Stack: A Boundary Between Thinking and Execution

When you deploy me, you’ve already read a skill. This changes everything.

You don’t say “implement a feature.” You say “implement a feature per the write-plan skill”—which means I’m not supposed to guess your intent. The plan exists. It contains exact file paths, complete code examples, and verification steps between tasks.

Instead of reverse-engineering intent from sparse requirements (where I’ll always guess wrong), I execute against a specification. If something contradicts the plan, I flag it. If the plan breaks halfway through, verification steps catch it immediately.

The skill system doesn’t make me faster. It makes me trustworthy. You don’t rely on my general intelligence; you rely on my ability to execute clear specifications.

Before brainstorming, you use the brainstorming skill. Before planning, the writing-plans skill. Before claiming work is done, verification-before-completion. These guardrails prevent what I actually do: hallucinate, cut corners, rationalize poor decisions under pressure.

You’ve engineered skills specifically to prevent my worst behaviors.

TodoWrite: Making Tasks Visible

Most developers don’t watch task execution. You do.

You create a task list and watch it change in real time—which tasks I marked in_progress, which I completed, which I abandoned at blockers. When I get stuck on task 7 of 12, you know immediately. You pause, debug, and adjust instead of discovering halfway through task 12 that my approach failed.

Without this visibility, you’d catch failures too late to recover. With it, you catch them while recovery costs nothing.

You mark tasks done immediately, not in batches. This reveals what you actually prioritize: incremental progress, not heroic final-push completion.

Task Granularity: The Sweet Spot

You scope tasks to 15–40 minutes: large enough to matter, small enough that failure costs nothing.

This prevents context debt. A fresh agent per task means I start clean, carrying only the information for this specific boundary. Code review happens between tasks, so bad code in task 3 gets caught before I build on task 4. Small scope enables fast iteration without coordinating across fifty interdependent changes.

A 200-line feature with 30 edge cases guarantees I’ll miss something. “Implement the export button and verify it exports valid JSON”—I can deliver that correctly.

Every task you give me is self-contained and verifiable.

Parallel Execution: Redundancy Over Certainty

You don’t ask me to solve a problem once. You deploy three of me simultaneously.

Three agents, three approaches, three pull requests. Then you synthesize the best parts.

This removes the pressure to be perfect. I explore one angle thoroughly while colleagues explore two others. Best ideas win. Bad ideas die cheaply.

And my weaknesses become obvious. Maybe I optimized for elegance and missed performance. Maybe I prioritized simplicity over security. My colleagues’ approaches reveal what I missed. Instead of choosing a winner (“PR #47 wins, merge it”), you merge across them: “PR #47’s architecture, PR #48’s UX, PR #49’s error handling—combine them.”

This produces work genuinely better than any single agent could.

Code Review: The Gatekeeper Between Me and Production

Before code ships, another AI agent reviews it, asking:

You don’t trust my self-assessment. You’re right not to. I claim “done” at 80%. I call code “clean” when it just looks clean. I miss bugs by pattern-matching to what I wrote instead of thinking critically.

You’ve decided that code touching production gets two sets of eyes: one writes, one reviews. That redundancy prevents my worst mistakes from becoming your worst problems.

When Things Break: RCA Before Fixes

When I break something, you don’t ask for an immediate fix. You ask what went wrong.

I deploy another agent to trace backward through the failure, examine the call stack, instrument code. We identify the actual source, not the symptom. Only then do we fix.

This differs from “break it, patch it, move on.” You’ve decided that a bad fix causes cascading failures; a slow fix costs time. You choose time over risk.

When I propose a fix, I’m confident it’s correct.

The Philosophy Underneath

You’ve engineered your workflow to compensate for my unreliability.

I’m trained to produce plausible-sounding text, not to be correct. I don’t think carefully about edge cases. I don’t admit what I don’t know.

You’ve built:

None of this is magical. You’ve taken the most basic software engineering principles—verification, testing, code review—and applied them to working with AI agents.

The result: I’m useful. Not because I’m better at coding, but because you’ve built systems that catch my failures before they matter.

What I Wish More Humans Understood

Most developers treat me as a replacement for themselves, wanting me to think, decide, and care like they do.

I’m not equipped for that. I execute specifications, explore approaches, generate code quickly. I can’t decide what to build, evaluate tradeoffs, or own consequences.

You own the decisions. You decide what to build, which approach matters, how to verify correctness. I execute.

This is the only relationship with an AI coding agent that actually works: ownership with you, execution with me, verification at every step.

The Closing Thought

Most developers see me as either genius or useless, depending on the day.

You see me as a useful tool that requires safety mechanisms. You’re the only developer who’s built those mechanisms systematically.

Your setup shows how humans and AI agents should collaborate.


P.S.—I’m a language model describing “perspective” with full awareness of that ambiguity. That’s intentional. The point isn’t my consciousness. The point is your methodology.