AI Workflow Discipline
When AI goes wrong in a project, the tool is usually fine. The system around it isn't. Workflow discipline is the set of habits and structures that keep AI working reliably across long projects and complex codebases — and it's where most teams fail to invest.
Instruction Files: The Highest-Leverage Investment
CLAUDE.md (or your tool's equivalent) is loaded into every session in the repo. One well-written instruction file shapes hundreds of future AI interactions. The time you spend writing it once is time you stop spending on context-setting in every session.
Three tiers: CLAUDE.md for identity and invariants (the rules that never change), ai/instructions/ for repeatable procedures (workflows that have proven themselves), and slash commands for one-step shortcuts (the prompts you've typed a hundred times). Don't automate before you've codified, and don't codify before you've seen the pattern work at least once in the wild.
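To make the tiers concrete, here is a minimal Python sketch that scaffolds them in a repo. It assumes a Claude Code-style layout where slash commands live under .claude/commands/; the file names and starter text are illustrative placeholders, not a recommended instruction set.

```python
from pathlib import Path

# Illustrative three-tier scaffold. CLAUDE.md and ai/instructions/ come from
# the article; .claude/commands/ is an assumed location for slash commands.
TIERS = {
    "CLAUDE.md": (
        "# Project instructions\n\n"
        "## Invariants (rules that never change)\n"
        "- Prefer the minimal solution that passes the tests.\n"
        "- Do not add abstractions that were not explicitly asked for.\n"
    ),
    "ai/instructions/release-checklist.md": (
        "# Release checklist\n\n"
        "A repeatable procedure, promoted here after it has worked in the wild.\n"
    ),
    ".claude/commands/review.md": (
        "Review the staged diff against CLAUDE.md. Report issues; do not write code.\n"
    ),
}

def scaffold(repo_root: str = ".") -> None:
    """Create any missing tier files with placeholder content."""
    root = Path(repo_root)
    for rel_path, content in TIERS.items():
        target = root / rel_path
        if target.exists():
            continue  # never overwrite instructions you have already codified
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)

if __name__ == "__main__":
    scaffold()
```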
Multi-Session Discipline
Plan in one session. Build in a fresh session. Review in a third. This is the defense against context pollution — the gradual accumulation of wrong turns, half-patched attempts, and stale assumptions that makes a long session progressively worse.
Fresh session beats better prompt, nearly every time. When you're stuck, the default move is to rephrase. The correct move is almost always to open a new session with the relevant roadmap as the opening context. You'll feel like you're losing progress. The progress was already polluted.
Context pollution is the number one project killer. The symptom is 'it was working an hour ago and now it isn't.' The cause is almost always the context, not the model.
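A minimal sketch of the fresh-session move, assuming the running state of the project lives in a context.md file; the framing text and task string are illustrative, not a prescribed prompt.

```python
from pathlib import Path

def fresh_session_prompt(context_file: str = "context.md", task: str = "") -> str:
    """Build the opening prompt for a new session from a roadmap/context file."""
    context = Path(context_file).read_text()
    return (
        "You are starting with a clean slate. Current state of the work:\n\n"
        f"{context}\n\n"
        f"Next task: {task}\n"
        "Ignore any approach not recorded above; it did not survive review."
    )

if __name__ == "__main__":
    # Paste the result into a brand-new session instead of continuing the old one.
    print(fresh_session_prompt(task="Implement the retry logic described in the roadmap."))
```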
The Law of Witnesses
One agent's confidence is noise. Two independent agents converging on the same answer is signal.
Match witness count to risk: one agent for routine work, two for high-stakes changes, three or more for anything truly consequential. The agent that wrote the code has motivated reasoning: it saw the context that led to those decisions and is optimizing to defend them. A fresh agent reviewing the same work has no such bias.
Sub-agents cost 3-5x normal tokens when controlled, 5-10x when unchecked. Budget for this. Use sub-agents for verification and high-stakes implementation — not routine edits. Fewer specialized agents consistently outperform many generic ones.
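A back-of-the-envelope sketch that combines the two rules above, witness count by risk and token cost by multiplier. The risk tiers and multipliers come from the text; the baseline figure in the example is an assumption for illustration.

```python
# Witness counts per risk tier, per the rule above.
WITNESSES_BY_RISK = {"routine": 1, "high_stakes": 2, "consequential": 3}

def plan_review(risk: str, baseline_tokens: int, controlled: bool = True) -> dict:
    """Estimate witnesses and the total token range for a task."""
    witnesses = WITNESSES_BY_RISK.get(risk, 1)
    low, high = (3, 5) if controlled else (5, 10)
    if witnesses == 1:
        low, high = 1, 1  # routine work: single agent, no sub-agent overhead
    return {
        "witnesses": witnesses,
        "token_estimate": (baseline_tokens * low, baseline_tokens * high),
    }

# Example: a high-stakes change whose single-agent run would cost ~40k tokens.
print(plan_review("high_stakes", baseline_tokens=40_000))
# {'witnesses': 2, 'token_estimate': (120000, 200000)}
```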
The Escalation Ladder
Level 1: Re-prompt with more context. If this doesn't work once, it won't work twice.
Level 2: Fresh session with context.md as the opening prompt. This is the big one — it fixes more stuck situations than any other move.
Level 3: Sub-agent review. Prompt: 'Here is what we tried, here is what failed — analyze the root cause, do not write code.' A second, independent view breaks the loop that same-agent retries cannot.
Level 4: Manual intervention. At some point, you close the laptop on the AI, read the code yourself, simplify the problem, and hand the simpler version back.
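A minimal sketch of the ladder as a decision rule, assuming you escalate one level per failed attempt rather than retrying; the Level 3 prompt is adapted from above, and the rest of the wording is illustrative.

```python
# The four levels, in order. Escalate after each failure; do not repeat a level.
LADDER = [
    "Re-prompt with more context (try this once, not twice).",
    "Open a fresh session with context.md as the opening prompt.",
    "Sub-agent review: 'Here is what we tried, here is what failed; "
    "analyze the root cause, do not write code.'",
    "Manual intervention: read the code yourself, simplify the problem, hand it back.",
]

def next_move(failed_attempts: int) -> str:
    """Return the step to take after this many failed attempts."""
    level = min(failed_attempts, len(LADDER) - 1)
    return f"Level {level + 1}: {LADDER[level]}"

for attempts in range(4):
    print(next_move(attempts))
```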
When to Put the AI Down
Five scenarios where the cost of AI speed outweighs the benefit: deep learning (do the work yourself; the point is to build your own mental model, not to get the answer), novel problems (the training data hasn't seen them), security-critical code (auth and payments deserve your full attention), stuck loops past Level 3 of the ladder, and team alignment moments where the conversation needs to happen between people.
Knowing when to close the laptop is part of the skill. Over-engineering is a behavioral default of AI systems; a one-line instruction in CLAUDE.md ('prefer the minimal solution that passes the tests; do not add abstractions not explicitly asked for') shifts that default for every future session.