February 2026 · 9 min read

Stop Prompting Harder, Start Shipping with AGENTS.md

AI agents can produce great diffs while drifting from your repo's rules. Here is how a short AGENTS.md entrypoint made my sessions consistent, with fewer surprise refactors and more verified checkpoints.

AI · Tooling · Workflow · Engineering

When I first started using AI agents on my CafePOS codebase, I had the same experience most people do: the agent was fast, helpful, and occasionally... chaotic.

Not malicious-chaotic. More like 'helpful teammate who reorganizes the pantry when you asked them to grab the salt.'

The problem wasn't capability. The problem was consistency. Every new chat resets context, and 'do the right thing' is too vague when the repo has real rules: Clean Architecture boundaries, offline-first behavior, logging standards, and a strict verification bar.

So I made a change that sounds boring but ended up being one of the highest-leverage improvements to my workflow: I added an AGENTS.md.

This post explains what I put in it, what I deliberately kept out, and how it changed the way I collaborate with agents.

The Failure Mode: Great Diffs, Wrong Direction

In early sessions, I'd see patterns like:

  • The agent made a change that looked correct but skipped a verification run.
  • It fixed one issue by refactoring unrelated code.
  • It introduced logging that was useful but not standardized (or risked leaking data).
  • It made architectural shortcuts that would be unacceptable in a long-lived codebase.
  • It left a pile of good-but-uncommitted work, making it hard to checkpoint progress.

None of these are 'AI problems.' They're coordination problems.

Agents don't have persistent memory of my repo's culture. If I want consistent outcomes, I need consistent, repo-local instructions.

The Principle: One Entry Point, One Constitution

I decided to treat working with agents like onboarding a new engineer:

  • Give them a short entry point they can't miss.
  • Point them to the canonical rules.
  • Make the definition of 'done' a command, not a feeling.

In my repo, I already had a strong canonical doc:

  • CafePOS/docs/DEVELOPMENT_GUIDELINES.md

It explicitly states it's the single source of truth and includes the non-negotiables: verification commands, architecture rules, TDD expectations, docs policy, logging requirements, offline-first constraints, and review enforcement.

So my goal for AGENTS.md wasn't to rewrite that document. It was to ensure an agent always sees the highest-impact constraints immediately.

That meant:

  • AGENTS.md is short.
  • It points to the canonical guidelines.
  • It includes only the rules most likely to prevent expensive mistakes.

What I Put in AGENTS.md (And Why)

Here's what I ended up standardizing in c:\cafe\AGENTS.md.

1) A Single Source of Truth Rule

I explicitly list:

  • The canonical doc: CafePOS/docs/DEVELOPMENT_GUIDELINES.md
  • Supporting doc: CafePOS/CONTRIBUTING.md
  • A precedence rule: if there's conflict, the canonical doc wins

This removes ambiguity and prevents 'two docs drifting apart' from becoming a maintenance nightmare.

2) Working Directory Clarity

My repo has a root (c:\cafe) and the actual app under CafePOS/.

Agents often run commands from the wrong directory. I made that impossible to miss:

  • All app code and npm scripts live under CafePOS/.

3) Package Manager Policy

This is one of those 'small rule, huge payoff' additions.

Lockfile churn kills momentum. So I pinned:

  • Use npm (the repo has package-lock.json).
  • Don't switch to yarn/pnpm.
  • Avoid lockfile edits unless required.
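
A policy like this can also be checked mechanically. The snippet below is a minimal sketch, not something from my repo: `findForeignLockfiles` is a hypothetical helper, and the lockfile names are simply the standard ones that yarn and pnpm write. It takes a list of file names and reports any lockfile that does not belong to npm.

```typescript
// Lockfile names written by package managers other than npm.
// This mapping is general knowledge about those tools, not taken from the repo.
const foreignLockfiles: { [fileName: string]: string } = {
  "yarn.lock": "yarn",
  "pnpm-lock.yaml": "pnpm",
};

// Hypothetical helper: returns one message per foreign lockfile found.
function findForeignLockfiles(fileNames: string[]): string[] {
  const problems: string[] = [];
  for (const fileName of fileNames) {
    // hasOwnProperty avoids false positives on inherited keys like "toString".
    if (Object.prototype.hasOwnProperty.call(foreignLockfiles, fileName)) {
      const manager = foreignLockfiles[fileName];
      problems.push(fileName + " belongs to " + manager + "; this repo uses npm");
    }
  }
  return problems;
}

// Example: a directory listing that contains a stray yarn.lock.
const lockfileProblems = findForeignLockfiles([
  "package.json",
  "package-lock.json",
  "yarn.lock",
]);
console.log(lockfileProblems);
```

An empty result means only npm's lockfile is present.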

4) Definition of Done With Exact Commands

This is the heart of agent reliability.

I put the verification commands in AGENTS.md verbatim, because I want them run reflexively:

  • npm run lint
  • npm run test
  • npm run typecheck
  • npm run verify

And I keep the bar strict: zero errors and zero warnings.
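
To make "done is a command" concrete, here is a rough sketch of the gate logic: run the four commands in order and stop at the first failure. The names `runGate`, `GateCheck`, and `CommandExecutor` are illustrative, not part of my repo. The executor is injected so the logic can be tried without spawning processes; a real script would pass one that runs the command from CafePOS/ and returns its exit code.

```typescript
// A check is a label plus the command line to run from CafePOS/.
type GateCheck = { label: string; command: string };

// The four commands listed above, in the order given.
const gateChecks: GateCheck[] = [
  { label: "lint", command: "npm run lint" },
  { label: "test", command: "npm run test" },
  { label: "typecheck", command: "npm run typecheck" },
  { label: "verify", command: "npm run verify" },
];

// An executor runs a command and returns its exit code (0 means success).
type CommandExecutor = (command: string) => number;

type GateResult = { passed: string[]; failed: string | null };

// Runs checks in order and stops at the first non-zero exit code.
function runGate(checkList: GateCheck[], exec: CommandExecutor): GateResult {
  const passed: string[] = [];
  for (const check of checkList) {
    if (exec(check.command) !== 0) {
      return { passed: passed, failed: check.label };
    }
    passed.push(check.label);
  }
  return { passed: passed, failed: null };
}

// Demo with a fake executor that pretends the typecheck step fails.
const gateDemo = runGate(gateChecks, (command) =>
  command === "npm run typecheck" ? 1 : 0
);
console.log(gateDemo); // passed: lint, test; failed: typecheck
```

Exit codes only enforce the zero-warnings bar if each script is configured to fail on warnings; that configuration is not shown here.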

5) Commit Discipline After Verification

If something passes verification, it should become a stable checkpoint. Otherwise you end up stacking changes on top of uncommitted work and losing bisectability.

So I added a rule in both places:

  • In the canonical guidelines (CafePOS/docs/DEVELOPMENT_GUIDELINES.md).
  • In the agent entrypoint (AGENTS.md).

Rule:

  • After a successful npm run verify (or at minimum npm run lint + npm run test) for a coherent unit of work, create a git commit.

That one line prevents a surprising amount of mess.
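
The rule reads as a small decision function. The sketch below is one interpretation of it, with a hypothetical `readyToCommit` helper: a clean verify run is enough, lint plus test is the fallback when verify was not run, and (my assumption, the rule does not spell this out) a verify run that failed blocks the commit even if lint and test passed.

```typescript
// Results of the checks that were actually run; a missing key means "not run".
type CheckOutcome = {
  lint?: boolean;
  test?: boolean;
  typecheck?: boolean;
  verify?: boolean;
};

// Hypothetical helper: decides whether the current unit of work may be committed.
function readyToCommit(outcome: CheckOutcome): boolean {
  // Assumption: a verify run that failed always blocks the commit.
  if (outcome.verify === false) return false;
  // A clean verify run is sufficient on its own.
  if (outcome.verify === true) return true;
  // Verify was not run: fall back to the minimum bar of lint + test.
  return outcome.lint === true && outcome.test === true;
}

console.log(readyToCommit({ verify: true })); // true
console.log(readyToCommit({ lint: true, test: true })); // true
console.log(readyToCommit({ lint: true, test: false })); // false
console.log(readyToCommit({ lint: true, test: true, verify: false })); // false
```

Whether the work is a "coherent unit" stays a judgment call; the function only covers the verification half of the rule.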

6) Short-Form Architecture Guardrails

I didn't copy my full architecture doctrine into AGENTS.md. I included only the boundaries I never want crossed:

  • Domain: pure logic only (no UI / SQLite / infra imports).
  • Infrastructure: implements domain interfaces.
  • Application: orchestrates use cases.
  • UI: stays thin (presentation concerns only).

The goal is to stop the worst violations early, then defer details to the canonical doc.
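
These boundaries can be expressed as a dependency table. What follows is a sketch under stated assumptions, not my actual tooling: it assumes each layer lives in a folder named after it (for example `src/domain/`), and the `allowedImports` table is my reading of the four bullets in the usual Clean Architecture direction. Files outside the four layers are not judged.

```typescript
type Layer = "domain" | "application" | "infrastructure" | "ui";

const knownLayers: Layer[] = ["domain", "application", "infrastructure", "ui"];

// Assumed dependency table: which layers each layer may import from.
// Only the first row (domain imports nothing else) is stated outright above.
const allowedImports: { [K in Layer]: Layer[] } = {
  domain: ["domain"],
  application: ["application", "domain"],
  infrastructure: ["infrastructure", "application", "domain"],
  ui: ["ui", "application", "domain"],
};

// Finds the first path segment that names a layer (assumed folder layout).
function layerOf(filePath: string): Layer | null {
  for (const segment of filePath.split("/")) {
    for (const layer of knownLayers) {
      if (layer === segment) return layer;
    }
  }
  return null;
}

// Returns false only when both files sit in known layers and the table forbids it.
function isImportAllowed(fromFile: string, toFile: string): boolean {
  const fromLayer = layerOf(fromFile);
  const toLayer = layerOf(toFile);
  if (fromLayer === null || toLayer === null) return true;
  return allowedImports[fromLayer].indexOf(toLayer) !== -1;
}

// Hypothetical file names, for illustration only.
console.log(isImportAllowed("src/domain/order.ts", "src/infrastructure/sqliteOrderRepo.ts")); // false
console.log(isImportAllowed("src/infrastructure/sqliteOrderRepo.ts", "src/domain/order.ts")); // true
```

A table like this is easy to keep in sync with the short-form rules, because each bullet maps to one row.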

7) Tests and Docs Expectations

Agents can 'finish' code without finishing the feature. I made completion explicit:

  • TDD where feasible (RED -> GREEN -> REFACTOR).
  • Bug fixes require a regression test.
  • Feature docs live at CafePOS/docs/features/<feature-name>.md and must be updated with changes.

8) Logging Requirements Plus a No-Secrets Rule

I want logs, but I want them standardized and safe.

So AGENTS.md says:

  • Use the centralized logger at CafePOS/src/shared/logger.ts.
  • Use the standardized log format.
  • No ad-hoc console.log in production code.
  • Never log secrets/sensitive data (tokens, passwords, card data, full PII). Redact instead.

That last line is non-negotiable in any professional codebase, and I want agents to treat it that way too.
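
"Redact instead" can look something like the sketch below. It is not the code in CafePOS/src/shared/logger.ts; `redactFields` and the `sensitiveKeys` list are illustrative assumptions. The idea is to copy the fields that are about to be logged and replace the value of any sensitive key, including inside nested objects.

```typescript
// Key names treated as sensitive. Illustrative list, not the repo's actual one.
const sensitiveKeys = ["token", "password", "cardNumber", "cvv", "email"];

type LogFields = { [key: string]: unknown };

// Returns a copy with sensitive values replaced. Nested plain objects are
// handled recursively; arrays are passed through unchanged in this sketch.
function redactFields(fields: LogFields): LogFields {
  const safe: LogFields = {};
  for (const key of Object.keys(fields)) {
    const value = fields[key];
    if (sensitiveKeys.indexOf(key) !== -1) {
      safe[key] = "[REDACTED]";
    } else if (typeof value === "object" && value !== null && !Array.isArray(value)) {
      safe[key] = redactFields(value as LogFields);
    } else {
      safe[key] = value;
    }
  }
  return safe;
}

const redactedEntry = redactFields({
  orderId: "A-17",
  password: "hunter2",
  customer: { email: "a@example.com", tier: "gold" },
});
console.log(JSON.stringify(redactedEntry));
// {"orderId":"A-17","password":"[REDACTED]","customer":{"email":"[REDACTED]","tier":"gold"}}
```

Matching on exact key names is deliberately simple. A real logger would likely need a broader match and would apply it in one place, so that callers cannot forget it.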

9) A Small Do / Don't List

This is my 'keep it sane' section:

  • Do keep changes scoped to the request (avoid drive-by refactors).
  • Do follow existing patterns in the touched feature area.
  • Don't add dependencies without a clear justification.
  • Don't weaken type-safety or tests to make builds pass.

It's not a full policy manual. It's a vibe check with teeth.

What I Deliberately Kept Out

The temptation is to pack AGENTS.md with everything. I avoided that on purpose.

I leave out:

  • Long tutorials.
  • Deep architecture write-ups.
  • Duplicated guidelines text.
  • Rules I can't or won't enforce.

If AGENTS.md becomes a second constitution, it will drift. I'd rather have one strong constitution and one short agent entrypoint.

The Exact 'New Chat' Instruction I Use Now

This is what I paste at the start of any new agent chat:

Follow AGENTS.md and treat CafePOS/docs/DEVELOPMENT_GUIDELINES.md as canonical.
Work from CafePOS/ and use npm.
Don't call it done until npm run lint, npm run test, npm run typecheck, and npm run verify are clean.
After successful verify (or at least lint+test) for a coherent unit, commit.
No secrets in logs.

That's enough to keep the agent aligned without turning the conversation into policy negotiation.

What Changed for Me

The impact wasn't 'the agent got smarter.'

The impact was:

  • I stopped re-explaining standards every session.
  • I got fewer surprise refactors.
  • I got more 'verified checkpoints' instead of long uncommitted streaks.
  • Review became easier because the definition of done was consistent.

In other words: I didn't reduce creativity, I reduced entropy.

If You Want to Copy This Approach

My recommendation:

  • Make AGENTS.md short and high-signal.
  • Point to one canonical guideline doc.
  • Include exact commands for verification.
  • Add at least one rule that protects your git history (commit-after-verify works well).
  • Add at least one rule that protects your users (no secrets in logs).
  • Keep the rest in the canonical doc.

That's the setup I'm using now, and it scales surprisingly well as the codebase grows.
