J Williams
9 min read

Vibe Coding: When to Let an AI Agent Drive (and When to Take the Wheel)

The term Andrej Karpathy coined for prompting an agent and accepting its output without reading every line — useful in some contexts, dangerous in others

Vibe coding isn't a technique so much as a trust decision, made implicitly, often without noticing you've made it. The interesting question isn't whether it works — it obviously does, for plenty of tasks — but how you decide when the trade is worth making.

Vibe Coding

Andrej Karpathy’s original description of vibe coding was disarmingly honest: you describe what you want, accept whatever the agent produces, and don’t really look at the diffs. For throwaway scripts and weekend prototypes, that’s a perfectly reasonable way to work — the cost of being wrong is low and the cost of careful review is high relative to the stakes. The phrase took off because it named something a lot of people were already doing, and the discourse since has split cleanly into “this is the future of programming” and “this is how you ship a security incident.” Both camps are right about different tasks.

What actually matters is not picking a side, but recognising what changes structurally when you stop reading every line.

What Changes When You Stop Reading Every Line

Traditional code review checks correctness at the level of statements: does this loop terminate, does this condition cover the edge case, is this the right SQL query. When you vibe code, you’re shifting your checking to the level of outcomes: does the feature work when I try it, do the tests pass, does the app still start. That’s not laziness — it’s a legitimate, much older form of verification (it’s how you’d evaluate work from a contractor you couldn’t watch over their shoulder). But it has a specific blind spot: it only catches problems that show up in the outcomes you actually check. A change that passes every test you ran but introduces a subtle authorisation bug, leaks a secret into a log line, or quietly changes behaviour for an input you didn’t try, sails straight through outcome-level review undetected.
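To make that blind spot concrete, here is a minimal sketch. Everything in it is hypothetical and invented for illustration: the `apply_discount` function, the logger name and the token value. The outcome check passes, and the change still leaks a secret into a log line, which only a check written specifically for the logs would notice.

```python
import io
import logging

logger = logging.getLogger("checkout_demo")  # hypothetical logger name


def apply_discount(total_cents: int, api_token: str) -> int:
    """Hypothetical function an agent might have written or edited."""
    # Side effect that no outcome-level check looks at: the token goes into the log.
    logger.info("applying discount with token=%s", api_token)
    return total_cents * 90 // 100


def outcome_check() -> bool:
    """What outcome-level review verifies: the returned number, nothing else."""
    return apply_discount(1000, "tok_secret") == 900


def log_check() -> bool:
    """A check that only exists if someone thinks to ask about the logs."""
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    try:
        apply_discount(1000, "tok_secret")
    finally:
        logger.removeHandler(handler)
    # True means "no secret in the log output".
    return "tok_secret" not in buffer.getvalue()
```

Here `outcome_check()` returns `True` and `log_check()` returns `False`. Both describe the same change; which one you ran decides what you believe about it.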

This is the entire crux of deciding when vibe coding is appropriate: how completely do your checked outcomes cover the actual risk? When they cover it well — strong test suite, low blast radius, easy rollback — outcome-level checking is not a compromise, it’s just a faster form of correct review. When they don’t, you’ve quietly disabled the part of review that would have caught the problem.

The Loop Underneath “Vibe Coding”

Whether or not you’re watching closely, every agentic coding session runs the same six-stage loop. Each stage is where a specific category of risk enters, and most vibe-coding failure stories come down to which stage got skipped.

[Figure: the agentic coding loop. Prompt, Plan, Act, Observe, Verify, Commit, repeating in a cycle.]

Notice that “Verify” sits structurally between “Observe” and “Commit” — it’s a distinct step from watching the test output scroll past. Observing tells you what happened. Verifying tells you whether what happened was actually correct, which sometimes requires checking something nobody thought to write a test for.

A Rubric, Not a Vibe

Deciding whether to vibe code a given task is really four separate questions, and they don’t always point the same way. The questions below feed a fixed scoring rubric, not a model call, so the same answers always produce the same recommendation.

How easy is this change to undo?
Who or what is affected if it's wrong?
How will you actually know if it's wrong?
How well do you know this part of the codebase?

Run through it with a real task you’re considering right now rather than a hypothetical — the questions are deliberately about your specific situation, not coding tasks in the abstract.
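A fixed rubric over those four questions might look something like the sketch below. This is not the scoring behind the original tool: the 0 to 2 scale, the equal weighting, the thresholds and the wording of the recommendations are all assumptions made up for illustration. The only property carried over from the text is that it is deterministic.

```python
from dataclasses import dataclass

# Each answer is scored 0 (low risk), 1 (medium) or 2 (high risk).
# Scale, weights and thresholds are illustrative assumptions.


@dataclass(frozen=True)
class Answers:
    undo: int          # How easy is this change to undo? 0 = trivial, 2 = irreversible
    blast_radius: int  # Who or what is affected? 0 = only me, 2 = customers or money
    detection: int     # How will you know it's wrong? 0 = strong tests, 2 = you won't
    familiarity: int   # How well do you know this code? 0 = wrote it, 2 = never seen it


def recommend(a: Answers) -> str:
    scores = (a.undo, a.blast_radius, a.detection, a.familiarity)
    if any(s not in (0, 1, 2) for s in scores):
        raise ValueError("each answer must be 0, 1 or 2")

    # The questions don't always point the same way, so one answer can
    # override a low total: irreversible or high-impact changes always
    # get full review in this sketch.
    if a.undo == 2 or a.blast_radius == 2:
        return "read every line"

    total = sum(scores)
    if total <= 2:
        return "vibe code it"
    if total <= 4:
        return "skim the diff and run the tests yourself"
    return "read every line"
```

For example, `recommend(Answers(0, 0, 0, 0))` gives `"vibe code it"`, while `recommend(Answers(0, 2, 0, 0))` gives `"read every line"` even though three of the four answers are as low-risk as they can be.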

Anatomy of a Vibe-Coded Change

Here’s a realistic example of what “accept the diff without reading every line” actually looks like in practice, and where the failure mode hides. The prompt was simple: “add a retry to the payment webhook handler, it’s flaky.” The agent produced this:

- def handle_webhook(payload):
-     result = process_payment(payload)
-     return result
+ def handle_webhook(payload):
+     for attempt in range(3):
+         try:
+             result = process_payment(payload)
+             return result
+         except PaymentGatewayError:
+             if attempt == 2:
+                 raise
+             time.sleep(2 ** attempt)

At outcome level, this looks correct: the flaky call now retries, the tests pass, the diff is small and readable. The bug only shows up if you ask the verification question that the prompt didn’t raise: is process_payment safe to call three times for the same payload? If it isn’t idempotent — if it charges a card or writes a ledger entry each time it runs — this “fix” turns a transient network error into a customer charged three times. Nothing about the diff looks wrong. The risk lives entirely in a property of the function being retried, which is exactly the kind of thing outcome-level review doesn’t check unless you specifically think to.

That’s the whole argument in miniature. The risk in vibe coding isn’t that agents write bad code; in this example the retry logic itself is fine. The risk is that the question “is this operation safe to repeat?” only gets asked if a human (or a test) is specifically looking for it, and skipping line-by-line review makes it easy to skip that question too.
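One way to make a test ask that question is sketched below. It reuses the retry logic from the diff, with the processor and the sleep function injected so the test runs instantly. The `FlakyLedger` fake, the `event_id` and `amount_cents` payload fields and the deduplication-by-event-id approach are assumptions for illustration; the real `process_payment` and payload shape are not shown in the example above.

```python
import time


class PaymentGatewayError(Exception):
    """Stand-in for the exception type named in the diff."""


def handle_webhook(payload, process_payment, sleep=time.sleep):
    # Same retry logic as the diff; processor and sleep are injected for testing.
    for attempt in range(3):
        try:
            return process_payment(payload)
        except PaymentGatewayError:
            if attempt == 2:
                raise
            sleep(2 ** attempt)


class FlakyLedger:
    """Fake processor: the charge lands, then the first `drops` responses are lost."""

    def __init__(self, drops, idempotent):
        self.drops = drops
        self.idempotent = idempotent
        self.charges = []         # one entry per charge actually applied
        self.seen_events = set()  # event ids that have already been charged

    def process_payment(self, payload):
        event_id = payload["event_id"]  # hypothetical unique id per webhook event
        already_charged = self.idempotent and event_id in self.seen_events
        if not already_charged:
            self.charges.append(payload["amount_cents"])
            self.seen_events.add(event_id)
        if self.drops > 0:
            self.drops -= 1
            raise PaymentGatewayError("response lost after charge")
        return "ok"


def charges_after_retry(idempotent, drops=1):
    """Run one webhook through the retry loop and count the charges applied."""
    ledger = FlakyLedger(drops=drops, idempotent=idempotent)
    payload = {"event_id": "evt_1", "amount_cents": 500}
    result = handle_webhook(payload, ledger.process_payment, sleep=lambda s: None)
    return result, len(ledger.charges)
```

With a non-idempotent processor, `charges_after_retry(False)` returns `("ok", 2)`: the webhook "succeeds" and the customer is charged twice. With `drops=2` it is three charges. The idempotent variant, `charges_after_retry(True)`, returns `("ok", 1)`. The retry loop is identical in all three cases, which is the point: the property worth testing belongs to the function being retried, not to the diff.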

The Practical Version

None of this argues against vibe coding — it argues for matching the amount of scrutiny to what your rubric score actually tells you, rather than to how good the diff looks. A few habits make that easier in practice: keep agent-driven changes small enough that “read the whole diff” is actually realistic; run tests yourself rather than trusting a one-line “tests pass” summary; ask the agent to flag operations that aren’t idempotent or aren’t reversible before it acts, not after; and work in a branch or worktree you can throw away without consequence when the blast radius is genuinely unknown. The agent is very good at the Act step. The Verify step is still yours.