The Human-in-the-Loop Paradox

Why I think the industry is betting on the wrong long-term architecture for AI.

TL;DR

For the last two years, “Human in the Loop” (HITL) has become the default answer to almost every AI safety question.

I think it’s the wrong end state.

In fact, I believe the effectiveness of human oversight is inversely proportional to the reliability of the AI it supervises.

As AI becomes better, humans don’t become better reviewers. They become worse ones. Not because they are lazy. Not because they lack discipline. But because that’s how human attention works.

And the future architecture won’t be: AI → Human Reviewer

It will increasingly become: Maker Agent → Checker Agent → Human-on-Exception

Or what I’d call Human-off-the-Loop (HOTL): humans who don’t continuously supervise execution, but intervene only when independent systems disagree or encounter something genuinely novel.

I didn’t arrive at this conclusion by reading research papers (though I did read some summaries). I arrived at it by zooming out on my own behaviour with AI-written code.

I stopped reading code.

I’ve been using Claude and Cursor to write code for the last eight or nine months.

Like almost everyone who’s used these tools seriously, my development velocity has gone through the roof. Something that earlier took me two weekend sessions of four hours each now gets built in under five minutes.

This is how my current path looks:

I imagine a feature.
Debate the approach with Claude, Gemini, ChatGPT.
Write a BRD.md. Read and refine it.
Work through architecture. Write a tech_architecture.md.
Think through the database schema. Add it to the above.
Break it down into a smaller module or feature and map it to a build_next.md.
Ask Cursor to build it.
Five minutes later it’s done.

Sometimes I don’t even have time to get myself a coffee before it’s asking whether I want to review the changes. And I love the elegant code it churns out.

But over time, I stopped reviewing the code. Completely. Not intentionally. Not overnight.

It happened gradually enough that I didn’t notice it, until one day I realised I hadn’t actually read a diff in weeks.

Cursor would generate code.
Generate and run the tests.
Rework the code if needed.
I’d get Claude to review the code against the same build documents.
It would identify and fix failures, and show me what it felt was a miss and why.
I’d review those and give my consent to the proposed changes.

Somewhere along the way, reviewing code stopped feeling useful. I had to read too much only to find that Cursor and Claude had already thought through edge cases better than I would have as a novice.

It started feeling like validating a calculator.

Every once in a while I’d scroll through the code. It looked beautiful — nice comments, well-structured functions and classes.

But I wasn’t really reviewing it.

So I asked myself: was I becoming lazy? Or was I cognitively overwhelmed by the sheer velocity of my AI coding companion?

Was it my lack of discipline?

My first explanation was that this was laziness — refusing to be excited about reviewing code.

Was it that I felt more energised writing (creating) than reviewing (testing)?

Maybe I just needed more discipline. Block thirty minutes every evening and force myself to review every generated line. Or maybe I just needed a stronger coffee.

The more I thought about it, the less convinced I became. Because nothing else in my behaviour had changed. I still cared deeply about my coding projects. I still spent hours on architecture decisions, still documented, revisited, and challenged product assumptions.

The only thing I had stopped doing was manually inspecting code that had become consistently excellent. Given the quality, volume, and velocity of what was being generated, I simply couldn’t keep up.

That made me wonder whether this wasn’t a Wribhu problem. Maybe it was a human problem.

So I started reading up on it.

Turns out we’ve seen this movie before.

Long before AI coding agents existed, researchers studying aircraft autopilots, industrial control systems, and medical decision-support tools had already documented a remarkably similar phenomenon.

The names that kept coming up: Christopher Wickens, Raja Parasuraman, Dietrich Manzey.

Their conclusion, summarised simply:

Humans stop actively monitoring systems that have repeatedly proven themselves reliable.
And more importantly: training doesn’t eliminate this tendency.

That last line changed how I thought about my own behaviour. I had assumed my declining vigilance was a personal failing. Decades of research suggested it was a predictable consequence of interacting with highly reliable automation.

In one study, researchers examined what happens when an automated system makes its first significant mistake after a long streak of correct decisions. Intuitively, you’d think that’s exactly when the human reviewer catches it.

The opposite happens. The streak of correctness is what causes humans to reduce their attention.

The automation earns trust.
The trust reduces scrutiny.
And scrutiny is lowest precisely when it matters most.

I read that a few times. Because it perfectly described what had happened to me with Cursor and Claude. No one told me to stop reviewing code. The AI simply got good enough that my brain quietly concluded my attention was better spent elsewhere.

My hypothesis around HITL

I think we’ve got Human-in-the-Loop backwards.

Most discussions assume that adding a human reviewer permanently improves system safety. I think that’s only true while the AI is unreliable enough to keep the human engaged — or while volume and velocity are low, or the system is newly implemented.

Beyond a certain threshold, every improvement in AI reliability reduces the amount of genuine human oversight. Not morally. Cognitively.

At 60% accuracy, the human checks everything. At 80%, they skim. At 95%, they spot-check. At 99%, they click Approve.

The human never leaves the workflow. But meaningful review quietly disappears.

That’s the Human-in-the-Loop Paradox.

The effectiveness of human oversight is inversely proportional to the reliability of the AI it supervises.

I don’t have empirical proof of the exact shape of this curve yet — that’s the part I’d genuinely love someone to go test. But the direction of the effect isn’t really in question; it’s the same complacency curve Wickens and Parasuraman documented decades ago, just compressed into a much faster feedback loop.

As AI reliability improves, its contribution to system quality keeps increasing. Human contribution doesn’t — it peaks, then declines, eventually approaching zero, because the human stopped allocating meaningful attention.
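To make the direction of that claim concrete, here is a toy calculation. Every number in it is my own assumption for illustration, not a measurement: the catch rates are hypothetical stand-ins for “checks everything”, “skims”, “spot-checks”, and “clicks Approve”. It only covers the four reliability levels mentioned above, so it shows the decline, not the exact shape of the curve.

```python
# Toy illustration only. The catch rates below are hypothetical stand-ins for
# "checks everything", "skims", "spot-checks", and "clicks Approve"; they are
# not taken from any study.
ITEMS = 1000

# AI accuracy (percent) -> assumed share of AI errors the human still catches
ASSUMED_CATCH_RATE = {60: 0.90, 80: 0.60, 95: 0.30, 99: 0.05}


def errors_caught_by_human(accuracy_pct: int, catch_rate: float) -> float:
    """Errors the human reviewer catches per ITEMS reviewed outputs."""
    ai_errors = ITEMS * (100 - accuracy_pct) // 100  # errors the AI makes
    return round(ai_errors * catch_rate, 1)


def human_contribution() -> dict[int, float]:
    """Map each assumed accuracy level to the errors the human catches."""
    return {
        accuracy: errors_caught_by_human(accuracy, rate)
        for accuracy, rate in ASSUMED_CATCH_RATE.items()
    }


if __name__ == "__main__":
    for accuracy, caught in human_contribution().items():
        print(f"{accuracy}% accurate AI: human catches ~{caught} errors per {ITEMS} items")
```

Under these made-up rates the human’s contribution shrinks for two reasons at once: there are fewer errors left to catch, and a smaller share of them gets caught. Swap in different catch rates and the numbers move, but as long as attention falls with reliability, the trend stays the same.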

This isn’t unique to coding — coding is just the first domain where millions of us are living through it in real time, fast enough to notice. A radiologist who’s seen the AI right on 50,000 scans in a row isn’t inspecting scan 50,001 with the same intensity as the first. Someone clearing ten thousand fraud alerts or AI-drafted customer emails before lunch isn’t either. The human brain isn’t built to maintain perfect vigilance over streams of mostly-correct output, and the higher the throughput, the faster that vigilance erodes.

Volume and velocity also change the shape of the decline, not just its existence. If you’re reviewing ten mergers a year, you’ll likely stay engaged — the decline is gentle. If you’re reviewing ten thousand AI-generated items before lunch, it isn’t gentle, it’s a cliff. Which means the right question isn’t “should there be a human in the loop?” It’s “can a human meaningfully stay engaged at the scale this system actually operates?” Those are very different questions, and the architecture that works for a surgeon won’t work for a claims processor or a coding assistant.

Today, almost every AI workflow is still built around the same loop: generate, hand to human, approve, repeat. If the paradox is real, we’re spending enormous amounts of expensive human attention reviewing outputs that are almost certainly correct — the equivalent of employing thousands of accountants to manually re-verify that calculators still know two plus two is four. Eventually the accountants stop checking. The calculators don’t. Architecture should reflect that, instead of pretending it won’t happen if we just write a stricter review policy.

Agent-Agent<>Human – HOTL

Without consciously planning it, my own workflow has already evolved:

Cursor writes code.
Claude reviews it, runs tests, and challenges assumptions.
Claude sometimes points out edge cases Cursor missed.
Sometimes Claude finds something genuinely important.
Sometimes it says Cursor could simplify an implementation.
Sometimes it admits Cursor found a better solution than it had initially considered.
So I don’t spend my energy reading every generated line of code.
I spend my time reading disagreements — the disconnects between Cursor and Claude.

(Yes, I deliberately don’t use Anthropic models inside Cursor — I want two genuinely independent perspectives, not the same model checking its own work.)

The actionable area for me was never the code itself. It’s the disconnect between two different models. That’s where my attention has maximum leverage — and I suspect that’s true far beyond software.

This reframes the human’s job entirely. We’ve defaulted to treating humans as checkers — someone who looks at every transaction. I think the better role is judge — someone who intervenes only when something is contested. A recent systematic review of human-in-the-loop AI design names the failure mode of the checker model directly: humans nominally “in the loop” mainly to absorb accountability when something goes wrong, not to provide real oversight — present, but not actually engaged. That’s the checker model at scale. The judge model sidesteps it structurally: judges don’t need to scale linearly with output volume. They scale with disagreement. And disagreement is a far smaller number than output.

That distinction — checker vs. judge — is the architectural shift I think eventually wins, with the obvious caveat that the maker and checker agents need to be genuinely independent (different models, different biases), or you’ve just built an elaborate way to rubber-stamp yourself.

Human-off-the-Loop (HOTL)

This is where I think the next design pattern emerges.

The name is slightly provocative, but the idea isn’t. Humans don’t disappear. They’re repositioned from checkers to judges.

Instead of continuously supervising execution, they intervene when:

two independent agents disagree,
confidence drops below a threshold,
an anomaly is detected,
or the system encounters something it hasn’t seen before.

The human isn’t reading every line. The human is resolving ambiguity. Every judgement becomes a learning signal back to both agents — the maker improves, the checker improves, the disagreement rate falls, and human effort naturally decreases over time.
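The four triggers listed above could be sketched as a single escalation check. This is a minimal sketch, not a real implementation: the `Review` fields, the `CONFIDENCE_THRESHOLD` value, and the function names are all hypothetical, and in practice each signal would come from whatever agents and detectors a given system actually has.

```python
from dataclasses import dataclass

# Hypothetical cut-off; a real system would tune this per domain.
CONFIDENCE_THRESHOLD = 0.9


@dataclass
class Review:
    """Signals collected for one output (all field names are assumptions)."""
    checker_agrees: bool    # did the independent checker accept the maker's output?
    confidence: float       # reported confidence, from 0.0 to 1.0
    anomaly: bool = False   # raised by some anomaly detector
    novel: bool = False     # input unlike anything the system has seen before


def escalation_reasons(review: Review) -> list[str]:
    """Return every reason this output should reach a human (empty if none)."""
    reasons = []
    if not review.checker_agrees:
        reasons.append("agents disagree")
    if review.confidence < CONFIDENCE_THRESHOLD:
        reasons.append("low confidence")
    if review.anomaly:
        reasons.append("anomaly detected")
    if review.novel:
        reasons.append("novel input")
    return reasons


def needs_human(review: Review) -> bool:
    """True when at least one escalation trigger fires."""
    return bool(escalation_reasons(review))
```

Returning the reasons, rather than a bare yes or no, is a deliberate choice in this sketch: the judge sees why a case landed on their desk, which is the context they need to resolve the ambiguity quickly.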

Ironically, the system gets safer precisely because the human is involved less often, but with far greater focus when they are.

The architecture I think we’re heading toward

For most of history, knowledge work looked like this: one human creates, another human reviews.

AI first replaced the creator: AI creates, human reviews. I think that’s just an intermediate state.

The long-term architecture is more likely to be: one agent creates, a second independent agent challenges it, and only unresolved disagreements reach a human. The human adjudicates. That judgement improves both systems. The cycle repeats.
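As a rough sketch of that cycle, assuming the maker and checker are just two callables: the maker drafts, the checker challenges, the maker gets a limited number of retries, and whatever is still contested goes to a human queue. The names `run_pipeline`, `Verdict`, `toy_maker`, and `toy_checker` are hypothetical, and the toy agents are trivial stand-ins, not real models.

```python
from typing import Callable, NamedTuple


class Verdict(NamedTuple):
    accepted: bool
    note: str  # the checker's objection, fed back to the maker


def run_pipeline(
    tasks: list[str],
    maker: Callable[[str, str], str],
    checker: Callable[[str, str], Verdict],
    max_rounds: int = 2,
):
    """Ship what the checker accepts; escalate what stays contested."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    shipped, escalated = [], []
    for task in tasks:
        feedback = ""
        for _ in range(max_rounds):
            draft = maker(task, feedback)
            verdict = checker(task, draft)
            if verdict.accepted:
                shipped.append((task, draft))
                break
            feedback = verdict.note
        else:
            # No round ended in agreement: this is the human's case.
            escalated.append((task, draft, verdict.note))
    return shipped, escalated


def toy_maker(task: str, feedback: str) -> str:
    # Stand-in maker: only "fixes" its draft after receiving feedback.
    return task.upper() if feedback else task


def toy_checker(task: str, draft: str) -> Verdict:
    # Stand-in checker: rejects ambiguous tasks and lower-case drafts.
    if "?" in task:
        return Verdict(False, "ambiguous requirement")
    if draft.isupper():
        return Verdict(True, "")
    return Verdict(False, "expected upper case")


if __name__ == "__main__":
    shipped, escalated = run_pipeline(["A", "b", "c?"], toy_maker, toy_checker)
    print("shipped:", shipped)
    print("for the human:", escalated)
```

In this toy run, two of the three tasks are settled between the agents and only the ambiguous one reaches the human. The sketch leaves out the feedback step where the human’s judgement improves both agents, since how that would work depends entirely on the systems involved.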

That’s a fundamentally different way of thinking about AI governance than “human reviews every output.”

My prediction: HOTL, not HITL, becomes the steady state

Within five years, I don’t think the dominant enterprise AI architecture will have humans reviewing every output. Humans will review disagreements, low-confidence cases, and exceptions.

The winning systems won’t be the ones with a human checking every output. They’ll be the ones where one machine continuously challenges another, and human judgement is reserved for exactly the moments it adds disproportionate value.

There will always be domains where law, ethics, or regulation require human approval regardless of behavioural realities — I’m not arguing humans disappear. I’m arguing that continuous review becomes behaviourally ineffective at scale, whether or not the policy keeps requiring it.

Which brings me to the line I’d genuinely like to debate:

Humans should not scale with AI output volume. They should scale with AI disagreement volume.
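The arithmetic behind that line is simple enough to sketch. The volumes and disagreement rates below are hypothetical numbers I picked for illustration; the function names are mine too.

```python
def reviews_as_checker(outputs: int) -> int:
    """Checker model: a human looks at every output."""
    return outputs


def reviews_as_judge(outputs: int, disagreement_rate: float) -> int:
    """Judge model: a human looks only at contested outputs."""
    return round(outputs * disagreement_rate)


if __name__ == "__main__":
    # Hypothetical scenarios: (outputs per day, share the agents disagree on)
    scenarios = [(10_000, 0.02), (100_000, 0.01)]
    for outputs, rate in scenarios:
        print(
            f"{outputs} outputs/day: "
            f"checker reviews {reviews_as_checker(outputs)}, "
            f"judge reviews {reviews_as_judge(outputs, rate)}"
        )
```

With these assumed numbers, ten times the output means ten times the work for a checker, but only five times the work for a judge, because the disagreement rate halved in the meantime. Whether disagreement rates actually fall like that in practice is exactly the kind of thing I’d want measured.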

The question isn’t whether humans should stay in the loop. It’s where human attention creates the most marginal value. I increasingly believe that answer is: at points of disagreement, not at points of execution.

References:

Wickens, C. D., Clegg, B. A., Vieane, A. Z., & Sebok, A. L. (2015). Complacency and Automation Bias in the Use of Imperfect Automation. Human Factors. https://journals.sagepub.com/doi/10.1177/0018720815581940
Parasuraman, R., & Manzey, D. H. (2010). Complacency and Bias in Human Use of Automation: An Attentional Integration. Human Factors. https://pubmed.ncbi.nlm.nih.gov/21077562
Lazaros, K., Vrahatis, A. G., & Kotsiantis, S. (2026). Human-in-the-Loop Artificial Intelligence: A Systematic Review of Concepts, Methods, and Applications. Entropy. https://www.mdpi.com/1099-4300/28/4/377 (Source of the “checker present but not actually overseeing” pattern referenced above.)

– Wribhu

PS: Would love to hear your pushback on this.

PPS: The images and parts of this post were created by AI, with a human very much in the loop!
