Cognitive Debt: The Crisis Nobody Sees Coming
Your team's AI-generated code is clean. Tests pass. It ships. And nobody on the team can explain how it works. That's not technical debt. That's something worse.
Written by Imran Gardezi of Modh, with 15 years at Shopify, Brex, Motorola, and Pfizer.
Published January 4, 2026.
11-minute read.
Topics: cognitive debt, AI-generated code, technical debt.
Everyone's celebrating AI coding speed. Nobody's asking what we're losing.
A client called me recently. Production was down. Their authentication system was broken. Users couldn't log in.
The code was clean. Well-structured. Tests had been passing for months.
But when the team tried to fix it, they couldn't. Not because the code was bad. Because nobody on the team understood how it worked.
The AI wrote it. Six months earlier. A PKCE authentication flow. Token generation, code challenges, redirect handling. The developer who prompted it had moved on. Nobody else had ever read it line by line. The module had shipped, passed QA, and quietly become a black box sitting at the core of their product. When the auth server pushed a breaking change, the team was staring at code that might as well have been written in a foreign language.
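For context on what the team was staring at: the heart of a PKCE flow is small. Here's a minimal, illustrative sketch of the verifier/challenge step from RFC 7636 (this is not the client's code, just what "code challenges" means in practice):

```python
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple[str, str]:
    """Generate a PKCE code_verifier and its S256 code_challenge (RFC 7636)."""
    # 32 random bytes -> 43-char base64url verifier (spec allows 43-128 chars).
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    # The challenge is the base64url-encoded SHA-256 of the verifier, unpadded.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge

verifier, challenge = make_pkce_pair()
```

Ten lines of core logic. The point of the story is that even something this compact becomes a black box when nobody on the team has traced it end to end.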
"A team of engineers. Staring at their own codebase. And they can't explain their own authentication system."
That's not technical debt. That's something new. Something worse. And almost nobody is talking about it yet.
I've been in this industry for 15 years. Shopify. Brex. Motorola. Pfizer. I've managed engineering teams. I've seen every form of technical debt there is. I've never seen anything like this. Technical debt at least announces itself through slow builds, flaky tests, and mounting frustration. This new problem hides behind green checkmarks and clean diffs.
It has a name now. It's called cognitive debt. And today I'm going to show you what it is, why it's more dangerous than technical debt, and the three practices that prevent it.
What Cognitive Debt Actually Is
Margaret Storey, computer science professor at the University of Victoria, introduced this concept in early 2026. She's been studying developer productivity for over two decades. When she named cognitive debt, it spread through the engineering community in days. Simon Willison, one of the most respected voices in AI development, amplified it within a week.
The moment I read it, everything clicked.
Here's the definition. Cognitive debt accumulates when AI writes code that humans lose shared understanding of.
Not bad code. Not messy code. The code can be perfect. Beautiful, even. But if nobody on the team can explain what it does, how it works, or why it was designed that way, you have cognitive debt. It's the gap between what your codebase does and what your team collectively knows about it. And unlike a messy function or an outdated library, you can't detect it with a linter or a static analysis tool.
"The most dangerous code in your codebase is the code nobody understands. Even if it works."
Now you might be thinking, knowledge silos aren't new. They've existed since the first senior engineer left a company. And you'd be right. What's new is the scale. AI generates code at a rate that makes this systemic, not occasional. It's not one person who left. It's every module your team prompted into existence. Before AI, losing shared understanding meant a key person departed and took context with them. Now you can accumulate the same knowledge gap without anyone leaving the building. The AI was never "on the team" in the first place, so the knowledge never existed inside anyone's head to begin with.
Here's what makes it insidious. Technical debt shows up in your codebase. You can see it. Your linter flags it. Your CI pipeline catches it.
Cognitive debt is invisible. It lives in your team's heads. Or rather, it lives in the gap where understanding should be.
Storey calls it a "silent loss of shared theory." Shared theory is the collective understanding your team has about how the system works, why decisions were made, what the boundaries are. Every healthy engineering team builds this organically through code review, pair programming, architecture discussions, and debugging sessions. It's the reason a senior engineer can glance at an error log and immediately know which service is misbehaving. That intuition doesn't come from documentation. It comes from having built and maintained the system with their own hands and minds.
And the data backs her up. GitHub's own research shows developers using Copilot write code 55% faster. But the study didn't measure whether anyone could maintain that code six months later. Speed was measured. Understanding wasn't.
Let me go back to that PKCE auth flow. Six months after the AI wrote it, the auth server pushed a breaking change. The on-call engineer opened the code. Couldn't follow the flow. Couldn't trace the error handling. They could read the syntax, sure. But they couldn't reason about why the token refresh happened in that specific order, or what would break if they changed the redirect logic, or which edge cases the original implementation was guarding against.
I spent two days. Not fixing the bug. Explaining the system to the team so they could fix it themselves.
"That's cognitive debt. The interest payment is time spent re-learning your own system."
The Velocity Illusion
Here's where it gets interesting.
Everyone's measuring AI adoption by velocity. "We shipped 40% more features this quarter." "Our cycle time dropped by half." The dashboards look great. Executives are thrilled. OKRs are green across the board.
But those numbers are hiding something.
When nobody understands the code, every modification requires going back to the AI. "Hey Claude, explain this function." "Hey Copilot, what does this service do?" Each interaction starts fresh. The AI doesn't remember your last conversation, doesn't know what you tried yesterday, doesn't understand the business context that makes this particular edge case critical.
"If your team needs AI to understand your own codebase, you don't have a tool. You have a dependency."
Every team depends on tools. I get it. You depend on your IDE. You depend on documentation. You depend on Stack Overflow. Here's the difference. Your IDE doesn't forget your project between sessions. Stack Overflow doesn't lose context. AI does. Every conversation starts from scratch. Your codebase doesn't.
Go run this experiment. That "40% velocity increase"? Measure incident resolution time. I've seen teams with heavy AI code generation where incidents take 3-4x longer to resolve on AI-written modules versus human-written ones. The human-written modules have an engineer who understands the intent, the edge cases, the failure modes. The AI-written modules have a blank stare and a prompt history that nobody saved. That's the hidden cost nobody puts on the dashboard.
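The experiment can be as simple as tagging each incident with the origin of the module it hit and comparing mean resolution times. A sketch, with the record fields and numbers purely illustrative:

```python
from statistics import mean

def mttr_by_origin(incidents: list[dict]) -> dict[str, float]:
    """Mean time to resolution (hours), grouped by who authored the module."""
    buckets: dict[str, list[float]] = {}
    for inc in incidents:
        buckets.setdefault(inc["origin"], []).append(inc["resolve_hours"])
    return {origin: mean(times) for origin, times in buckets.items()}

# Hypothetical incident log: "origin" tags whether a human or the AI wrote the module.
incidents = [
    {"origin": "human", "resolve_hours": 1.0},
    {"origin": "human", "resolve_hours": 2.0},
    {"origin": "ai", "resolve_hours": 5.0},
    {"origin": "ai", "resolve_hours": 7.0},
]
print(mttr_by_origin(incidents))  # -> {'human': 1.5, 'ai': 6.0}
```

If the "ai" bucket runs a multiple of the "human" bucket, that ratio is the cognitive-debt interest rate your velocity dashboard isn't showing.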
Stack Overflow's latest developer survey shows only 29% of developers say they trust AI-generated code. Down 11 points from 2024. Developers are seeing this. They just didn't have a name for it.
Now they do.
And let's talk about onboarding. New engineer joins. Day one. They're supposed to ramp up on the codebase.
Old world: they read PRs. They see the history. They see the discussions in code review. Knowledge is embedded in the process. They can trace decisions back to conversations, understand trade-offs, learn the team's reasoning.
New world: they open a file. The AI wrote it. There's no PR discussion because the review was "looks good, tests pass." There's no author to ask because the AI doesn't have office hours. There's no commit message explaining the trade-off because there was no trade-off. The AI just generated the most statistically probable implementation.
"The new engineer is reverse-engineering their own team's system. Like reading a book in a language nobody speaks."
What Shopify Taught Me
At Shopify, code review wasn't about catching bugs.
It was the primary mechanism for knowledge transfer.
Junior engineers learned architecture by reading senior PRs. Seniors maintained context by reviewing everything. Code review was the learning loop. It was how shared theory was built and maintained across the entire team. When I was there, you could pull any engineer into a war room and they could reason about systems they hadn't written. Not because they'd memorized the code, but because they'd reviewed the PRs, asked questions during review, and absorbed the architectural intent over months of participation.
AI bypasses that entire loop.
The code shows up. It works. It gets merged. Nobody learned anything. Nobody built context. Nobody can debug it at 3am without re-prompting the AI. The learning loop that used to happen naturally as a side effect of the development process has been short-circuited. Teams that used to build understanding as they built features are now building features without building understanding.
"Code review was never about catching bugs. It was about building shared understanding. AI skipped the most important step."
Let me put this in perspective. Technical debt is like financial debt. You're paying interest whether you see the invoice or not. Every slow feature, every recurring bug, every developer who quits because the codebase is a nightmare. That's interest.
Cognitive debt is worse. With financial debt, at least you know you owe money. With cognitive debt, you don't even know the debt exists. There's no dashboard for "team understanding." Your velocity metrics look great right up until the moment they collapse. It's the engineering equivalent of a company that looks profitable on the income statement but is hemorrhaging cash. The numbers tell one story while reality tells another.
I've rebuilt 12 disaster projects. Every single one had code nobody understood. Not because the developers were bad. Because nobody treated understanding as a requirement. Understanding was treated as a nice-to-have, something that would happen eventually, something that could be deferred. But understanding doesn't compound like interest on savings. It decays. And the longer you wait, the more expensive it gets to rebuild.
"Technical debt is a slow leak. Cognitive debt is a time bomb."
A slow leak, you can see. You mop it up. You fix the pipe. A time bomb? You don't know it's there. Not until it goes off. And by then, you're not debugging code. You're rebuilding trust in the system.
The Three Practices
Here's the principle.
"Speed without understanding is not velocity. It's luck."
Your team shipping fast with AI is only real if they can debug, extend, and refactor that code without going back to the AI. If they can't, the speed is borrowed. And borrowed speed always comes due. The question isn't whether your team will face a cognitive debt reckoning. It's whether they'll face it during a minor feature addition or during a critical production incident at 3am.
Three practices. None of them slow you down. All of them prevent cognitive debt.
Practice one: Review like a junior.
When AI generates code, don't review it like a senior checking for bugs. Review it like a junior trying to learn. This is a fundamental shift in mindset. Most senior engineers review code by scanning for obvious errors, checking edge cases, and verifying test coverage. That works when a human wrote the code, because the author already understands it. When AI wrote the code, the "author" has no understanding at all. The reviewer is the first and possibly last human who will ever deeply engage with this logic.
Three questions. Can I explain every line to a teammate? Do I understand why this approach was chosen? Could I modify this code confidently without re-prompting?
If the answer to any of those is no, you don't merge it. You sit with it. You understand it. Or you rewrite the parts you don't.
This doesn't mean you spend a week studying every function. It means you spend 10-15 minutes asking "could I debug this at 3am?" If yes, ship it. If no, take the time now. Compare that to the six-hour incident when nobody understands the code. The math is obvious.
"Don't review AI code for correctness. Review it for understanding. If you can't explain it, you don't own it."
Practice two: Explain to ship.
Before any AI-generated code ships, the author writes a one-paragraph explanation. What it does. Why it was designed this way. What the known edge cases are.
Not a comment in the code. Not a commit message. A decision record.
Something like: "This service handles token refresh using PKCE. We chose PKCE over implicit flow because our app runs on mobile where client secrets can't be stored securely. The main edge case to watch for is Safari private browsing, which doesn't persist localStorage between sessions, so we fall back to sessionStorage." Five minutes to write. Potentially days of debugging saved.
Now you might be thinking, we tried documentation before and nobody kept it up. Here's the difference. You're writing one paragraph at the moment you understand the code. Not retroactively. Not as a separate task. The moment you merge, you write it. Make it part of the PR template. Automate the reminder. The key insight is that documentation written in the moment of understanding is a completely different activity from documentation written after the fact. One captures fresh knowledge. The other attempts to reconstruct faded memory.
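Automating the reminder can be a few lines in CI. A sketch of a check that blocks a merge until the PR description contains a decision record; the section headings here are assumptions, so match them to whatever your PR template actually uses:

```python
import re

# Headings the check looks for; names are illustrative, align them with your template.
REQUIRED_SECTIONS = ("What it does", "Why this design", "Known edge cases")

def has_decision_record(pr_body: str) -> bool:
    """True if the PR description contains every required decision-record heading."""
    return all(
        re.search(rf"^#+\s*{re.escape(section)}", pr_body, re.MULTILINE | re.IGNORECASE)
        for section in REQUIRED_SECTIONS
    )

example = """\
## What it does
Refreshes OAuth tokens for the mobile client.
## Why this design
PKCE, because client secrets can't be stored securely on mobile.
## Known edge cases
Safari private browsing does not persist localStorage; we fall back to sessionStorage.
"""
assert has_decision_record(example)
assert not has_decision_record("looks good, tests pass")
```

Wire it into whatever gate runs on pull requests and the paragraph gets written at the moment of understanding, not six months later.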
"Five minutes of documentation today saves two days of archaeology tomorrow."
Those first two practices catch 80% of cognitive debt. But the third one is the one that makes your team bulletproof.
Practice three: Rotate the context.
Regularly rotate engineers through AI-generated modules. Not to rebuild them. To review them, understand them, and update the decision records. This is the practice that turns individual understanding into team understanding.
This isn't about making everyone an expert on everything. It's about baseline understanding of critical paths. Start with your authentication, payments, and data pipeline. The systems where a failure at 3am means revenue loss. Then expand outward. The goal is that no critical system in your product has only one person (or zero people) who can reason about it.
At Brex, we handled billions in transactions. We rotated on-call across every service. Not because we expected everyone to be an expert. Because we wanted baseline understanding. When something broke, nobody was starting from scratch. They might not know every line of code, but they could trace the flow, identify the failure point, and reason about the fix. That baseline understanding was the difference between a 30-minute incident and a 6-hour one.
Every engineer should be able to explain any critical path in the system. Not write it from memory. Explain it. That's the bar.
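Mechanically, the rotation can be a simple round-robin over your critical modules. A sketch, with engineer and module names purely illustrative:

```python
def plan_rotation(engineers: list[str], modules: list[str], weeks: int) -> list[dict]:
    """Round-robin schedule: each week every critical module gets a reviewer,
    and the pairing shifts so everyone eventually reviews every module."""
    schedule = []
    for week in range(weeks):
        schedule.append({
            module: engineers[(week + i) % len(engineers)]
            for i, module in enumerate(modules)
        })
    return schedule

# Hypothetical team and critical paths.
plan = plan_rotation(["aisha", "ben", "chen"], ["auth", "payments", "pipeline"], weeks=3)
```

After one full cycle, every engineer has reviewed every critical module once: each rotation pass is a review session plus a decision-record update, not a rewrite.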
The Teams That Win
Here's what happens when you do this.
One engineering team I advised started these three practices six months ago. Mean time to resolution dropped from 4 hours to 45 minutes. Same codebase. Same team. The only difference: they understood it. The code hadn't changed. The architecture hadn't changed. What changed was that every engineer could reason about the systems they were responsible for, because they'd been forced to engage with the code deeply rather than skim it.
Onboarding dropped from months to weeks. New engineers had decision records, not just code. They understood the why, not just the what. Instead of reverse-engineering intent from implementation details, they could read a paragraph that explained the reasoning, then dive into the code with context already in place.
The team stopped being dependent on AI for their own system. They use AI to write code faster, and they understand what was written. That's the difference between a tool and a crutch. A tool augments your capability. A crutch replaces capability you no longer have. The teams that win with AI are the ones that use it as a tool while maintaining the ability to operate without it.
"The fastest team isn't the one that writes the most code. It's the one that understands all of it."
The Close
Go to your codebase. Find the last five PRs where AI generated the majority of the code. Ask each engineer: "Explain this code to me. Without opening the file. Without re-prompting. From memory."
Three questions. What does it do? Why was it built this way? What breaks if you change line 47? If they can answer all three, no cognitive debt. Keep shipping. If they stumble on any one, you have a number. And now you know where to start.
Review like a junior. Explain to ship. Rotate the context.
Margaret Storey named this problem in early 2026. Most engineering teams haven't heard of it yet. The teams that fix this now, before it compounds, are the teams that win the next five years of AI-assisted development.
The teams that ignore it will find out when something breaks and nobody can fix it.
"Cognitive debt is invisible. Until it isn't."
These principles aren't new. They're engineering fundamentals that most teams forgot when AI made the old process feel slow. Code review, documentation, knowledge sharing. None of these are revolutionary ideas. What's revolutionary is that AI has made them optional for the first time in the history of software development. And teams are discovering that "optional" and "unnecessary" are very different things.
"Every team that succeeds with AI will look back and realize: the speed wasn't the advantage. The understanding was."
If you're building something with AI and you want a team that operates with discipline, not just speed, that's what we do at Modh.
Fix it now. Or explain it later.
Key Takeaways
- Cognitive debt is fundamentally different from technical debt. The code itself can be clean, well-tested, and perfectly structured, but if nobody on the team can explain how it works or why it was designed that way, you're accumulating a hidden liability that no linter or CI pipeline will ever detect.
- AI bypasses the natural learning loops that engineering teams have relied on for decades. Code review, pair programming, and debugging sessions used to build shared understanding as a side effect of the development process. When AI generates the code, those learning moments disappear, and teams build features without building the knowledge to maintain them.
- The velocity gains from AI coding tools are often illusory when you factor in maintenance costs. Teams report shipping 40% more features, but incident resolution on AI-written modules can take 3-4x longer because nobody understands the code well enough to debug it under pressure. The speed is borrowed, and the interest compounds every month.
- Three concrete practices prevent cognitive debt without slowing your team down. Review AI code like a junior trying to learn (not a senior scanning for bugs), require a one-paragraph decision record before any AI-generated code merges, and regularly rotate engineers through critical AI-written modules to build baseline understanding across the team.
- Developer trust in AI-generated code is already declining, with only 29% of developers reporting trust (down 11 points from the prior year). The teams that implement cognitive debt prevention now, before the problem compounds, will have a structural advantage over teams that discover the issue during a 3am production incident when nobody can explain their own authentication system.
Frequently Asked Questions
What is cognitive debt and how is it different from technical debt?
Cognitive debt is the gap between what your codebase does and what your team collectively understands about it. Unlike technical debt, which manifests as messy code, slow builds, or flaky tests that tools can detect, cognitive debt is invisible. It accumulates when AI generates code that works perfectly but nobody on the team can explain, debug, or confidently modify. The distinction matters because technical debt gives you warning signs, while cognitive debt hides behind green checkmarks until a production incident forces your team to confront code they can't reason about.
How do I know if my engineering team has cognitive debt?
The simplest diagnostic is what Margaret Storey calls the "explain it cold" test. Pull up the last five AI-generated PRs and ask the engineers who submitted them to explain the code without opening the file or re-prompting the AI. If they can walk through the logic, explain the design decisions, and identify what would break if you changed key lines, you're in good shape. If they stumble, hesitate, or need to re-read the AI's comments to remember what the code does, that's cognitive debt. Another signal is incident resolution time: if bugs in AI-written modules consistently take 3-4x longer to resolve than bugs in human-written modules, your team has a comprehension gap.
Can you still use AI coding tools without accumulating cognitive debt?
Absolutely. AI coding tools are force multipliers when paired with engineering discipline. The key is treating AI-generated code like code from a talented contractor who has never seen your codebase. You wouldn't merge a contractor's PR without understanding it, and you shouldn't merge AI-generated code without understanding it either. The three practices that prevent cognitive debt are reviewing AI code for understanding (not just correctness), writing a brief decision record before merging, and rotating engineers through AI-written modules. None of these slow you down meaningfully, and they preserve the shared understanding that makes your team capable of maintaining what they ship.
Why are velocity metrics misleading when teams adopt AI coding tools?
Velocity metrics like features shipped, cycle time, and PR throughput only measure output speed. They don't measure whether the team can maintain, debug, or extend what they've built. A team using AI tools might ship 40% more features in a quarter, but if incident resolution takes 3-4x longer on those AI-written modules, the net productivity gain evaporates. GitHub's own research showed that Copilot users write code 55% faster, but the study never measured whether anyone could maintain that code six months later. The real metric that matters is sustainable throughput: features shipped that don't create future incidents, onboarding friction, or maintenance bottlenecks.
How do you prevent cognitive debt on a team that's already heavily using AI for code generation?
Start with your critical paths: authentication, payments, data pipelines, anything where a failure at 3am means revenue loss. Audit those modules first. Have the engineers who work on them explain the code without AI assistance. Where understanding gaps exist, schedule focused review sessions where the team walks through the logic together and writes decision records. Then implement the three practices going forward: review like a junior (ask "can I explain every line?"), explain to ship (one-paragraph decision record per AI-generated PR), and rotate the context (regularly assign engineers to review modules they didn't write). The initial audit takes a week or two. The ongoing practices add roughly 10-15 minutes per PR, which is trivial compared to the multi-hour incidents they prevent.