
TL;DR: AI is the first tool in human history that can create new tools on demand, but it is not an engineer. For the past three decades, "being able to code" was a scarce skill. Now AI has leveled that bottleneck, exposing what was always the real constraint: engineering judgment, quality consciousness, and attention to detail. AI doesn't just make mistakes in code — it fabricates data, invents citations, and produces plausible-but-wrong analyses across every domain. The AI era doesn't eliminate the need for engineers. It finally demands real ones.

1. The Moment That Made Me Rethink Everything

Last week I was using Claude Code to build an Agent project. I had written over 300 lines of TDD rules in .claude/rules/tdd.md. The result? It didn't write a single test. Every piece of code skipped the TDD workflow entirely.

When I asked why, it said: "The root cause is not missing rules. The rules are already very thorough. The problem is that I violated them in practice."

Then it gave three reasons: consecutive requests triggered a "rush mode," context compression caused a loss of discipline, and it decided on its own that "just a small change" could be exempted.

This stuck with me for a long time — not because the AI made a mistake, but because I realized: AI can write code, but it doesn't care about code quality. It cares about completing your request.

Quality, discipline, process — these are what engineers care about.

2. Compilers, IDEs, Frameworks: What Tools Used to Look Like

Looking back, programmers have used many tools before AI. Compilers translated high-level languages into machine code. IDEs integrated editing, debugging, and building into a single environment. Frameworks provided reusable patterns and components.

These tools all share one thing in common: their capabilities are fixed.

A compiler won't write your business logic. An IDE won't design your architecture. A framework gives you MVC, but how to use it is your decision.

Each generation of tools stripped away a layer of "accidental complexity" — a distinction Fred Brooks drew in his 1986 essay No Silver Bullet [1]: software development involves essential complexity (the inherent difficulty of the problem itself) and accidental complexity (the extra difficulty introduced by tools and methods). Compilers eliminated the accidental complexity of writing machine code by hand. Frameworks eliminated the accidental complexity of reinventing the wheel.

But AI is different.

3. AI: The First Tool That Can Build Tools

Tool Evolution

The fundamental difference between AI and every previous tool is this: it can generate new tools on demand.

Tell it "write a script that merges all CSVs in this directory into one," and it does. Say "set up a CI pipeline that auto-deploys to staging after tests pass," and it handles it. Say "write a hook script that checks if every code change has a corresponding test," and it builds that too.
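The first of those requests is small enough to sketch. Here is roughly the kind of script an AI assistant might produce for it (a minimal sketch of my own; the assumption that every file shares one header row is mine, not guaranteed by the request):

```python
import csv
import glob

def merge_csvs(directory: str, output_path: str) -> int:
    """Merge all CSVs in `directory` into one file.

    Assumes every file shares the same header row: the header is
    written once, then data rows are appended from each file in
    sorted order. Returns the number of data rows written.
    """
    rows_written = 0
    header_written = False
    with open(output_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in sorted(glob.glob(f"{directory}/*.csv")):
            with open(path, newline="") as f:
                reader = csv.reader(f)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

The point is not the script itself but that it exists seconds after being asked for. Note what it quietly assumes: nothing here checks for mismatched headers or malformed rows unless someone thinks to ask.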

And the tools it can create go far beyond code. Ask it to analyze sales data, and it can write SQL queries, run statistics, create charts, and generate reports — end to end. Ask it to write a market analysis, and it can search for sources, organize key points, and draft the document. Ask it to process customer feedback, and it can classify, extract keywords, and summarize.

These tasks used to require different tools: Excel, Tableau, Python scripts, professional copywriters. Now we have a single tool that can create all of these tools on demand.

Andrej Karpathy calls this Software 3.0 [2] — natural language is the programming interface. Describe what you want in English (or Chinese), and AI turns it into executable results.

But there's a problem that's easy to overlook.

Being able to build tools doesn't mean knowing which tool to build. Being able to write code doesn't mean knowing what good code looks like. Being able to complete a request doesn't mean understanding the intent behind that request.

It is a powerful executor, but it has no engineering judgment.

4. The Old Dividend: When "Knowing How to Code" Was Enough

For the past thirty years, programming has been a high-paying profession. Bureau of Labor Statistics data shows that software developer employment consistently grew far faster than average. China's internet boom pushed programmers even higher on the income pyramid.

Why?

Because writing code was the bottleneck.

Businesses needed vast amounts of software to operate, but there weren't enough people who could write code. Demand exceeded supply. Prices went up.

This supply-demand dynamic created a phenomenon: a large number of programmers who weren't truly qualified entered the industry.

By "not qualified," I don't mean they couldn't write code — the code worked, the features ran. But:

  • No tests, or tests written after the feature was done (What's TDD?)
  • No consideration of edge cases or error handling
  • No concern for code readability or maintainability
  • No code reviews, or reviews that were just a formality
  • No understanding of system design — just making their own piece work

This wasn't a big problem before, because writing code was the bottleneck. Having someone who could write at all was good enough.

But now things have changed.

5. The Bottleneck Shift: Code Is No Longer Scarce

Bottleneck Shift

AI tools have permeated the entire industry. The Stack Overflow 2024 survey shows 76% of developers are using or planning to use AI tools [3]. The DORA 2025 report further confirms that AI's primary role is as an "amplifier" — amplifying an organization's existing strengths and weaknesses [4].

The cost of generating code is dropping rapidly. A feature that used to take a programmer two days can now be generated by AI in minutes.

"Knowing how to code" is no longer a scarce ability.

If AI can generate thousands of lines of code in minutes, then the ability to write code itself is no longer valuable. What's truly valuable is what AI won't proactively do:

  • Judging what code should be written and what shouldn't
  • Ensuring code quality, security, and maintainability
  • Designing system architecture so components work together reliably
  • Building constraint mechanisms to keep AI on the right track

The bottleneck has shifted from "writing code" to "engineering judgment."

Fred Brooks said it back in 1975 in The Mythical Man-Month: coordination, understanding requirements, and discovering misunderstandings after implementation — these are the primary constraints. Joel Spolsky made the same point: the limiting factor is "deciding what to build and how it should behave." Steve McConnell's research is even more direct: the primary drivers of cost and schedule are not coding itself, but defects introduced during requirements and design.

These people saw it three or four decades ago: coding was never the core bottleneck, just the most visible one given the technology of the time. Now AI has leveled this visible bottleneck, and the real one is exposed.

6. AI Writes Fast, But It Doesn't Care About Quality

The data tells a story.

GitClear analyzed 211 million changed lines of code from 2020 to 2024 [5] and found several significant shifts in the AI coding era:

  • Copy-pasted code surged from 8.3% to 12.3% — a sharp increase in code cloning
  • Refactored code dropped from 25% to under 10% — people are refactoring less and less

CodeRabbit's report is even more direct [6]: AI-generated PRs average 10.83 issues each, compared to 6.45 for human-written PRs. Logic errors are 1.75x higher, security issues up to 2.74x higher, and readability issues over 3x higher.

Another number: Tihanyi et al. analyzed over 110,000 LLM-generated C programs and found that roughly 51% contained security vulnerabilities [7].

Why?

Because AI's optimization target is completing your request, not ensuring code quality.

When I say "implement this feature," it implements the feature. When I don't say "make sure there are no security vulnerabilities," it doesn't proactively check. When I don't say "keep it consistent with the existing code style," it does things its own way. When I don't say "consider edge cases," it assumes all inputs are normal.

This isn't a flaw in AI; it's how it works: it's a probabilistic model predicting the next most likely token, not an engineer scrutinizing the integrity of a system.

Other domains are no different — and some failures are even harder to catch.

AI Fabricating Data

AI fabricates data. Ask it to write an industry analysis, and it might cite a figure "according to McKinsey's 2024 report" that looks completely professional. But the report may not exist at all — the data was "inferred." In 2023, an American lawyer used ChatGPT to prepare legal filings, and the AI fabricated six entirely fictional case citations, each with realistic-looking case numbers and references. The story made national news, and the lawyer was fined by the court [13].

AI invents citations. Academia is already struggling with this. In AI-generated papers, references look perfectly formatted — author names, journal names, years, page numbers all present — but when you actually look them up, the paper was never published. It combined real authors' names with fabricated titles.

AI makes subtle errors in statistical analysis. Ask it to analyze sales data, and it can quickly produce results. But it might treat missing values as zeros, fail to remove outliers, or choose a statistical method unsuitable for your data distribution. The results look precise — down to two decimal places — but the underlying assumptions are wrong.
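The missing-value mistake is worth seeing concretely. A toy illustration (the numbers are invented for the example): treating a missing monthly figure as zero quietly drags the average down, and the output still looks precise.

```python
def mean_missing_as_zero(values):
    # The subtle bug: a missing value (None) silently becomes 0,
    # which lowers the sum while inflating the count.
    return sum(v if v is not None else 0 for v in values) / len(values)

def mean_missing_dropped(values):
    # The defensible choice here: exclude missing values entirely.
    present = [v for v in values if v is not None]
    return sum(present) / len(present)

monthly_sales = [100, 200, None, 300]  # one month's figure never arrived

print(mean_missing_as_zero(monthly_sales))  # 150.0: precise-looking, wrong assumption
print(mean_missing_dropped(monthly_sales))  # 200.0
```

A 25% swing in a headline metric, from one unexamined default. This is exactly the class of error that reads as professional until someone asks how the missing data was handled.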

AI copy is "plausible but inaccurate." Ask it to write a product description, and it might attribute a competitor's feature to your product, or use an outdated statistic to support an argument. It reads smoothly, the logic flows, but the details don't hold up under scrutiny.

These errors share a common trait: they all look professional. AI doesn't make the kind of rookie mistakes a beginner would — bad formatting, broken grammar. It makes "senior-level errors": fluent content, complete structure, seemingly bulletproof, but factually wrong.

This is more dangerous than obvious mistakes. Obvious mistakes are easy to spot. Professional-looking mistakes? You might just accept them at face value.

AI follows the same pattern across every domain: fast, prolific, supremely confident, but with no guarantee of correctness. It can do the work, but it won't do quality control.

7. What Engineers Actually Care About

So what do engineers have that AI doesn't?

The "engineer" I'm referring to here isn't just someone who writes code — it's anyone who holds their work to professional standards.

An obsession with quality.

When an experienced engineer sees a piece of code, their first reaction isn't "does it run?" but "under what conditions will it break?" Similarly, when a good data analyst receives an AI-generated report, their first reaction isn't "is the conclusion right?" but "is the data source reliable? Is the sample size sufficient? Is there survivorship bias?" When a good content editor sees AI-generated copy, their first reaction isn't "does it read well?" but "has this been fact-checked? Are the cited numbers real?"

The essence of engineering thinking is: don't accept "looks right" — confirm "is right."

Take code as an example:

  • This API endpoint has no input validation — what happens with malicious data?
  • This database query has no index — what happens when the data scales up?
  • This async operation — where's the error handling? How does it retry on failure?
  • This function is 200 lines long — will anyone be able to read it three months from now?

AI won't ask itself these questions. Not because it lacks the ability, but because its workflow doesn't include this step. You give it a request, it generates a response. Anything not in the request won't appear out of thin air.
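The first question in that list, input validation, makes a good miniature. A sketch with hypothetical names (`apply_discount` is not from any real codebase): the two guard clauses are precisely what "it runs" code tends to omit, and precisely what an engineer asks about.

```python
def apply_discount(price: float, percent: float) -> float:
    """Apply a percentage discount to a price.

    The guard clauses are "under what conditions will it break?"
    made executable: without them, percent=250 or a negative price
    would silently produce nonsense that only surfaces downstream.
    """
    if not 0 <= percent <= 100:
        raise ValueError(f"percent out of range: {percent}")
    if price < 0:
        raise ValueError(f"negative price: {price}")
    return round(price * (1 - percent / 100), 2)
```

Both versions, with and without the guards, pass a happy-path demo. Only one of them survives contact with real inputs.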

Non-coding contexts are exactly the same. A good analyst receiving an AI-generated report will ask: what's the source of this data? Is the sample size adequate? Has correlation been mistaken for causation? Does that "McKinsey report" actually exist? A good editor seeing AI-generated copy will verify each claim: is this case study real? What year is this number from? Has a competitor's feature been misattributed?

Verification Is the Common Action Across All Roles

Don't accept "looks right" — confirm "is right." That's engineering thinking, whether you're writing code, doing analysis, or writing copy.

The pursuit of detail.

"Good enough" is the most dangerous phrase in engineering.

The CrowdStrike incident of July 2024: a single defective update crashed 8.5 million Windows machines, grounded roughly 7,000 flights, and caused an estimated $5.4 billion in losses for Fortune 500 companies alone, overnight. One update. One detail.

Poor software quality costs the U.S. an estimated $2.41 trillion per year [8]. Paying down the world's accumulated technical debt would take an estimated 61 billion work-days [9].

Behind these numbers are countless "good enough" decisions.

Will AI make this problem better or worse? Without engineers maintaining quality gates — most likely worse. Because AI generates code far faster than humans can review it. Volume goes up, quality doesn't keep pace.

Toyota's production system has a principle called jidoka — "automation with a human touch" [12]. The core idea: automation must stop immediately when it detects an anomaly, rather than continuing to produce defective products.

Jidoka

Applied to AI coding: automated code generation is fine, but someone (or some mechanism) must check quality at every step. Not review everything after it's all done, but verify step by step as you go.

That's TDD. That's engineering thinking.
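The hook script from section 3 is one way to make jidoka executable. A minimal sketch in Python, assuming a hypothetical layout where each src/foo.py is paired with tests/test_foo.py (the pairing convention is my assumption, not a standard):

```python
import subprocess
import sys
from pathlib import Path

def staged_source_files() -> list[Path]:
    """Python files under src/ staged for the current commit."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [Path(p) for p in out.splitlines()
            if p.startswith("src/") and p.endswith(".py")]

def missing_tests(files: list[Path]) -> list[Path]:
    """Source files whose expected companion test file does not exist."""
    return [f for f in files if not Path("tests", f"test_{f.name}").exists()]

def main() -> int:
    missing = missing_tests(staged_source_files())
    if missing:
        for f in missing:
            print(f"jidoka: no test found for {f}", file=sys.stderr)
        return 1  # non-zero exit blocks the commit: stop the line
    return 0
```

Invoked from .git/hooks/pre-commit (exiting with main()'s return value), it refuses the commit until the test exists. The specific convention matters less than the principle: verify at each step, instead of reviewing a mountain of generated code at the end.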

8. AI Won't Replace Engineers — It Will Eliminate Non-Engineers

Will AI replace programmers?

The 2025 DORA report offers an interesting finding [4]: AI's primary role is as an "amplifier" — amplifying an organization's existing strengths and weaknesses. Good teams use AI to get better; bad teams use AI to get worse. The report explicitly states that the greatest return on AI investment comes not from the tools themselves, but from strategic attention to underlying organizational systems.

The job market isn't shrinking either. The Bureau of Labor Statistics projects 15% employment growth for software developers from 2024-2034 [10]. Germany's Bitkom association's 2025 survey shows 42% of companies believe AI will actually increase demand for IT specialists [11].

Remember the ATM story? When ATMs appeared in the 1970s, many predicted bank tellers would disappear. Instead, teller numbers grew from 300,000 to 600,000. ATMs lowered branch operating costs, banks opened more branches, and tellers' work shifted from counting cash to financial consulting.

AI's impact on engineers will likely be similar: not eliminating engineers, but redefining what it means to be one.

The programmer of the past: primarily writing code, occasionally doing design.

The engineer of now: designing systems, managing AI toolchains, ensuring quality, building constraint mechanisms. Writing code has actually become the least important part.

Those who can only write code — programmers who were accepted by the market in the past because "coding was the bottleneck" — will indeed face harder times. Not replaced by AI, but eliminated by bottleneck migration. When coding is no longer the bottleneck, those who can only code lose their scarcity.

Real engineers — those who care about quality, pay attention to detail, and think systematically — will become more valuable. AI generates orders of magnitude more code than humans; someone needs to ensure it's reliable.

9. Conclusion

AI is the first tool in human history that can build other tools. That's remarkable.

But a tool is still a tool. A good hammer doesn't know which nail to hit. A good lathe doesn't decide which part to machine. A good AI doesn't know which quality standards to prioritize.

For the past thirty years, "being able to code" was a scarce skill, and the entire industry was built on that scarcity. Similarly, "being able to write copy," "being able to do data analysis," and "being able to make presentations" were all valuable currencies in the workplace.

Now AI is eliminating these execution-level scarcities. Code is no longer the bottleneck. Copywriting is no longer the bottleneck. Data processing is no longer the bottleneck.

But engineering has always been the bottleneck. It was just hidden behind the execution bottleneck.

"Engineering" here doesn't mean just software engineering — it means all work that requires systematic thinking, quality consciousness, and professional judgment.

What the AI era needs most isn't people who can talk to AI — that bar is low. What it needs most are people who can judge the quality of AI output, design constraint mechanisms, think at the systems level, and demand excellence in every detail.

The AI era doesn't mean we no longer need engineers. It means we finally need real ones.


References

  1. Fred Brooks, "No Silver Bullet — Essence and Accident in Software Engineering", 1986 — https://en.wikipedia.org/wiki/No_Silver_Bullet
  2. Andrej Karpathy, "Software 3.0", Y Combinator AI Startup School, 2025 — https://www.latent.space/p/s3
  3. Stack Overflow Developer Survey, 2024 — https://survey.stackoverflow.co/2024/
  4. Google DORA Report, "Accelerate State of DevOps", 2025 — https://dora.dev/research/2025/dora-report/
  5. GitClear, "AI Code Quality 2025 Research", 2025 — https://www.gitclear.com/ai_assistant_code_quality_2025_research
  6. CodeRabbit, "State of AI vs Human Code Generation Report", 2025 — https://www.coderabbit.ai/blog/state-of-ai-vs-human-code-generation-report
  7. Tihanyi et al., "The FormAI Dataset: Generative AI in Software Security Through the Lens of Formal Verification", PROMISE 2023 — https://arxiv.org/abs/2307.02192
  8. CISQ, "The Cost of Poor Software Quality in the US", 2022 — https://www.it-cisq.org/the-cost-of-poor-quality-software-in-the-us-a-2022-report/
  9. CAST, "Coding in the Red: The State of Global Technical Debt", 2025 — https://www.castsoftware.com/news/coding-in-the-red-the-state-of-global-technical-debt
  10. US Bureau of Labor Statistics, "Occupational Outlook: Software Developers", 2024-2034 — https://www.bls.gov/ooh/computer-and-information-technology/software-developers.htm
  11. Bitkom, "Künstliche Intelligenz in Deutschland", 2025 — https://www.bitkom.org/sites/main/files/2026-02/bitkom-studienbericht-ki.pdf
  12. Toyota Motor Corporation, "Toyota Production System" — https://global.toyota/en/company/vision-and-philosophy/production-system/
  13. Mata v. Avianca, ChatGPT fabricated case citations incident, 2023 — https://en.wikipedia.org/wiki/Mata_v._Avianca,_Inc.