AI Rebuilt Next.js in a Week for $1,100

Josef Holm | 9 min read | March 29, 2026

Key Takeaways

Cloudflare rebuilt Next.js in about a week for $1,100 using Claude Code, and it is already running in production on government websites.
The rebuild only worked because Next.js has a thorough open-source test suite. The AI iterated against automated validation criteria, not human intuition.
Domain expertise still matters: the engineer directing the AI had deep Next.js internals knowledge. The AI did the coding; the human did the judgment.
Your test suite is now a strategic asset in two directions. It lets you build faster with AI, and it lets competitors rebuild your product faster against you.
The question for business leaders is not whether to use AI for software development. It is whether you have the validation infrastructure and domain expertise to use it well.

Most software won't be written. It'll be grown.

That's the real takeaway from Cloudflare's decision to rebuild Next.js using Claude Code in about a week for $1,100 in API tokens. Not the competitive drama between Cloudflare and Vercel. Not the security vulnerabilities found within 24 hours. Not even the fact that it claims to run four times faster than the original.

The real story is what this tells us about how complex software gets built from here.

What actually happened?

Cloudflare engineer Steve Faulkner led a project called V-Next, a ground-up rebuild of Next.js on top of Vite. The entire thing was generated by Claude Code, Anthropic's agentic coding tool. No human reviewed the generated code. The team didn't just admit this; they bragged about it.

It's already running in production. Including on government websites.

Cloudflare's CTO Dane Knecht announced the release on February 24, 2026, calling it "Next.js Liberation Day." That framing tells you everything about how Cloudflare sees this move. Not as an experiment. As a strategic weapon.

The interesting part isn't the drama. It's the method.

Why was this even possible?

Here's what most people miss when they hear "AI rebuilt a major framework in a week." They assume the AI just knew how to do it. That's not what happened.

Next.js has an extensive, open-source test suite. Thousands of end-to-end tests covering every feature and edge case. That test suite is what made this possible.

LLMs don't inherently know if their output is correct. What they're good at is searching a probabilistic solution space when you give them automated validation criteria. The test suite served exactly that purpose. Generate code, run it against the tests, see what failed, adjust, iterate. Over and over at machine speed until everything passed.
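To make that loop concrete, here is a minimal sketch in Python. It is an illustration under loud assumptions, not Cloudflare's actual tooling: the three toy tests stand in for the Next.js end-to-end suite, and `propose_candidates` is a hypothetical stand-in for the model, with its successive attempts hard-coded. A real agent would read the failure list before writing the next attempt.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Candidate = Callable[[str], str]

# Each test is (name, input, expected output). These are toy stand-ins for a
# real end-to-end suite; the names and cases are invented for illustration.
TESTS = [
    ("lowercases", "Hello", "hello"),
    ("replaces spaces", "a b", "a-b"),
    ("trims whitespace", "  hi  ", "hi"),
]


def run_tests(candidate: Candidate) -> List[str]:
    """Return the names of the tests the candidate fails (empty = all pass)."""
    failures = []
    for name, given, expected in TESTS:
        try:
            passed = candidate(given) == expected
        except Exception:
            passed = False  # a crash counts as a failure
        if not passed:
            failures.append(name)
    return failures


def propose_candidates() -> Iterable[Candidate]:
    """Hypothetical stand-in for the model: yields successive attempts."""
    yield lambda s: s
    yield lambda s: s.lower()
    yield lambda s: s.lower().replace(" ", "-")
    yield lambda s: s.strip().lower().replace(" ", "-")


def search(max_attempts: int = 10) -> Tuple[Optional[Candidate], List[List[str]]]:
    """Generate, validate, repeat. Stop at the first candidate that passes."""
    history: List[List[str]] = []
    for attempt, candidate in enumerate(propose_candidates(), start=1):
        if attempt > max_attempts:
            break
        failures = run_tests(candidate)
        history.append(failures)
        if not failures:
            return candidate, history
    return None, history


if __name__ == "__main__":
    winner, history = search()
    for attempt, failures in enumerate(history, start=1):
        print(attempt, failures or "all tests pass")
```

In this toy run the failure count shrinks from three to zero over four attempts. The loop itself never judges whether the code is good. It only knows what the tests report, which is why the quality of the suite decides the quality of the result.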

This is the same pattern behind OpenAI's claimed physics breakthroughs. The model isn't reasoning from first principles. It's generating candidates and checking them against known criteria. When the criteria are clear and automated, this works shockingly well.

A few other things made V-Next feasible. Next.js has extensive public documentation. Years of Stack Overflow answers and tutorials exist in the training data. And Faulkner had deep knowledge of Next.js internals, which let him steer the AI agent effectively.

That last point matters more than most coverage acknowledged. This wasn't some random developer pointing Claude at a repo and saying "rebuild this." Domain expertise was required to direct the process. The AI did the coding. A human who deeply understood the problem space did the thinking.

One finding from the rebuild is worth sitting with: the original Next.js codebase was large partly because of abstraction layers that humans added to help themselves reason about complex problems. The AI didn't need those layers. It generated more direct solutions, producing a much smaller codebase. Humans build scaffolding to think. AI doesn't need scaffolding to execute.

Is this the future or is this reckless?

Both. That's what makes it worth paying attention to.

The case against V-Next is straightforward. Cloudflare shipped code they don't fully understand. They can't provide traditional security guarantees. The code has limited open-source contribution value because no human can meaningfully read and maintain it in the conventional sense. Within a day of the announcement, Vercel's CEO Guillermo Rauch disclosed that his team had found two critical, two high, two medium, and one low severity vulnerabilities in V-Next. Vercel even noted that Cloudflare's own bug bounty program would pay them for the discoveries.

That's a real critique. Shipping production code to government websites that you can't fully audit should make people uncomfortable.

The case in favor is equally real. Next.js had effectively locked many companies and developers into Vercel's hosting platform. The framework technically runs anywhere, but it's heavily tuned for Vercel's infrastructure. Cloudflare and Netlify had been co-sponsoring a project called OpenNext for years, trying to make Next.js portable. It was a constant uphill battle because Next.js updates faster than third parties can keep up.

V-Next solves that problem in a way that would have been practically impossible using traditional development methods. It creates genuine platform choice where there was effectively a soft lock-in.

So which framing is right? Irresponsible, or liberating?

I think it's both. And that's the pattern we need to get comfortable with.

What does the Vercel CEO's own words tell us?

This is the part that's almost too perfect.

Guillermo Rauch has been one of the most vocal advocates for AI's impact on software development. Some of his public statements from the past year or so:

"All software will be generative and generated. Adjust because of this."
"Software engineering will be completely unrecognizable in 5 years, likely less."
"There are no limits anymore. Anyone can do anything. The only limiting factors are agency and ambition."

He even proposed a "Turing test" for AI that included challenges like rewriting the TypeScript compiler in Rust with better performance. That's structurally very similar to what Cloudflare just did to his own product.

When you champion a revolution, you don't get to choose who it disrupts. Rauch spent years arguing that AI would rewrite how software gets built. Then someone used AI to rewrite his software. The logic was always going to arrive at his door. It just arrived faster than he expected.

There's a lesson here that goes well beyond Cloudflare and Vercel. If you're building a business on the assumption that AI changes everything, you need to be honest about what "everything" includes. It includes your own moat.

Will open-source projects start hiding their tests?

Here's a detail most coverage of this story missed entirely.

One week before Liberation Day, on February 18th, Faulkner tweeted a prediction: "This year we'll see some open source projects make their test suites private, like SQLite."

He was almost certainly mid-project when he posted that. He knew the open-source test suite was the exact thing making his rebuild possible. And he was predicting, accurately I think, that other projects would learn the lesson and lock their tests down.

SQLite already does this. The source code is open. The test suite is proprietary. That's a pre-AI decision that now looks prescient.

What does open source actually mean if the tests are private? Community maintainability drops dramatically. If the whole point of open source is that others can contribute, verify, and build on your work, private tests undercut that purpose. But if the test suite is what makes your project trivially cloneable by AI, the incentive to lock it down is strong.

There's no clean answer. Every open-source maintainer should be asking: what are the strategic assets in my project, and what happens when AI can use them against me? That's a new question. It needs new thinking.

What's the real pattern here?

Let me step back from the Cloudflare/Vercel drama and talk about what this actually means for how businesses should think about software.

When you have clear, automated validation criteria and a well-documented problem domain, AI can generate working solutions at a speed and cost that makes traditional development look absurd. One week. $1,100. A claimed fourfold speedup over the original.

But the generated code is effectively a black box. No human reviewed it. No human fully understands it. It passes the tests, but that guarantee is only as good as the tests themselves.

This maps to something much bigger than one framework rebuild.

Think about how biological evolution works. Nature doesn't "understand" how to build a hand or a brain. It tries variations and keeps what passes the test of natural selection. The result is incomprehensibly complex, layered, messy code that nonetheless produces extraordinary capability. Every living organism is, in a meaningful sense, vibe coded. No designer. No architect. Just selection pressure and iteration.

The argument some are making, and I think there's real weight to it, is that the only way humans build increasingly complex software from here is to stop writing it by hand. Establish the automated test criteria. Let AI search for solutions through mutation and iteration. The results are complex in ways that would be impossible to engineer manually. But they work.

Someone coined the term "synthetic software" for this. Code that no human has ever seen inside of. Treated as a black box that either fits its required interface or doesn't.
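One way to picture "fits its required interface or doesn't" is a differential check: run a trusted reference and the black box on the same inputs and compare only the outputs. The sketch below is a hypothetical illustration, not anything from the V-Next project. `conforms` and `generated_sort` are invented names, and Python's built-in `sorted` plays the reference.

```python
from typing import Any, Callable, Iterable, List, Tuple


def conforms(
    reference: Callable[[Any], Any],
    candidate: Callable[[Any], Any],
    inputs: Iterable[Any],
) -> List[Tuple[Any, Any, Any]]:
    """Return (input, expected, actual) for every input where outputs differ.

    The candidate is never inspected, only called. An empty list means it
    matched the reference on every input that was tried.
    """
    mismatches = []
    for value in inputs:
        expected = reference(value)
        try:
            actual = candidate(value)
        except Exception as exc:
            actual = repr(exc)  # record a crash as a mismatch
        if actual != expected:
            mismatches.append((value, expected, actual))
    return mismatches


def generated_sort(items: List[int]) -> List[int]:
    """Hypothetical black box with a hidden flaw: it silently drops duplicates."""
    return sorted(set(items))


WEAK_INPUTS = [[3, 1, 2], []]
STRONG_INPUTS = [[3, 1, 2], [], [2, 2, 1]]

if __name__ == "__main__":
    print(conforms(sorted, generated_sort, WEAK_INPUTS))
    print(conforms(sorted, generated_sort, STRONG_INPUTS))
```

With the weak inputs the flawed black box looks perfect. Only the input containing duplicates exposes it. The code under test is identical in both runs; what changed is the criteria.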

Should leaders be worried or excited?

Both. Here's how I'd break it down.

If you're a business leader watching this, the first-order question is simple: what software in my stack could be rebuilt this way? If you're running on a framework or platform with complete test coverage and extensive documentation, the answer might be "more than you think." That's exciting if you're the one doing the rebuilding. It's threatening if you're the one being rebuilt.

The second-order question is harder. What does it mean to run production systems on code no human understands?

I've spent 25 years building and operating technology businesses. My honest view is that we'll get comfortable with this faster than most people expect. We already run production systems on code that very few humans understand. Large legacy codebases, heavily abstracted microservice architectures, auto-generated configuration layers. The gap between "no human reviewed this" and "three humans sort of understand this but they left the company" is smaller than it sounds.

The real risk isn't that AI-generated code is uniquely dangerous. It's that organizations will skip the validation step. The test suite is what makes this work. Without rigorous automated criteria, you're generating plausible-looking garbage at scale. That's worse than writing bad code by hand, because at least with hand-written bad code, someone knows where the bodies are buried.

Here's what I'd tell any executive asking me about this:

Your test coverage is now a strategic asset. Not just for quality assurance. For your ability to use AI effectively, and for your vulnerability to competitors who might use AI against you.
Domain expertise becomes more valuable, not less. Faulkner couldn't have done this without deep knowledge of Next.js internals. The AI did the labor. The human did the judgment. That ratio is the future.
Platform lock-in is weaker than it's ever been. If a competitor can rebuild your framework compatibility layer in a week, the switching cost you thought protected your business just evaporated.
"We don't understand the code" is going to become normal. The question isn't whether to accept that. It's how to build confidence through testing, monitoring, and containment rather than through human code review.

What does this mean for how we work with companies?

At Holm Intelligence Partners, a big part of what we do is help leadership teams figure out where AI actually changes their operational reality versus where it's just noise. This Cloudflare story is a perfect example of both.

The noise is the competitive drama. Cloudflare trolling Vercel. Vercel finding bugs and collecting bounties. Twitter taking sides.

The signal is that the cost of rebuilding software just dropped by orders of magnitude for certain categories of problems. That changes build-vs-buy calculations. It changes platform strategy. It changes how you think about technical moats.

If you're trying to figure out what this shift means for your specific business, that's the kind of work we do in our AI Operating Review. Not abstract strategy. Practical assessment of where AI changes your competitive position and what to do about it.

Where does this leave us?

Software is going to be increasingly generated, not written. The quality of that generation depends entirely on the quality of the criteria you give it. Test suites, specifications, validation frameworks. These become the real intellectual property.

The people who thrive won't be the ones who write the most code. They'll be the ones who define the right criteria and apply the right judgment to steer AI agents toward useful outcomes.

Cloudflare didn't prove that AI can replace software engineers. They proved that a skilled engineer with AI tools can do in one week what would have taken a team years. That's a different and much more important statement.

The question for every business leader is no longer "should we use AI for software development?" It's "do we have the validation infrastructure and domain expertise to use it well?"

If you don't, start building it. That's where the real advantage lives now.

Frequently Asked Questions

How did Cloudflare rebuild Next.js so fast with AI?

Next.js has thousands of open-source end-to-end tests. Claude Code used that test suite as automated validation criteria: generate code, run the tests, fix what fails, repeat at machine speed. The test suite did the work that human code review normally does. Without it, the rebuild would not have been possible.

Is AI-generated production code actually safe to ship?

It depends entirely on your validation layer. Cloudflare shipped code no human reviewed, and Vercel found critical vulnerabilities within 24 hours. The risk is not that AI-generated code is uniquely dangerous. The risk is skipping rigorous automated testing. If your criteria are solid, the output can work. If your criteria are weak, you are generating plausible-looking garbage at scale.

Does this mean software engineers are being replaced?

No. Cloudflare proved that a skilled engineer with AI tools can do in one week what would have taken a team years. The engineer directing V-Next had deep Next.js internals knowledge. Without that judgment, the AI goes nowhere useful. Domain expertise becomes more valuable, not less. The ratio shifts: less coding, more thinking.

What does this mean for platform lock-in and vendor strategy?

Platform lock-in just got weaker. Next.js was effectively tuned for Vercel's infrastructure. Cloudflare and Netlify spent years trying to fix that through OpenNext and could not keep pace. V-Next solved it in a week. If a competitor can rebuild your compatibility layer that fast, the switching cost you thought protected your business is thinner than you assumed.

Why would open-source projects hide their test suites?

Because the test suite is what made Next.js cloneable in a week. The engineer who led V-Next predicted publicly that projects would start making their tests private, like SQLite already does. If your test suite gives competitors a ready-made validation framework to clone your project with AI, locking it down is a rational defensive move. The tradeoff is that it also hurts community contribution.

What should a business leader actually do in response to this?

Three things. First, audit your test coverage, because it is now a strategic asset for both offense and defense. Second, invest in domain expertise, since human judgment is what steers AI agents toward useful outcomes. Third, rethink your platform assumptions, because build-vs-buy math and switching costs have shifted significantly for many software categories.