There's a problem the AI labs don't put on the keynote slides. The big language models are running low on fuel.
They learned to write by reading almost everything humans have ever published. Books, code, forums, the whole internet. That well is close to dry. You can't double the size of human writing on demand, and the easy answer - train models on text other models wrote - quietly poisons them. Quality degrades. The industry has a name for it now: model collapse. (I unpacked that failure mode here → Habsburg Jaw in making - Model Collapse in AI.)
So the question that actually decides the next decade isn't "how much bigger can the models get." It's "where does the next training data come from when the internet runs out?"
The most serious answer going right now is this: stop collecting data. Start generating experience. That's what world models do.
A different kind of model
A language model predicts the next word. A world model predicts the next state of an environment: what happens when you turn the wheel, drop the glass, brake on ice. It learns physics and cause-and-effect, then lets a machine rehearse actions inside its own simulation before doing them for real.
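To make "predict the next state, then rehearse" concrete, here is a deliberately tiny sketch in Python. Everything in it is an assumption made for illustration: the one-number "environment" (a car's speed under braking), the function names, and the linear model fitted by least squares. Real world models learn far richer dynamics from video and sensor data; this only shows the shape of the loop.

```python
import random

def real_step(speed, brake, drag=0.98, grip=4.0):
    """Hypothetical 'real world': speed after one time step of braking."""
    return max(0.0, drag * speed - grip * brake)

def collect_experience(n, rng):
    """Log (speed, brake, next_speed) transitions from the real environment."""
    data = []
    for _ in range(n):
        speed = rng.uniform(10.0, 30.0)
        brake = rng.uniform(0.0, 1.0)
        data.append((speed, brake, real_step(speed, brake)))
    return data

def fit_world_model(data):
    """Least-squares fit of next_speed ~ a * speed + b * brake (2x2 normal equations)."""
    sss = sum(s * s for s, _, _ in data)
    sbb = sum(b * b for _, b, _ in data)
    ssb = sum(s * b for s, b, _ in data)
    ssy = sum(s * y for s, _, y in data)
    sby = sum(b * y for _, b, y in data)
    det = sss * sbb - ssb * ssb
    a = (ssy * sbb - sby * ssb) / det
    b = (sby * sss - ssy * ssb) / det
    return a, b

def imagine_stopping_distance(model, speed, brake, dt=0.1, max_steps=500):
    """Roll the learned model forward (no real driving) until the car stops."""
    a, b = model
    distance = 0.0
    for _ in range(max_steps):
        if speed <= 0.0:
            break
        distance += speed * dt
        speed = max(0.0, a * speed + b * brake)
    return distance

def choose_brake(model, speed, gap, candidates=(0.2, 0.4, 0.6, 0.8, 1.0)):
    """Rehearse each option in imagination; pick the gentlest one that stops in time."""
    for brake in candidates:  # ordered from gentle to hard
        if imagine_stopping_distance(model, speed, brake) <= gap:
            return brake
    return candidates[-1]  # nothing stops in time: brake as hard as possible

if __name__ == "__main__":
    model = fit_world_model(collect_experience(200, random.Random(0)))
    print("learned (a, b):", model)
    print("chosen brake level:", choose_brake(model, speed=20.0, gap=30.0))
```

The point of the sketch is the order of operations: the machine touches the real environment only to collect experience, and every candidate action after that is tried inside the learned model first.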
The idea is older than the hype. David Ha and Jürgen Schmidhuber published a paper called "World Models" back in 2018, in which an agent learned to play a game inside its own dreamed-up version of it. What's new is that the models finally got good enough to matter outside a lab, and that the data wall gave everyone an urgent reason to care.
2026 is when it left the lab
Look at what shipped in the last six months, because the timing isn't a coincidence:
🔹 Google DeepMind released Project Genie to the public in January. Type a sentence, walk around a playable 3D world in real time. By May, it connected to Street View.
🔹 Nvidia launched Cosmos 3 on June 1 - an open model built specifically to train robots and self-driving cars inside generated worlds, with a coalition of robotics companies around it.
🔹 Waymo built its own world model in February to create the dangerous driving situations it can't safely film on real roads. Wayve shipped GAIA-3 for the same reason.
🔹 Fei-Fei Li's World Labs opened Marble to the public. Odyssey raised $310M.
And the loudest signal: Yann LeCun bet his next chapter on the claim that scaling language models is a dead end, and that models which learn the structure of the world are the real path forward.
These aren't six companies chasing a demo. They're six answers to the same shortage.
Why this is a business story, not a robotics one
Here's the move that matters for anyone running a company. Real-world data is slow, expensive, and often impossible to gather - a billion driving miles, a million robot grasps, the rare disaster you can't stage. A world model lets you manufacture that experience. Simulate the edge cases by the million, overnight, for the cost of compute instead of years and lives.
If that sounds abstract, connect it to something you already use: the digital twin. A digital twin tells you what your factory or supply chain looks like right now. A world model is the layer that predicts what it does next - so you can test a change a thousand ways before spending a dollar in the real world.
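Here is a rough sketch of what "test a change a thousand ways" can look like in code. The factory, its numbers (failure rate, output per machine, demand), and the function names are all made up for illustration, and the simulator is hand-written where a real world model would learn these dynamics from your own operational data.

```python
import random
import statistics

def simulate_day(machines, rng, failure_rate=0.1,
                 units_per_machine=100, mean_demand=450.0, demand_sd=60.0):
    """One imagined day on a hypothetical line: returns units of unmet demand."""
    # Each machine independently fails for the day with probability failure_rate.
    working = sum(1 for _ in range(machines) if rng.random() > failure_rate)
    capacity = working * units_per_machine
    demand = max(0.0, rng.gauss(mean_demand, demand_sd))
    return max(0.0, demand - capacity)

def rehearse(machines, runs=1000, seed=0):
    """Play the same decision out over many simulated days and summarize."""
    rng = random.Random(seed)
    shortfalls = [simulate_day(machines, rng) for _ in range(runs)]
    return {
        "mean_shortfall": statistics.mean(shortfalls),
        "share_of_days_short": sum(1 for s in shortfalls if s > 0) / runs,
    }

if __name__ == "__main__":
    # Compare today's setup with a proposed change before buying anything.
    print("5 machines:", rehearse(machines=5))
    print("6 machines:", rehearse(machines=6))
```

The digital twin supplies the starting point (five machines, today's demand). The rehearsal loop is the part that answers "what happens if we change it?"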
That's why this spreads well past cars and robots: manufacturing lines, warehouse logistics, supply-chain shocks, financial scenarios, anywhere with humanoids or autonomy on the roadmap. The common thread is rehearsal - deciding after you've seen it play out, not before.
The divide it creates
For three years, the AI advantage was about generating content faster. The next advantage is quieter and harder to copy: generating experience. The companies that learn to simulate their own reality will train, test, and decide faster than the ones still waiting to collect data from the real one. That gap compounds the way cloud and data infrastructure did - invisibly, until a competitor is simply moving at a speed you can't match and you can't quite explain why.
LLMs hit a data wall. World models walk around it by building their own.
So the question worth sitting with: when the easy data runs out, will your company still be waiting to collect more - or generating what it needs?




