Friday, March 20, 2026

From Speed to Endurance: How the AI Race is Changing Gears

 

Compute, energy, and talent are the major bottlenecks in the AI race, and they are affecting almost every economy in the world. These bottlenecks are no longer mere inputs - they are strategic choke points.

A. Bottlenecks

1. Compute: scale is becoming geopolitical

Compute is currently the most acute constraint. The leading AI labs are in an arms race for GPU clusters, and whoever can scale compute fastest gains a meaningful edge in training frontier models.

What’s happening

  • Frontier models now require tens to hundreds of thousands of GPUs.
  • Access is constrained by chip fabrication capacity, export controls, and vendor concentration in a few geographies and companies.
  • Training runs are measured in weeks of full-cluster time.
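To make that scale concrete, here is a rough back-of-envelope sketch of what a full-cluster training run implies in GPU-hours and sustained power draw. The cluster size, run duration, and per-GPU wattage are illustrative assumptions, not figures from any specific lab.

```python
# Back-of-envelope sketch of a frontier training run's footprint.
# All input figures are illustrative assumptions, not vendor data.

def training_run_footprint(num_gpus: int, weeks: float, watts_per_gpu: float):
    """Return (gpu_hours, average_megawatts, megawatt_hours) for a run."""
    hours = weeks * 7 * 24
    gpu_hours = num_gpus * hours
    megawatts = num_gpus * watts_per_gpu / 1e6   # sustained instantaneous draw
    megawatt_hours = megawatts * hours           # total energy over the run
    return gpu_hours, megawatts, megawatt_hours

# Assumed: 50,000 GPUs running for 8 weeks at ~1,000 W each (GPU plus overhead).
gpu_hours, mw, mwh = training_run_footprint(50_000, 8, 1_000)
print(f"{gpu_hours:,.0f} GPU-hours, ~{mw:.0f} MW sustained, {mwh:,.0f} MWh total")
```

Even under these modest assumptions, a single run ties up tens of megawatts for weeks - which is why cluster time, not model ideas, is the scarce resource.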

How it affects the race

  • Winner-take-most dynamics: Massive compute requirements favor deep-pocketed players - hyperscalers (Microsoft, Google, Amazon) and well-funded labs (Anthropic, OpenAI, xAI). Startups without cloud partnerships are largely priced out of frontier training, so the biggest labs compound their advantages.
  • Hardware chokepoints: NVIDIA holds a near-monopoly on AI training chips. AMD, Intel, and custom silicon (Google's TPUs, Amazon's Trainium) are gaining ground but remain behind. Export controls on advanced chips to China are reshaping geopolitical competition, pushing China to accelerate domestic chip development (Huawei's Ascend line).
  • Drives a "Design vs. Manufacture" divide: The US excels at designing cutting-edge chips (NVIDIA's architecture), Taiwan's TSMC is the world's leading manufacturer, and the Netherlands holds the cards via ASML. China is trying to close this gap by investing billions into domestic fabrication (e.g., SMIC). The race is not just for more chips, but for control of the entire supply chain, from design software to fabrication to packaging.
  • Efficiency as a counter-move: Compute scarcity is also driving algorithmic efficiency - models like DeepSeek-R1 and Sarvam showed that smarter training approaches and efficient architectures can dramatically close the gap with far less compute.
  • Smaller players shift to fine-tuning, distillation, and model reuse.
  • Nations are treating compute as strategic infrastructure.

New reality

The question is no longer “who has the best model,” but “who can afford to run it continuously.”

2. Energy: intelligence is becoming a power problem

Energy is becoming the next critical bottleneck, especially as inference demands scale:

  • Data center power demand is projected to grow exponentially. Microsoft, Google, and Amazon are all striking long-term power purchase agreements and even restarting nuclear plants (e.g., Three Mile Island for Microsoft).
  • Geographic arbitrage: Labs and hyperscalers are establishing data centers near cheap power. Countries with abundant energy (Canada, Norway, Iceland, and the UAE) gain strategic relevance. The UAE may lose ground due to the Iran war, while India may gain due to its stability.
  • Grid constraints slow deployment: Even with capital, connecting a large data center to the grid can take 3–7 years due to permitting and infrastructure. This creates a hard ceiling on how fast anyone can scale, regardless of money.
  • Innovation in Cooling and Hardware: The bottleneck will drive rapid innovation in liquid cooling technologies and low-power chip designs. Companies that can reduce the wattage-per-token ratio will gain a massive competitive advantage in operational costs (OpEx).

What’s happening

  • Training + inference draw hundreds of megawatts.
  • Grid constraints delay data center buildouts.
  • Shortages of electrical components delay the buildout of the underlying infrastructure.
  • Energy cost volatility directly affects model pricing.
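That last point can be made concrete with a quick sketch of how the electricity price flows through to per-token serving cost. The energy-per-token figure and both electricity prices below are illustrative assumptions, not measurements from any provider.

```python
# Sketch: how electricity price flows through to per-token serving cost.
# The joules-per-token and $/kWh figures are illustrative assumptions.

def energy_cost_per_million_tokens(joules_per_token: float, usd_per_kwh: float) -> float:
    """Electricity cost (USD) to serve one million tokens."""
    kwh_per_token = joules_per_token / 3_600_000   # 1 kWh = 3.6e6 joules
    return kwh_per_token * usd_per_kwh * 1_000_000

# Assumed: ~2 J of inference energy per token, at two electricity prices.
cheap = energy_cost_per_million_tokens(2.0, 0.08)    # cheap-power region
pricey = energy_cost_per_million_tokens(2.0, 0.16)   # expensive-power region
print(f"${cheap:.3f} vs ${pricey:.3f} per million tokens")
```

The per-token number looks tiny, but it scales linearly with both volume and the electricity price - a doubling of power cost doubles the energy component of every token served, which is exactly why volatility feeds straight into model pricing.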

Consequences

  • AI development clusters where there is cheap, stable power, friendly regulators, and long-term energy contracts.
  • Nuclear, hydro, and dedicated renewables gain strategic importance.
  • Energy efficiency becomes a competitive advantage, not a sustainability side project.

Hard truth

You can’t scale intelligence faster than you can scale electricity.

3. Talent: scarcity is shifting from “brilliant” to “battle-tested”

Talent is the most durable and hardest-to-replicate bottleneck:

  • Extreme concentration: The number of researchers capable of training and improving frontier models is likely in the low thousands globally. Competition for this group is fierce, with compensation packages routinely exceeding $1–5M/year at top labs.
  • Geographic clustering: Most frontier AI talent is concentrated in a handful of metros - SF Bay Area, London, NYC, Seattle, Toronto, and Beijing. Immigration policy matters enormously; the US benefits from attracting global talent, but visa friction creates opportunities for Canada, the UK, and the UAE (this may not be the case after the Iran war).
  • China's talent dynamics: China has a large pool of ML engineers but faces a brain drain of top researchers to Western labs, while export controls limit access to the best hardware - a compounding disadvantage at the frontier.
  • Wage Inflation and Mobility: The scarcity of experts in deep learning, systems optimization, and RLHF (Reinforcement Learning from Human Feedback) will continue to drive salaries to unprecedented levels. This creates a "brain drain" from academia and smaller startups to big tech, though some talent may flock to well-funded, mission-driven startups offering equity and autonomy.
  • Talent as a moat softens over time: As tooling improves and models assist in their own development, the number of people who can do meaningful AI work is growing rapidly, gradually diffusing this advantage.

What’s happening

  • Breakthrough algorithms are increasingly open.
  • The real scarcity is people who can train at scale without blowing up clusters, debug silent failures, and ship reliable AI systems into production.

Impact

  • Talent concentrates where research is appreciated, compute is abundant, and real-world systems exist.
  • Smaller orgs struggle not to invent but to operationalize.

Shift in advantage

Systems engineers > research stars.

B. The bottlenecks interact (and compound)

  • Efficiency innovations disrupt the race: Every time someone finds a way to do more with less (e.g., DeepSeek's training efficiency, inference optimization), it partially resets the advantage that heavy compute spending confers.
  • The bottlenecks shift over time: Compute was the binding constraint in 2022–2024; energy looks set to take over in 2025–2028; data quality and algorithmic insight may dominate after that.
  • Nation-states are entering the field: The UAE (G42), Saudi Arabia (HUMAIN), France (Mistral with state backing), and India (Sarvam with state backing) are all trying to build sovereign AI capacity precisely because these bottlenecks create strategic dependencies they want to avoid.
  • Whoever solves energy fastest may win: If a lab or nation can reliably access 10–100 GW of power cheaply, the other bottlenecks become much more manageable - you can buy chips and hire talent if you have the infrastructure.
  • From Speed to Endurance: The early phase of the AI race was about who could iterate fastest. The next phase is about sustainability. Winners will be those who can secure long-term power contracts, maintain supply chains for chips, and retain key personnel over years, not just months.
  • Democratization via Specialization: While building foundation models will remain exclusive to the wealthy, the application layer may democratize. Smaller players will fine-tune existing open-weight models for niche verticals, bypassing the need for massive compute clusters and focusing instead on domain-specific talent and data.

These constraints reinforce each other:

  • Compute without energy is idle.
  • Energy without talent is wasted.
  • Talent without compute is frustrated.

Result

  • Vertical integration wins.
  • Partnerships replace pure competition.
  • Platform control matters more than algorithmic novelty.

Strategic outcomes of these bottlenecks

    a. Consolidation at the frontier

  • Few global labs push state-of-the-art.
  • Most innovation happens downstream (applications, agents, workflows).

    b. Regional AI blocs

  • AI capability clusters by energy availability, political alignment, and supply-chain access.

     c. Efficiency becomes the new arms race

  • Smaller, cheaper, faster models
  • Hardware-aware architecture
  • On-device and hybrid inference

     d. “AI nationalism” without borders

  • Governments exert influence via power policy, export controls, talent visas, and cloud regulation.

C. Who wins under bottleneck conditions?

Not necessarily who:

  • Has the best model
  • Publishes the most papers
  • Moves first

But who:

  • Allocates compute ruthlessly
  • Pairs AI growth with energy strategy
  • Builds organizations that can absorb talent

And who chooses where not to compete.

D. The core insight

The AI race is becoming an infrastructure race disguised as an innovation race.

The next decade will reward leaders who think like:

  • Grid planners
  • Supply-chain strategists
  • Systems operators

Not just technologists.
