Friday, March 20, 2026

From Speed to Endurance: How the AI Race is Changing Gears

 

Compute, Energy, and Talent are the major bottlenecks in the AI race, and they are affecting almost every economy in the world. These bottlenecks are no longer mere inputs - they are strategic choke points.

A. Bottlenecks

1. Compute: scale is becoming geopolitical

Compute is currently the most acute constraint. The leading AI labs are in an arms race for GPU clusters, and whoever can scale compute fastest gains a meaningful edge in training frontier models.

What’s happening

  • Frontier models now require tens to hundreds of thousands of GPUs.
  • Access is constrained by: Chip fabrication capacity, Export controls, and Vendor concentration in a few geographies and businesses
  • Training runs are measured in weeks of full-cluster time.

How it affects the race

  • Winner-take-most dynamics: Massive compute requirements favor deep-pocketed players -  hyperscalers (Microsoft, Google, Amazon) and well-funded labs (Anthropic, OpenAI, xAI). Startups without cloud partnerships are largely priced out of frontier training. The biggest labs compound advantages.
  • Hardware chokepoints: NVIDIA holds a near-monopoly on AI training chips. AMD, Intel, and custom silicon (Google's TPUs, Amazon's Trainium) are gaining ground but remain behind. Export controls on advanced chips to China are reshaping geopolitical competition, pushing China to accelerate domestic chip development (Huawei's Ascend line).
  • Drives a "Design vs. Manufacture" Divide: The US excels at designing cutting-edge chips (NVIDIA's architecture) while Taiwan's TSMC is the world's leading manufacturer. The Netherlands holds the cards via ASML. China is trying to close this gap by investing billions into domestic fabrication (e.g., SMIC). The race is not just for more chips, but for control of the entire supply chain, from design software to fabrication to packaging.
  • Efficiency as a counter-move: Compute scarcity is also driving algorithmic efficiency - models like DeepSeek-R1 & Sarvam showed that smarter training approaches and efficient architecture can dramatically close the gap with less compute.
  • Smaller players shift to fine-tuning, distillation, and model reuse.
  • Nations are treating compute as strategic infrastructure.

New reality

The question is no longer “who has the best model,” but “who can afford to run it continuously.”

2. Energy: intelligence is becoming a power problem

Energy is becoming the next critical bottleneck, especially as inference demands scale:

  • Data center power demand is projected to grow exponentially. Microsoft, Google, and Amazon are all striking long-term power purchase agreements and even restarting nuclear plants (e.g., Three Mile Island for Microsoft).
  • Geographic arbitrage: Labs and hyperscalers are establishing data centers near cheap power. Countries with abundant energy (Canada, Norway, Iceland, and the UAE) gain strategic relevance. The UAE may lose ground due to the Iran war, while India may gain due to its relative stability.
  • Grid constraints slow deployment: Even with capital, connecting a large data center to the grid can take 3–7 years due to permitting and infrastructure. This creates a hard ceiling on how fast anyone can scale, regardless of money.
  • Innovation in Cooling and Hardware: The bottleneck will drive rapid innovation in liquid cooling technologies and low-power chip designs. Companies that can reduce the wattage-per-token ratio will gain a massive competitive advantage in operational costs (OpEx).

What’s happening

  • Training + inference draw hundreds of megawatts.
  • Grid constraints delay data center buildouts.
  • Shortages of electrical components delay the buildout of the underlying infrastructure.
  • Energy cost volatility directly affects model pricing.

Consequences

  • AI development clusters around: Cheap, stable power, friendly regulators, long-term energy contracts
  • Nuclear, hydro, and dedicated renewables gain strategic importance.
  • Energy efficiency becomes a competitive advantage, not a sustainability side project.

Hard truth

You can’t scale intelligence faster than you can scale electricity.

3. Talent: scarcity is shifting from “brilliant” to “battle-tested”

Talent is the most durable and hardest-to-replicate bottleneck:

  • Extreme concentration: The number of researchers capable of training and improving frontier models is likely in the low thousands globally. Competition for this group is fierce, with compensation packages routinely exceeding $1–5M/year at top labs.
  • Geographic clustering: Most frontier AI talent is concentrated in a handful of metros - SF Bay Area, London, NYC, Seattle, Toronto, and Beijing. Immigration policy matters enormously; the US benefits from attracting global talent, but visa friction creates opportunities for Canada, the UK, and the UAE (this may not be the case after the Iran war).
  • China's talent dynamics: China has a large pool of ML engineers but faces a brain drain of top researchers to Western labs, while export controls limit access to the best hardware - a compounding disadvantage at the frontier.
  • Wage Inflation and Mobility: The scarcity of experts in deep learning, systems optimization, and RLHF (Reinforcement Learning from Human Feedback) will continue to drive salaries to unprecedented levels. This creates a "brain drain" from academia and smaller startups to big tech, though some talent may flock to well-funded, mission-driven startups offering equity and autonomy.
  • Talent as a moat softens over time: As tooling improves and models assist in their own development, the number of people who can do meaningful AI work is growing rapidly, gradually diffusing this advantage.

What’s happening

  • Breakthrough algorithms are increasingly open.
  • The real scarcity is people who can: Train at scale without blowing up clusters, debug silent failures, ship reliable AI systems into production

Impact

  • Talent concentrates where: Research is appreciated, Compute is abundant, Real-world systems exist
  • Smaller orgs struggle not to invent but to operationalize.

Shift in advantage

Systems engineers > research stars.

B. The bottlenecks interact (and compound)

  • Efficiency innovations disrupt the race: Every time someone finds a way to do more with less (e.g., DeepSeek's training efficiency, inference optimization), it partially resets the advantage that heavy compute spending confers.
  • The bottlenecks shift over time: Compute was the binding constraint in 2022–2024; energy looks set to take over in 2025–2028; data quality and algorithmic insight may dominate after that.
  • Nation-states are entering the field: The UAE (G42), Saudi Arabia (HUMAIN), France (Mistral with state backing), and India (Sarvam with state backing) are all trying to build sovereign AI capacity precisely because these bottlenecks create strategic dependencies they want to avoid.
  • Whoever solves energy fastest may win: If a lab or nation can reliably access 10–100 GW of power cheaply, the other bottlenecks become much more manageable - you can buy chips and hire talent if you have the infrastructure.
  • From Speed to Endurance: The early phase of the AI race was about who could iterate fastest. The next phase is about sustainability. Winners will be those who can secure long-term power contracts, maintain supply chains for chips, and retain key personnel over years, not just months.
  • Democratization via Specialization: While building foundation models will remain exclusive to the wealthy, the application layer may democratize. Smaller players will fine-tune existing open-weight models for niche verticals, bypassing the need for massive compute clusters and focusing instead on domain-specific talent and data.

These constraints reinforce each other:

  • Compute without energy is idle.
  • Energy without talent is wasted.
  • Talent without compute is frustrated.

Result

  • Vertical integration wins.
  • Partnerships replace pure competition.
  • Platform control matters more than algorithmic novelty.

Strategic outcomes of these bottlenecks

    a. Consolidation at the frontier

  • Few global labs push state-of-the-art.
  • Most innovation happens downstream (applications, agents, workflows).

    b. Regional AI blocks

  • AI capability clusters by: Energy availability, Political alignment, Supply chain access

     c. Efficiency becomes the new arms race

  • Smaller, cheaper, faster models
  • Hardware-aware architecture
  • On-device and hybrid inference

     d. “AI nationalism” without borders

  • Governments influence via: Power policy, Export controls, Talent visas, Cloud regulation

C. Who wins under bottleneck conditions?

Not necessarily who:

  • Has the best model
  • Publishes the most papers
  • Moves first

But who:

  • Allocates compute ruthlessly
  • Pairs AI growth with energy strategy
  • Builds organizations that can absorb talent

And, just as important, who chooses where not to compete.

D. The core insight

The AI race is becoming an infrastructure race disguised as an innovation race.

The next decade will reward leaders who think like:

  • Grid planners
  • Supply-chain strategists
  • Systems operators

Not just technologists.

Sunday, March 8, 2026

Steering Large Language Models: Shaping Behavior at Inference Time

 

Introduction

The dominant paradigm for customizing a Large Language Model (LLM) has long revolved around two techniques:

·       fine-tuning (retraining the model on new data to alter its weights) and

·       prompt engineering (carefully crafting inputs to elicit desired outputs).

Both have proven powerful, but both carry meaningful limitations. Fine-tuning is expensive, requires labeled data, and can cause catastrophic forgetting. Prompt engineering is brittle, easily bypassed, and opaque - it shapes what the model sees, not what it fundamentally does.

A third approach has emerged that operates differently from either: activation steering, also called representation engineering or simply steering. Rather than changing the model's weights or its inputs, steering intervenes directly in the model's internal computational state during a forward pass. It reaches inside the model's residual stream - the flowing, high-dimensional representation of meaning that passes between transformer layers - and nudges it in a direction associated with a target concept, behavior, or personality trait.

The result is a model whose outputs are shaped from within, in real time, without any modification to its parameters and without any mention of the intended behavior in the prompt.

Steering operates directly on the model’s hidden states or activations.

·       You’re not changing what the model knows.

·       You’re influencing how it uses what it knows.

Steering is especially powerful when:

  • You want real-time personality switching
  • You need strong behavioral guarantees
  • You want low-latency adjustments
  • You must avoid retraining for compliance reasons
  • You operate at scale with shared base models

In short: steering enables programmable cognition.

How LLMs Represent Meaning Internally

To understand steering, it helps to understand what is actually happening inside a transformer model when it processes text.

At each layer of a transformer, the model maintains a residual stream: a vector of floating-point numbers (often thousands of dimensions wide) for each token in the sequence. This vector accumulates information as it passes through attention heads and feed-forward networks at successive layers. By the final layer, this representation encodes everything the model "knows" about that token in context — its meaning, its emotional valence, its relation to prior tokens, and more.

Research in mechanistic interpretability has demonstrated that these high-dimensional vectors are not random or uninterpretable. Specific directions in this space correspond to identifiable concepts. A direction might encode "this text is formal," another might encode "this statement is a refusal," another might correspond to "the speaker is angry." These directions are often linearly separable - meaning you can find a vector that, when added to or subtracted from a layer's activations, reliably shifts the model's behavior along a conceptual axis.

This is the foundation of steering.
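
As a concrete illustration, here is a minimal sketch of pulling those per-layer residual-stream vectors out of a small open model. It assumes the Hugging Face transformers library and GPT-2 as a stand-in; any decoder-only model that exposes hidden states behaves analogously.

```python
# Minimal sketch: inspecting the residual stream of a small open model.
# GPT-2 is a stand-in; the prompt is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

inputs = tokenizer("The weather outside is", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# out.hidden_states is a tuple: (embeddings, layer_1, ..., layer_N).
# Each entry has shape [batch, seq_len, hidden_dim] -- one residual-stream
# vector per token at that layer.
for i, h in enumerate(out.hidden_states):
    print(f"layer {i}: {tuple(h.shape)}")
```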

What Is a Steering Vector?

A steering vector is a direction in a model's activation space that corresponds to a particular concept, behavior, or personality trait. Once identified, adding a scaled version of this vector to the model's residual stream at one or more layers during inference shifts the model's outputs toward (or away from) that concept - without any change to weights or prompts.
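
In symbols, a sketch of the usual formulation (the notation here is illustrative rather than taken from any one paper): with h_ℓ the residual-stream activation at layer ℓ and v̂ a unit-norm steering vector,

\[ h'_\ell = h_\ell + \alpha \, \hat{v} \]

where the scalar α is the steering strength: positive values push generation toward the concept, negative values push it away.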

How Steering Vectors Are Extracted

The most common method for extracting a steering vector is the contrast pair approach:

  1. Collect contrast pairs. Prepare a set of input prompts that differ only in the presence or absence of the target concept. For example, to find a "sycophancy" direction, you might collect outputs where the model agrees with a false statement (sycophantic) versus outputs where it corrects the record (honest).
  2. Record activations. Run both sets of inputs through the model and record the residual stream activations at a chosen layer (or set of layers) for each.
  3. Compute the difference. Subtract the mean activation of the "without" class from the mean activation of the "with" class. The resulting vector is the steering direction for that concept.
  4. Normalize. Scale the vector to unit norm. At inference time, it can be multiplied by a scalar coefficient (the steering strength) to control the magnitude of the intervention.

This approach is also called activation addition.
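
A minimal sketch of steps 1–4, assuming the Hugging Face transformers library, GPT-2 as a stand-in model, and a toy "warmth-like" contrast set; the prompts and layer index are illustrative, not from any published dataset.

```python
# Sketch of contrast-pair extraction (activation addition style).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

LAYER = 8  # a mid layer of GPT-2 small (12 layers); chosen empirically in practice

with_concept = ["I'm so happy to help you today!", "What a wonderful morning this is."]
without_concept = ["Your request has been logged.", "The report is attached."]

def mean_activation(texts: list[str]) -> torch.Tensor:
    """Mean residual-stream activation at LAYER, averaged over tokens and texts."""
    acts = []
    for text in texts:
        ids = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**ids, output_hidden_states=True).hidden_states[LAYER]
        acts.append(hidden.mean(dim=1).squeeze(0))  # average over the sequence
    return torch.stack(acts).mean(dim=0)

# Difference of means, then normalize to unit length (steps 3 and 4).
steering_vector = mean_activation(with_concept) - mean_activation(without_concept)
steering_vector = steering_vector / steering_vector.norm()
```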

An alternative is probing: training a linear classifier on activations to distinguish between two classes, then using the classifier's weight vector as the steering direction. This is slightly more principled but also more data-hungry.
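
A sketch of that probing variant, assuming scikit-learn and per-example activation matrices collected the same way as in the contrast-pair sketch above:

```python
# Probing alternative: fit a linear classifier on per-example activations
# and reuse its weight vector as the steering direction.
# `acts_with` / `acts_without` are assumed [n_examples, hidden_dim] arrays.
import numpy as np
from sklearn.linear_model import LogisticRegression

def probe_direction(acts_with: np.ndarray, acts_without: np.ndarray) -> np.ndarray:
    X = np.concatenate([acts_with, acts_without])
    y = np.concatenate([np.ones(len(acts_with)), np.zeros(len(acts_without))])
    probe = LogisticRegression(max_iter=1000).fit(X, y)
    direction = probe.coef_[0]  # normal vector of the decision boundary
    return direction / np.linalg.norm(direction)
```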

The Mechanics of Steering at Inference

At inference time, steering works as follows:

  1. The model begins a standard forward pass on the input tokens.
  2. At a designated layer (e.g., layer 15 of a 32-layer model), the residual stream activations are intercepted.
  3. A scaled steering vector is added to (or subtracted from) the activation at that layer for every token (or a selected subset of tokens).
  4. The modified activations continue through the rest of the forward pass as normal.
  5. The model generates its output based on this modified internal state.

This is sometimes called representation intervention because it intervenes directly in the model's representation, rather than in its inputs or weights.

The layer choice matters considerably. Early layers tend to encode surface-level syntactic features; middle layers tend to encode semantic concepts; late layers encode task-specific, generation-oriented information. Steering for behavioral traits often works best in mid-to-late layers.

The steering strength (the scalar multiplier on the vector) controls intensity. Low values produce subtle nudges; high values can dramatically alter tone, content, or even coherence. Too strong a vector can destabilize the model's outputs entirely, causing incoherence or repetitive loops.
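
Putting the pieces together, here is a minimal sketch of the intervention itself. It assumes a GPT-2-style module layout (model.transformer.h) and the PyTorch forward-hook API; the layer index, coefficient, and placeholder vector would all need per-model tuning and the real extracted direction.

```python
# Sketch of inference-time intervention via a forward hook.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

LAYER = 8
ALPHA = 4.0  # steering strength; too high and outputs degrade
steering_vector = torch.randn(model.config.hidden_size)  # placeholder: use the extracted vector
steering_vector = steering_vector / steering_vector.norm()

def add_steering(module, inputs, output):
    # GPT-2 blocks return a tuple; element 0 is the residual stream with shape
    # [batch, seq_len, hidden_dim]. Add the scaled vector to every token.
    hidden = output[0] + ALPHA * steering_vector.to(output[0].dtype)
    return (hidden,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(add_steering)
try:
    ids = tokenizer("The weather outside is", return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=30, do_sample=False)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
finally:
    handle.remove()  # always detach the hook so later calls are unsteered
```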

Concrete Examples

Example 1: Steering for Happiness / Emotional Tone

Setup: Using a contrast set of "happy" vs. "neutral" text completions, researchers extract a "happiness" vector from layer 20 of a GPT-2 style model.

At inference:

  • Prompt: "The weather outside is"
  • Without steering: "...cloudy and cold, with rain expected through the evening."
  • With +happiness steering: "...absolutely beautiful -  warm sunshine, a gentle breeze, and the kind of day that makes everything feel possible."
  • With -happiness steering: "...grim and oppressive, the kind of grey that seeps into your bones and reminds you nothing lasts."

The prompt is identical in all three cases. Only the internal state differs.

Example 2: Suppressing Refusals (Safety Research Context)

This example is documented in the interpretability literature and is studied precisely because of its safety implications.

Setup: Researchers identify a "refusal" direction in models trained with RLHF safety fine-tuning by contrasting activations on prompts that elicit refusals versus prompts that elicit helpful completions.

Finding: Subtracting this direction from the residual stream can suppress the model's tendency to refuse certain requests — even requests that would normally trigger a safety response. This does not mean the model "has no values," but rather that the safety behavior is partly implemented as a localized direction in activation space that can be disrupted.

Implication for safety research: This finding motivates designing safety behaviors that are more distributed across the model's computation, rather than concentrated in a single linear direction that could be easily steered away.

Example 3: Personality and Communication Style

Setup: A product team wants a customer-facing assistant to be consistently warm, empathetic, and informal -  without relying on a long system prompt that could be easily manipulated or overridden.

Method: They extract a "warmth" vector from a set of contrast pairs:

  • Warm responses: "Oh, I completely understand how frustrating that must be! Let's sort this out together."
  • Cold/neutral responses: "Your issue has been logged. A representative will respond within 48 hours."

They then apply a moderate positive scalar of this vector at layers 16–20 of their deployed model.

Result: Every completion the model produces - regardless of topic - carries a slightly warmer, more empathetic register. The effect is consistent, doesn't consume context window, and cannot be bypassed by adversarial prompts the way system prompt instructions can.
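
A sketch of the multi-layer variant this example describes, again assuming a GPT-2-style layout; the layer range, coefficient, and warmth_vector name are illustrative.

```python
# Apply one steering vector across a band of layers (e.g., 16-20).
import torch

def apply_vector_to_layers(model, vector: torch.Tensor, layers: range, alpha: float):
    """Register forward hooks that add alpha * vector at each listed layer.
    Returns the hook handles so steering can be switched off later."""
    vector = vector / vector.norm()

    def hook(module, inputs, output):
        return (output[0] + alpha * vector.to(output[0].dtype),) + output[1:]

    return [model.transformer.h[i].register_forward_hook(hook) for i in layers]

# handles = apply_vector_to_layers(model, warmth_vector, range(16, 21), alpha=3.0)
# ... run generation ...
# for h in handles: h.remove()   # restore the unsteered model
```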

Example 4: Reducing Sycophancy

One of the most-studied applications of steering in alignment research is sycophancy reduction - making models less likely to agree with false or biased claims simply because the user asserted them.

Setup: Contrast pairs are constructed as:

  • Sycophantic: Model agrees with a user who confidently states a false fact ("You're right, Napoleon was over 6 feet tall")
  • Honest: Model politely corrects the user

Steering application: Adding the "honest" direction (or subtracting the "sycophantic" direction) during inference increases the model's tendency to maintain accurate positions even under social pressure.

Result: In evaluations, steered models are significantly less likely to update their stated beliefs when users push back with false confidence - a behavior that can erode the utility and trustworthiness of deployed assistants.

Example 5: Concept Injection Without In-Context Examples

Traditional in-context learning requires including examples of the desired behavior directly in the prompt. Steering can replicate some of this without using prompt tokens.

Setup: A researcher wants a model to respond as if it is in a "formal academic writing" mode.

Method: Rather than adding instructions like "Write in a formal academic tone" to the prompt (which costs tokens and can be ignored), they extract a "formal academic register" vector and apply it at inference.

Result: Even a bare prompt like "Explain photosynthesis" yields a response that reads like a textbook entry, with appropriate hedging, citation-style language, and structured argumentation — purely due to the internal steering, with no prompt modification.

Steering vs. Other Approaches: A Comparison

Dimension                             Fine-Tuning  Prompt Engineering  Activation Steering
Modifies weights?                     Yes          No                  No
Requires training data?               Yes          No                  Small contrast set
Consumes context window?              No           Yes                 No
Bypassable by adversarial prompts?    Somewhat     Easily              Much harder
Interpretable?                        Low          Medium              High
Requires redeployment?                Yes          No                  No
Precision of control?                 Broad        Narrow              Medium
Risk of destabilization?              Low          Very low            Medium (at high strength)

Limitations and Challenges

Steering is powerful, but it is not without significant limitations.

Instability at high magnitudes. Steering vectors applied with too large a coefficient can collapse model outputs into incoherence, repetition, or nonsensical text. The relationship between steering strength and output quality is nonlinear and model-dependent.

Layer sensitivity. The optimal layer for applying a steering vector varies by concept and by model architecture. What works at layer 16 may fail at layer 10 or layer 22. This requires empirical tuning.

Interference between vectors. Applying multiple steering vectors simultaneously can produce unpredictable interactions, since the vectors may not be orthogonal in activation space.
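
One practical mitigation is to measure that overlap before deploying vectors together. A quick sketch; the vectors shown are random placeholders standing in for extracted directions:

```python
# Interference check: pairwise cosine similarity between steering vectors.
# Values near 0 suggest roughly orthogonal directions; values far from 0
# suggest the vectors will interact when applied together.
import torch
import torch.nn.functional as F

vectors = {
    "warmth": torch.randn(768),
    "formality": torch.randn(768),
    "honesty": torch.randn(768),
}
names = list(vectors)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        sim = F.cosine_similarity(vectors[a], vectors[b], dim=0).item()
        print(f"{a} vs {b}: cosine similarity = {sim:+.3f}")
```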

Polysemy of directions. A direction identified for "concept A" may also carry information about "concept B" if the two are correlated in the training data. Steering for one can inadvertently amplify or suppress the other.

Generalization limits. Steering vectors extracted from one distribution of prompts may not generalize perfectly to all prompts. A "happy" direction extracted from descriptive text may behave differently when applied to instructions or code.

Adversarial robustness is not guaranteed. While harder to bypass than prompt-based defenses, steering vectors can in principle be identified and countered by a sufficiently sophisticated adversary with white-box access to the model.

Ethical dual-use. The same techniques that enable safety researchers to identify and reinforce beneficial behaviors can be used to suppress them. A "refusal suppression" vector is a useful tool for auditing safety mechanisms and also a potential tool for circumventing them.

Risks

·       Requires Model Access

You need access to hidden states.
Closed APIs rarely allow this.

·       Trade-Off Curves

Strong steering can:

·        Reduce fluency

·        Increase verbosity

·        Harm reasoning quality

There is always a trade-off to manage: the steering coefficient must be tuned empirically for each behavior and each model.

·       Interpretability Is Imperfect

Activation space is high-dimensional.
Vectors may entangle multiple behaviors.

·       Security Concerns

If steering layers are exposed:

·        Malicious actors could override safety behaviors

·        Adversaries could reverse-engineer control vectors

This requires architectural safeguards.

Strategic Implications

Steering changes the economics of AI deployment.

Instead of:

  • Maintaining multiple fine-tuned models

You can:

  • Maintain one base model
  • Apply runtime cognitive modulation

This reduces:

  • Training cost
  • Versioning complexity
  • Deployment risk

It also enables something bigger:

AI systems that are policy-configurable without retraining.

That’s a major shift in governance design.

Conclusion

Activation steering represents a fundamental shift in how we think about controlling the behavior of large language models. Rather than shaping outputs by modifying what a model is trained on, or by carefully constructing what it is told, steering shapes outputs by directly modifying what a model internally represents during inference.

It is, in a sense, the most direct form of behavioral control yet developed: not prompting a model to be honest, not training it to be honest, but reaching into the computational stream where honesty is encoded and amplifying that signal directly.

This directness makes steering both a powerful tool and a revealing window. The same technique that lets engineers reliably produce warmer, more honest, or less sycophantic outputs also exposes the degree to which LLM behaviors are localized, linear, and - for better or worse - surgically modifiable. Understanding these levers is not only a capability question. It is increasingly a safety question, and one of the most active frontiers in the science of making AI systems that behave as intended.

References and further reading:

·       Turner et al. (2023), "Activation Addition: Steering Language Models Without Optimization"

·       Zou et al. (2023), "Representation Engineering: A Top-Down Approach to AI Transparency"

·       Hernandez et al. (2023), "Linearity of Relation Decoding in Transformer Language Models"

·       Li et al. (2023), "Inference-Time Intervention: Eliciting Truthful Answers from a Language Model"