Thursday, February 5, 2026

The Boundaries of Large Language Models: Where AI Stops Working

Large language models (LLMs) have made remarkable progress, but they still have fundamental limitations due to their architecture, training data, and lack of certain cognitive abilities. Here are the key tasks LLMs cannot perform, along with the reasons why:

1. Tasks Requiring Real-World Interaction

Examples:

  • Physical actions: Operating machinery, conducting lab experiments, or performing surgery.
  • Real-time sensory perception: Seeing, hearing, or touching the physical world (e.g., identifying objects in a room or reacting to live audio).

Why? LLMs are text-in, text-out systems. They lack:

  • Embodiment: No sensors, actuators, or physical presence.
  • Real-time feedback loops: Cannot interact dynamically with environments outside of text or pre-processed data.

2. Tasks Requiring True Understanding or Consciousness

Examples:

  • Genuine comprehension: Understanding text the way humans do—with intent, emotions, or subjective experience.
  • Self-awareness: Recognizing its own existence, limitations, or desires.

Why? LLMs simulate understanding by predicting patterns in text. They:

  • Lack qualia (subjective experience) or theory of mind (understanding others’ mental states).
  • Cannot form beliefs, desires, or intentions; they generate responses based on statistical probabilities.

3. Tasks Requiring Up-to-Date or Private Knowledge

Examples:

  • Real-time information: Answering questions about events after the model’s last training update (e.g., “What happened in the stock market yesterday?”).
  • Accessing private data: Retrieving personal emails, internal company documents, or confidential databases.

Why? LLMs are static at the time of training. They:

  • Cannot browse the live web or access new data unless explicitly provided (e.g., via web search tools).
  • Have no memory of past interactions unless stored externally (e.g., chat history).
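In practice, “explicitly provided” means the surrounding application, not the model, fetches the fresh or private text and places it in the prompt. A minimal sketch under that assumption follows; the function name build_prompt and the sample snippet are hypothetical, and the step that sends the prompt to a model is left out on purpose.

```python
from datetime import date


def build_prompt(question: str, snippets: list[str], today: date) -> str:
    """Assemble a prompt that carries context the model was never trained on."""
    # Number each snippet so the model can be asked to point at its source.
    context = "\n".join(f"[{i}] {text}" for i, text in enumerate(snippets, start=1))
    return (
        f"Today's date: {today.isoformat()}\n"
        f"Context:\n{context}\n\n"
        "Answer using only the context above.\n"
        f"Question: {question}"
    )


# Hypothetical snippet standing in for a web-search or database result.
prompt = build_prompt(
    "What happened in the stock market yesterday?",
    ["Placeholder text returned by a search tool."],
    date(2026, 2, 5),
)
print(prompt)
```

The model stays static; only the prompt changes from one request to the next.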

4. Tasks Requiring Complex Reasoning or Planning

Examples:

  • Multi-step logical puzzles: Solving novel math proofs or planning a multi-year business strategy with unknown variables.
  • Causal reasoning: Explaining why something happens at a deep, mechanistic level (e.g., “Why does this drug work at the molecular level?”).

Why? LLMs excel at pattern recognition, not structured reasoning. They:

  • Struggle with abstraction beyond surface-level correlations.
  • Cannot perform recursive self-improvement or hypothetical planning like humans.

Note: Tools like Wolfram Alpha or symbolic AI are often better for math/logic, while LLMs assist with explanations or generating hypotheses.
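One way to picture that division of labor is a small router that sends plain arithmetic to a deterministic evaluator and leaves everything else for the model. This is only a sketch: the evaluator is a stand-in built on Python's ast module, not Wolfram Alpha, and the names evaluate, answer, and ROUTE_TO_LLM are hypothetical.

```python
import ast
import operator

# Arithmetic operators this sketch is willing to evaluate.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def evaluate(expression: str):
    """Evaluate a plain arithmetic expression exactly, without calling a model."""

    def walk(node):
        # Accept numeric literals only (bool is excluded even though it subclasses int).
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval").body)


def answer(query: str) -> str:
    """Route arithmetic to the evaluator; anything else would go to the LLM."""
    try:
        return str(evaluate(query))
    except (ValueError, SyntaxError):
        return "ROUTE_TO_LLM"  # placeholder for a model call


print(answer("2 + 3 * 4"))
print(answer("Explain why the sky is blue"))
```

The evaluator handles the part that must be exact, and the model keeps the part that benefits from explanation.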

5. Tasks Requiring Ethical or Moral Judgment

Examples:

  • Making value-based decisions: Deciding what is “right” in ambiguous situations (e.g., medical triage, legal sentencing).
  • Aligning with human values: Resolving conflicts between cultural, personal, or societal norms.

Why? LLMs have no inherent values or ethics. They:

  • Reflect biases in training data.
  • Cannot justify decisions based on moral frameworks; they only simulate what “sounds” ethical.

Example: An LLM might suggest a course of action, but it cannot feel empathy or take responsibility for outcomes.

6. Tasks Requiring Creativity Beyond Remixing

Examples:

  • Truly original art or ideas: Creating a groundbreaking scientific theory or a transformative art movement.
  • Inventing novel concepts: Designing a never-before-seen technology or philosophical framework.

Why? LLMs remix existing ideas—they don’t “invent” in the human sense. They:

  • Lack intentionality or purpose behind creation.
  • Rely on statistical novelty, not conceptual leaps.

Note: LLMs can assist creativity (e.g., brainstorming, drafting) but cannot replace human ingenuity.

7. Tasks Requiring Emotional Intelligence

Examples:

  • Genuine empathy: Comforting a grieving person with deep emotional understanding.
  • Negotiating complex social dynamics: Mediating a family conflict or leading a team through cultural change.

Why? LLMs simulate empathy using patterns from data. They:

  • Cannot experience emotions or build real relationships.
  • May generate plausible but hollow responses in sensitive contexts.

8. Tasks with High Stakes or Legal Accountability

Examples:

  • Medical diagnosis: Prescribing treatment without a doctor’s oversight.
  • Legal advice: Drafting binding contracts or representing someone in court.

Why? LLMs:

  • Are not certified or licensed professionals.
  • Cannot be held legally accountable for errors or omissions.

Best practice: Use LLMs as assistants, not replacements, for high-stakes tasks.

9. Tasks Requiring Long-Term Memory or Consistency

Examples:

  • Remembering user preferences: Recalling a user’s dietary restrictions across multiple sessions without external storage.
  • Maintaining narrative consistency: Writing a 1,000-page novel with coherent characters and plotlines over months.

Why? LLMs have no persistent memory. Each response is generated independently unless:

  • External tools (e.g., databases, vector stores) store context.
  • Users repeat reminders of past interactions.
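The external-storage option can be sketched with Python's built-in sqlite3 module. The table layout and the function names open_store, remember, and recall are assumptions made for illustration; a real system might use a different store entirely.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the preference store; pass a file path to persist across runs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prefs ("
        "user_id TEXT, key TEXT, value TEXT, PRIMARY KEY (user_id, key))"
    )
    return conn


def remember(conn: sqlite3.Connection, user_id: str, key: str, value: str) -> None:
    # INSERT OR REPLACE overwrites an earlier value for the same user and key.
    conn.execute("INSERT OR REPLACE INTO prefs VALUES (?, ?, ?)", (user_id, key, value))
    conn.commit()


def recall(conn: sqlite3.Connection, user_id: str) -> str:
    """Render stored preferences as text to prepend to the next session's prompt."""
    rows = conn.execute(
        "SELECT key, value FROM prefs WHERE user_id = ? ORDER BY key", (user_id,)
    ).fetchall()
    return "\n".join(f"{key}: {value}" for key, value in rows)


conn = open_store()
remember(conn, "user-1", "diet", "vegetarian")
remember(conn, "user-1", "allergy", "peanuts")
print(recall(conn, "user-1"))
```

The model itself still remembers nothing; the application reads the stored rows and supplies them again at the start of each session.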

10. Tasks Involving Unstructured or Noisy Data

Examples:

  • Analyzing raw sensor data: Interpreting live EEG brainwave signals or satellite imagery.
  • Processing ambiguous input: Understanding heavily accented speech or poorly scanned handwritten notes.

Why? LLMs are trained primarily on text. They:

  • Struggle with multi-modal data (e.g., combining text, audio, and video).
  • Require pre-processing for non-text inputs (e.g., OCR for images).

Solution: Hybrid systems (e.g., LLM + computer vision models) are often needed.

Summary Table: LLM Limitations

| Task Type | Example | Why LLMs Fail | Workaround |
|---|---|---|---|
| Real-world interaction | Operating a robot | No sensors/actuators | Pair with robotics hardware |
| True understanding | Explaining consciousness | No subjective experience | Use as a research assistant |
| Up-to-date knowledge | Today’s news | Static training data | Integrate web search tools |
| Complex reasoning | Proving a math theorem | No symbolic logic | Combine with Wolfram Alpha or theorem provers |
| Ethical judgment | Deciding medical triage | No values or accountability | Use as a decision-support tool |
| Original creativity | Inventing a new physics theory | Remixes existing ideas | Assist human creators |
| Emotional intelligence | Counseling a trauma survivor | No genuine empathy | Augment with human oversight |
| High-stakes accountability | Diagnosing disease | No certification/liability | Use only under expert supervision |
| Long-term memory | Remembering user preferences | No persistent storage | Use external databases |
| Unstructured data | Analyzing live video feeds | Text-only input | Pair with specialized models (e.g., CV) |

Key Takeaway

LLMs are powerful tools for text-based tasks - generating, summarizing, translating, and assisting - but they are not autonomous agents. For tasks requiring real-world action, deep reasoning, ethics, or creativity, LLMs should be part of a larger system (e.g., combined with humans, symbolic AI, or specialized tools).

Thursday, January 29, 2026

AI is a leadership challenge, not an innovation challenge

 

AI doesn’t fail primarily due to lack of ideas or technology. It fails because leaders don’t make the hard decisions AI forces into the open.

Innovation problems are about “Can we build it?”
AI problems are about “Should we, where, and under what constraints?”

That’s a leadership problem.

1. AI collapses the gap between decision and consequence

Traditional innovation lets leaders delegate:

  • Engineers build
  • Product teams experiment
  • Leaders review outcomes later

AI doesn’t allow that comfort.

  • AI executes decisions at scale
  • Errors propagate instantly
  • “Small” choices become policy

Leadership challenge

  • You must decide in advance what decisions are allowed to scale.
  • You own failures you didn’t personally approve line-by-line.

 

2. AI exposes organizational contradictions

AI systems force answers to questions leaders often avoid:

  • Do we value speed or safety?
  • Growth or trust?
  • Consistency or discretion?
  • Efficiency or employment?

Humans can navigate contradictions informally.
AI cannot.

Result

  • Leadership indecision becomes model ambiguity.
  • Political compromises turn into technical debt.

 

3. Innovation tolerates ambiguity. AI amplifies it.

Innovation thrives on exploration.
AI systems:

  • Act even when uncertain
  • Sound confident when wrong
  • Hide edge cases until damage occurs

Leadership failure mode

  • Treating AI like a prototype instead of an operational actor.
  • Confusing model accuracy with decision readiness.

 

4. AI shifts accountability upward, not downward

In classic innovation:

  • Failure belongs to the team.
  • Leaders sponsor and shield.

In AI:

  • Failures trigger legal, ethical, and reputational consequences.
  • “The model did it” is not a defense.

Hard truth

You cannot delegate moral agency to software.

That accountability sits with leadership whether acknowledged or not.

 

5. The real bottleneck is not data or models; it’s permission

Most AI programs stall because leaders won’t decide:

  • Which workflows can be automated
  • Which roles change
  • Which risks are acceptable
  • When humans must override the system

Teams can build models faster than leaders can grant authority.

 

6. AI forces explicit value tradeoffs

 

Innovation asks: What’s possible?
AI asks: What is acceptable?

Examples:

  • Fairness vs profitability
  • Transparency vs performance
  • Personalization vs privacy

These are normative decisions, not technical ones.

Only leaders can make them and be accountable.

 

7. AI success looks boring, not innovative

Well-led AI:

  • Quietly prevents bad decisions
  • Stops scaling the wrong things
  • Reduces variance, not creativity

Poorly led AI:

  • Demos well
  • Fails publicly
  • Surprises leadership

Innovation celebrates novelty.
Leadership values reliability.

AI rewards the second.

 

The core insight

AI is a mirror. It reflects leadership clarity, or the lack of it, at machine speed.

If values, ownership, escalation paths, and risk tolerance are unclear, AI will surface that confusion faster than any other technology.

That’s why organizations with strong leadership but mediocre tech often outperform those with brilliant models and weak governance.

 

A simple litmus test for leaders

If a leader cannot clearly answer:

  1. What decisions this AI is allowed to make
  2. What data it is allowed to use
  3. What failure looks like
  4. Who shuts it off
  5. Who apologizes publicly

They are not leading AI.
They are experimenting with it.

Wednesday, January 28, 2026

Is the era of the 'Best-of-Breed' giant over? Why the hyperscale oligarchy is coming for Salesforce, SAP, and the rest of the independent enterprise vendors

 

The answer is messier than either/or. We’re heading toward a stratified market where both thrive, but in fundamentally different ways.

Short answer: hyperscalers will dominate infrastructure and primitives, but they will not own the entire enterprise stack. There is still room for independent players, but only if they operate in very specific lanes and avoid pretending the old model still works.

Here’s an unsentimental breakdown:

1. What hyperscalers actually win (and why)

Hyperscalers (AWS, Azure, GCP) have structural advantages that compound:

A. Capital + utilization flywheel

  • Massive capex amortized across millions of customers
  • Better utilization → lower marginal cost → more investment
  • No independent vendor can match this curve

Result:
Compute, storage, networking, and AI infra become utilities.

B. Control of primitives

Hyperscalers own:

  • Cloud Infrastructure
    • Compute
    • Storage
    • Observability & Monitoring
  • Identity, Access, and Security Primitives
  • Data Warehousing & Core Analytics
  • Developer Platform & Runtime
  • AI runtimes

These are horizontal, non-negotiable layers.

Once primitives are owned, everything above is pressured.

C. Developer platforms

Developers follow gravity, and hyperscalers are the shiny new thing on the horizon:

  • IDEs and allied tools
  • Container orchestration
  • API gateways
  • Low-code development, which will entice non-programmers into programming
    • Workflow orchestration
    • Ad hoc platform integration

D. Distribution power

  • One-click procurement
  • Integrated security and compliance
  • Enterprise trust at the CIO level

Result:
Anything that looks like “undifferentiated plumbing” gets absorbed.

2. Where hyperscalers fail (systemically)

Hyperscalers struggle with deep, opinionated, domain-specific software, that is, with producing polished products.

Not accidentally but structurally.

Why:

  • They optimize for breadth, not depth
  • Products must serve incompatible customer needs
  • Internal incentives reward infra leverage, not domain mastery
  • Regulatory risk pushes them toward neutrality

This creates a ceiling on:

  • ERP nuance
  • Industry-specific workflows
  • Mission-critical business logic
  • High-stakes compliance interpretation

Hyperscalers ship platforms. Enterprises run businesses.

Example: AWS has likely launched over 10 database services, but enterprises still pay Snowflake billions because Snowflake understood data warehouse users’ workflows in ways that AWS didn’t bother to. The hyperscalers ship features; independent vendors ship solutions.

3. The survivable lanes for independent giants (Oracle, SAP, Salesforce, etc.)

Independent enterprise giants survive only where all three conditions hold:

1. Domain lock-in is real, not contractual

  • Understanding of local tax laws, kept continuously up to date
  • Ownership of the workflows that run actual businesses: payroll, financial close, procurement, HR processes, sales cycles, etc.
  • Industry regulations (healthcare, utilities, banking, insurance, defense, etc.)

If the cost of being wrong is existential, not inconvenient, hyperscalers back off.

2. The product encodes institutional knowledge

Software that embodies:

  • Decades of edge cases
  • Legal interpretations
  • Audit logic
  • Process memory
  • Internal politics embodied as organizational structure

This is representation learning, not CRUD.

Enterprise software is deeply embedded in work culture and politics.

3. Switching costs are cognitive, not technical or financial

APIs are easy to rewrite.
Mental models are not.

If users think in your system, you’re defensible.

4. Enterprise Software vendors: obsolete or underestimated?

Enterprise Software vendors are not dead, but their territory is narrowing.

Where Enterprise Software vendors still win

  • Regulated enterprise workloads
  • High-scale transactional systems
  • Enterprises that value predictability over innovation

Enterprise Software vendors’ strength is not agility; it is their ability to hold invariants.

Where Enterprise Software Vendors lose

  • Developer mindshare
  • AI-native workflows
  • Anything that smells like commodity infra

5. The new equilibrium (2025–2035)

The enterprise stack is splitting into three layers:

Layer 1: Utilities (hyperscalers)

  • Compute
  • Storage
  • Networking
  • AI runtimes
  • Security primitives

Winner-take-most.

Layer 2: Platforms (contested)

  • Data platforms
  • Integration
  • Analytics
  • Workflow engines

Hyperscalers pressure here but don’t fully own it.

Layer 3: Systems of Record & Judgment (independent giants)

  • ERP
  • Financials
  • HR
  • Industry-specific cores

This layer cannot move fast without breaking reality.

That’s Enterprise Software vendors’ natural habitat.

6. The real threat is not hyperscalers - it’s collapse via false grokking

Independent giants don’t die because hyperscalers kill them.

They die because they:

  • Mistake contracts for moats
  • Optimize sales over learning
  • Ship abstractions divorced from real workflows
  • Stop encoding new reality

Hyperscalers apply pressure.
False grokking pulls the trigger.

7. The absorption heuristic (use this yourself)

Ask four questions:

  1. Is correctness universal or contextual?
    Universal → hyperscaler
    Contextual → independent
  2. Does value increase with scale or judgment?
    Scale → hyperscaler
    Judgment → independent
  3. Is the buyer optimizing cost or risk?
    Cost → hyperscaler
    Risk → independent
  4. Can failure be rolled back safely?
    Yes → hyperscaler
    No → independent

If you answer “hyperscaler” to 3+ of these, absorption is inevitable.
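The four questions reduce to a vote count, which can be sketched as below. The class and function names are hypothetical, the two sample products and their answers are illustrative assumptions, and the label for the case with fewer than three votes is also an assumption, since the heuristic only states the three-or-more outcome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Answers:
    """One answer per question; True is the hyperscaler-leaning answer."""
    correctness_is_universal: bool   # Q1: universal rather than contextual
    value_grows_with_scale: bool     # Q2: scale rather than judgment
    buyer_optimizes_cost: bool       # Q3: cost rather than risk
    failure_rolls_back_safely: bool  # Q4: yes rather than no


def hyperscaler_votes(a: Answers) -> int:
    """Count how many of the four answers point to the hyperscaler."""
    return sum([
        a.correctness_is_universal,
        a.value_grows_with_scale,
        a.buyer_optimizes_cost,
        a.failure_rolls_back_safely,
    ])


def verdict(a: Answers) -> str:
    # The threshold of three comes from the text; the other label is an assumption.
    return "absorption" if hyperscaler_votes(a) >= 3 else "independent lane"


# Illustrative answers only, not measurements of any real product.
object_storage = Answers(True, True, True, True)
payroll_engine = Answers(False, False, False, False)
print(verdict(object_storage), verdict(payroll_engine))
```

Writing the answers down as explicit booleans also makes disagreements visible: two people scoring the same product differently are usually disagreeing about one specific question.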

8. Final verdict

The future of enterprise software is not owned by hyperscalers, but it is bound by them.
The independent giants that survive will be those with genuine moats the hyperscalers can’t easily replicate: deep vertical expertise (Veeva in pharma), workflow lock-in (ServiceNow for ITSM), or network effects (Salesforce’s AppExchange ecosystem). They’ll increasingly run on hyperscaler infrastructure while providing the opinionated layer on top.

What’s genuinely threatened is the middle - companies selling undifferentiated infrastructure or horizontal tools without strong moats. Why buy a standalone monitoring tool when each hyperscaler offers something 80% as good that’s deeply integrated?

The future probably looks like: hyperscalers own the infrastructure and broad horizontal services, independent giants own the high-value vertical workflows with real lock-in, and a healthy ecosystem of specialized vendors serves niches too small for hyperscalers to care about.