[Two Cents #78] “Flights of Thought” on Consumer + AI — Part 4: GPT-5
Introduction
It’s becoming clear that the market’s “readiness” for Consumer AI has crossed a tipping point.
What we need now is to get far more concrete about how AI-driven market change will unfold—the direction, the mechanisms, and the implications for industry structure, competitive dynamics, and the economics between participants.
For founders, the job is to identify those opportunities a little earlier and move first. For investors, the job is to recognize those early moves quickly and support them aggressively.
This series—my “Flights of Thought”—is an attempt to share how I’m thinking through what will happen, what it will unlock, and what kinds of ideas are likely to matter.
First things first: a quick breather to take stock after GPT-5.
GPT-5 vs. Consumer + AI
Plenty of people are already going deep on GPT-5’s benchmark gains, its step-up in multi-shot reasoning, and its genuinely strong one-shot coding. I won’t repeat that here.
Instead, I want to focus on the parts that matter most for Consumer + AI—the changes that could reshape product surfaces, distribution, and ultimately consumer behavior.
Two things stood out to me:
Mixture of Models (MoM) + a Model Router
Tools as “generalized agents”
1) Mixture of Models (MoM) + Model Router
MoE vs. MoM
With GPT-4, scale mattered, but I’d call that mostly incremental. The bigger conceptual shift was that Mixture of Experts (MoE) became broadly understood and adopted as a pattern.
In MoE, the “experts” are structurally similar subsystems with roughly comparable capacity, each specialized by domain. The router’s primary purpose is largely engineering optimization—only activating the necessary experts to reduce inference cost and latency. In that sense, the heart of MoE is “efficient compute.”
GPT-5’s Mixture of Models (MoM) feels meaningfully different. Here, the system is a collection of sub-models that can be quite different from each other—different sizes, different capabilities, potentially different modes of reasoning. The router’s core job is no longer just efficiency. It becomes an orchestrator: deciding how deep the system needs to think, handing work to the right model(s), evaluating outputs, and then delegating follow-on work—sometimes to the same model, sometimes to a different one.
Judging by how GPT-5 appears to process requests internally, it is effectively multi-shot by design, which naturally supports orchestration across "thinking steps."
Router as an orchestrator
This is structurally similar to a multi-agent system: break a task apart, delegate pieces to different agents, collect results, and iterate until the desired outcome is achieved.
In other words, the MoM router looks like a generalized version of what today’s agentic orchestration frameworks try to do—except applied across models with different “strength profiles.”
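The contrast can be sketched in a few lines of Python. Everything below is invented for illustration (the model names, the difficulty heuristic, the hand-off policy); the point is only to show a router that orchestrates across dissimilar sub-models rather than just picking experts for efficiency:

```python
# Hypothetical sketch: a MoM-style router. Unlike an MoE gate (which only
# activates experts to save compute), this router decides how deep to think
# and can delegate follow-on work to a *different* model.

def estimate_depth(task: str) -> int:
    """Toy proxy for 'how deep should the system think?'"""
    hard_markers = ("prove", "plan", "debug", "compare")
    return 2 if any(m in task.lower() for m in hard_markers) else 1

# Dissimilar sub-models with different strength profiles (all invented).
MODELS = {
    "fast-mini":  lambda task: f"[fast-mini] quick answer to: {task}",
    "deep-think": lambda task: f"[deep-think] multi-step reasoning on: {task}",
}

def route(task: str) -> str:
    depth = estimate_depth(task)
    # Orchestration: hand off, evaluate, optionally delegate follow-on work.
    answer = MODELS["fast-mini"](task)
    if depth > 1:
        # The MoM move: escalate to a structurally different model.
        answer = MODELS["deep-think"](task)
    return answer

print(route("summarize this email"))
print(route("plan a three-city trip under a budget"))
```

The escalation step is the part MoE routers don't do: the gate in MoE never re-reads an output and hands the task to a stronger sibling.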
If you generalize further: GPT-5 is implicitly proposing a new default interface for how we will serve consumers (and enterprises) as the number of available models explodes. The world won’t be “one model.” It will be many models—small vs. frontier, reasoning vs. non-reasoning, on-device vs. cloud, multimodal specialists, world models, etc.—composed into a system. The router becomes the single point of entry and the unified UI/UX abstraction for that system—and, importantly, it shapes the economics.
A small but telling observation from using GPT-5 in ChatGPT: it often "over-reasons." For tasks where a lightweight summary would be enough, it sometimes goes full consultant—writing Python code and producing an overbuilt solution. It feels like asking a new hire to pull a quick datapoint and getting a 30-page report back the next morning.
That suggests a few implications:
We may need something like a “reasoning temperature”—a consumer-facing control that tunes the depth of reasoning the router chooses (analogous to creativity temperature).
Since developers can access individual GPT-5 models via API, it’s likely we’ll soon see alternative UX layers that effectively replace or compete with the default router behavior—especially once the ecosystem starts optimizing for different latency/cost/quality tradeoffs. (This has a bit of “early open model ecosystem” déjà vu.)
The bigger opportunity is personalized routing: the router reading user intent and context, then choosing models and depth dynamically based on what this specific user actually wants. If router competition opens up—especially via OSS—this kind of personalization will arrive quickly.
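A "reasoning temperature" and a user profile could combine into a tiny personalized-routing policy. The thresholds, the profile fields, and the blending weights below are all made up; this is just one way such a control might work:

```python
# Hypothetical sketch: personalized routing driven by a user-facing
# "reasoning temperature" (0.0 = terse answers, 1.0 = full consultant mode).

from dataclasses import dataclass

@dataclass
class UserProfile:
    reasoning_temperature: float  # user-set preference, 0.0..1.0 (invented)
    prefers_code: bool = False    # learned from past sessions (invented)

def choose_depth(profile: UserProfile, task_difficulty: float) -> str:
    """Blend the user's preference with the task's estimated difficulty."""
    score = 0.5 * profile.reasoning_temperature + 0.5 * task_difficulty
    if score < 0.35:
        return "one-shot summary"
    if score < 0.7:
        return "short chain of thought"
    return "full multi-step analysis" + (" with code" if profile.prefers_code else "")

casual = UserProfile(reasoning_temperature=0.1)
analyst = UserProfile(reasoning_temperature=0.9, prefers_code=True)

print(choose_depth(casual, task_difficulty=0.3))   # stays lightweight
print(choose_depth(analyst, task_difficulty=0.3))  # same task, deeper treatment
```

The same task routes differently per user, which is the essence of the personalization argument: the router, not the model, carries the user's preferences.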
The “AI super app” lens
From a UI standpoint, this is also a step closer to the true meaning of an AI super app.
Open ChatGPT today and you can already feel the direction: the model list collapses, “GPT-5” becomes the default, and soon the user likely won’t care what model is behind the curtain. The interface becomes the product. The model becomes infrastructure.
From there, it’s not hard to extend the thought experiment.
Scenario 1: ChatGPT becomes the primary consumer interface
If ChatGPT can leapfrog existing default surfaces (today: iPhone + Siri), it starts to look like the “AI super app”—or even the “AI super device,” the iPhone of the AI era.
One plausible path:
Even today, the iPhone Action Button can bypass Siri and open a default assistant workflow.
A future AI device (the Jony Ive direction) could keep consumers always connected to ChatGPT with always-available voice input.
Output can remain flexible: for the next few years, the iPhone screen will still be the dominant display, but output could also route to home mirrors/TVs, car systems, ambient speakers, etc.
Scenario 2: personalized LLMs + a personal router
A more speculative—but not crazy—scenario is the emergence of personalized LLMs fine-tuned on a person’s data: calendar, contacts, email, health data, family graph, and more.
And it likely won’t be a single model. You could imagine multiple personalized models by “domain of self”: personal, family, work—each evolving as your job and life changes.
Technically, this doesn’t feel like the main barrier, and cost will likely fall into an affordable range over time. The bottleneck is data: collecting it, de-siloing it, and establishing ownership/control—especially for fragmented first-party ecosystems (health data, playlists, purchase history, etc.).
If personalized models exist—and if “my router” can access and orchestrate them—then a truly personalized assistant becomes possible: not just context-injected personalization per prompt, but a persistent personalized intelligence environment.
In that sense, the model router could be the first step toward that future.
(I’ll go deeper on hyper-personalization in the next post.)
Further generalization: routing as cost optimization
Long-term, the router is not just a UI abstraction. It could become the core mechanism for optimizing cost per unit of intelligence.
Extrapolate the system:
on-device small LLMs (<10B)
mid-sized models (100B–300B)
frontier-scale models (300B–1T+)
small/large reasoning variants
multimodal specialists
world models
A router could choose the cheapest adequate path based on task type and difficulty—possibly with the router itself running on-device as a small model.
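The "cheapest adequate path" reduces to a small optimization. The tier names, capability scores, and cost figures below are invented; the sketch only shows the selection rule:

```python
# Hypothetical sketch: pick the cheapest model tier whose capability
# clears the task's estimated difficulty. All numbers are invented.

MODEL_TIERS = [
    # (name, capability score, relative cost per call)
    ("on-device-3b", 0.3, 0.0),   # effectively free once on the device
    ("mid-200b",     0.6, 1.0),
    ("frontier-1t",  0.9, 20.0),
]

def cheapest_adequate(difficulty: float) -> str:
    """Return the lowest-cost tier whose capability >= difficulty."""
    for name, capability, _cost in sorted(MODEL_TIERS, key=lambda t: t[2]):
        if capability >= difficulty:
            return name
    return MODEL_TIERS[-1][0]  # nothing clears the bar: fall back to frontier

print(cheapest_adequate(0.2))   # on-device-3b
print(cheapest_adequate(0.5))   # mid-200b
print(cheapest_adequate(0.95))  # frontier-1t (fallback)
```

Note that the selection function itself is trivial to run on-device, which is why the router could plausibly live there even when the models it routes to do not.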
If that pattern becomes standard (and it likely will, because it will be easy to replicate and commoditize), then we stop thinking “which model should I use?” The user just uses an AI system, the way we stopped worrying about browser compatibility and OS-level constraints as the stack matured. The model becomes like backend architecture: mostly invisible unless you’re building the infrastructure.
And once AI becomes truly ubiquitous at the consumer level, even the terms “LLM” and “AI” will fade from everyday language—like how we no longer say “I’m using electricity” or “I’m doing the internet.”
2) Tools as “generalized agents”
GPT-5’s tool calling has a few notable characteristics:
Free-form function calling (CFG-based)
Parallel tool calling
Free-form function calling isn’t conceptually brand new. MCP servers already resemble this pattern: the input is free text, and the tool interprets and executes actions.
The difference is that GPT-5 generalizes it. Instead of a single clean “prompt → API call → response,” the system can:
chain tools,
nest calls (including prompting other tools/models),
analyze results,
and decide what to call next.
At that point, “tool calling” starts to look less like API invocation and more like calling into agent subsystems. It’s an abstraction for interacting with semi-autonomous components, not just deterministic endpoints.
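The chain/nest/analyze/decide pattern can be written as a minimal agent loop. The tools and the decision policy below are stand-ins, not how GPT-5 actually works internally:

```python
# Hypothetical sketch: tool calling as an agent loop that chains tools,
# nests one tool's output into another, and decides what to call next.

def search(query: str) -> str:      # stand-in tool
    return f"results for '{query}'"

def summarize(text: str) -> str:    # stand-in tool (could itself be a model)
    return f"summary of ({text})"

def decide_next(history: list) -> str:
    """Toy policy: search first, then summarize the result, then stop."""
    if not history:
        return "search"
    if len(history) == 1:
        return "summarize"
    return ""  # done

def run(task: str) -> list:
    history = []
    while True:
        step = decide_next(history)
        if not step:
            break
        if step == "search":
            history.append(search(task))
        elif step == "summarize":
            # Nesting: the previous tool's output becomes this tool's input.
            history.append(summarize(history[-1]))
    return history

print(run("best e-bike under $1500"))
```

The loop is the abstraction shift the section describes: the caller is not invoking one deterministic endpoint, it is driving a small semi-autonomous workflow.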
This is why some people describe GPT-5 as the “stone age” moment for agents: not because it merely uses tools, but because it begins to think with them—building workflows as part of reasoning.
Parallel tool calling reinforces the same direction. If tools are invoked concurrently and results are processed asynchronously, then the system increasingly resembles a world of many interacting agent-like subsystems rather than a single linear API pipeline.
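The parallel case is a natural fit for async execution. A minimal sketch with invented tool names, using Python's standard `asyncio`:

```python
# Hypothetical sketch: several tool calls fired concurrently, with results
# gathered asynchronously rather than in a single linear pipeline.
import asyncio

async def call_tool(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # simulate network / tool latency
    return f"{name}: done"

async def fan_out() -> list:
    # asyncio.gather runs the calls concurrently and preserves their order.
    results = await asyncio.gather(
        call_tool("price-check", 0.02),
        call_tool("inventory", 0.01),
        call_tool("reviews", 0.03),
    )
    return list(results)

print(asyncio.run(fan_out()))
```

Total latency approaches the slowest call rather than the sum, which is what makes many interacting agent-like subsystems practical in the first place.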
If you push the idea to its logical extreme:
ChatGPT becomes the primary entry point, and behind it sits a generalized “agent-verse”—a multi-agent system that executes work in the background. That architecture also better supports the UI/UX shifts discussed in Part 2 (where interaction is not strictly request-response, but often headless, proactive, and agent-initiated).
A plausible “normal” flow could look like:
ChatGPT captures user needs from multiple channels (direct requests, signals, other agents).
Execution happens through ambient agents interacting asynchronously with platform agents (commerce) and brand agents.
The system returns options and asks for confirmation only where needed.
Consumer behavior: what changes?
My view is that the consumer behavior shift in Consumer + AI will be driven less by "GPT-5 as a new model" and more by the shift toward multi-agent systems.
Once interaction moves away from pure request-response and toward headless + intent-driven workflows, consumer behavior will change quickly—if the system delivers real value: convenience, better selection, better price, less time spent.
And historically, when value is obvious, consumers adapt fast—even when the behavioral change is non-trivial (web → mobile is the clearest precedent).
So the significance of GPT-5, in this frame, is not that it immediately creates new consumer behavior. It’s that it adds key ingredients—routing and generalized tool/agent orchestration—that help ChatGPT become the dominant UI surface. Combine that with the coming multi-agent shift, and you start to see how consumer behavior change could accelerate.
AGI?
Does GPT-5 qualify as AGI?
OpenAI may have strategic reasons to want to declare “AGI reached,” but if we define AGI as “average human-level general intelligence,” I think we still need additional capabilities.
For example:
Temporally synchronized multimodality: not just seeing/hearing independently, but integrating modalities in a time-aligned way—potentially with a 3D world model.
Continuous incremental learning (and ideally self-learning).
By that bar, the path to AGI still looks long and steep 🙂
Call for Startups
The purpose of sharing this thinking is straightforward. As an early-stage investor focused on Consumer + AI, I hope this series helps existing startups better leverage AI-driven shifts—and helps new founders reduce trial-and-error as they search for meaningful opportunities.
In that sense, this is Two Cents’ version of a Call for Startups.
If you are an early-stage founder or startup in Consumer + AI and believe you are onto something, my inbox is always open. Feel free to reach out via DM or email:
hur at hanriverpartners dot com


