[Two Cents #76] “Flights of Thought” on Consumer + AI — Part 2: UI, UX
Introduction
It’s becoming clear that the market’s “readiness” for Consumer AI has crossed a tipping point.
What we need now is to get far more concrete about how AI-driven market change will unfold—the direction, the mechanisms, and the implications for industry structure, competitive dynamics, and the economics between participants.
For founders, the job is to identify those opportunities a little earlier and move first. For investors, the job is to recognize those early moves quickly and support them aggressively.
This series—my “Flights of Thought”—is an attempt to share how I’m thinking through what will happen, what it will unlock, and what kinds of ideas are likely to matter.
Let’s start with the most fundamental question: where consumers will “play”, and what the primary UI surfaces and UX patterns will look like in that world.
The Primary UI Surface for Consumers
A few questions worth asking—thinking far forward:
Where will consumers primarily interact with AI? On the smartphone? Back to desktop? Or in ambient environments—cars, offices, living rooms, bedrooms, subways, sidewalks—spaces saturated with microphones, speakers, cameras, and sensors? Or through ambient devices—pins, AR glasses, wristbands?
And what will the dominant modality be in those environments? The current pattern—staring at a phone screen, uploading photos, typing? Voice conversations on the phone? If so, is the “counterparty” Siri? Or an ambient device’s Alexa-like interface? Or direct voice conversations with an AI app while looking at a screen? In an ambient setting, do we talk to a local device that routes requests, or does the cloud agent triage everything by default?
Before we reach that future, what becomes the dominant interface within today’s constraints? Does Siri (as a voice agent) become the primary router, with everything else running behind it? Or do ChatGPT/Claude become the new super-apps—and effectively the new app stores? Or do users continue to discover services by hunting for individual AI apps and URLs? Or does the browser reassert itself?
With those scenarios in mind, let’s return to the present and look at what’s already happening.
Battle for Attention
Several competing moves are unfolding in parallel.
1) LLM chatbots racing to become the “AI Super App” and the AI app store
ChatGPT and Claude are trying to own the consumer’s main UI surface.
They’re making the chatbot the default interface, then expanding what’s reachable inside it: prompt-based apps (GPTs, Artifacts) and workflow agents (Computer Use, Operator). Functionally, both are “AI apps.” The platform that aggregates them naturally becomes the app store layer as well.
If an LLM chatbot becomes the default super-app + app store, it captures the two most strategic assets at the UI layer—similar to Chrome on desktop or WeChat in China. Control the default surface, and you control consumer access to the broader Consumer + AI universe.
Even leaked strategy memos point in the same direction: the goal is to become the “super assistant” that functions as the primary interface to the internet.
At the same time, for the near term, the smartphone still acts as each person’s “server/center”—where activity accumulates and identity, permissions, and data live. And Apple/Google are unlikely to allow full OS-level third-party access to that center. Which means LLM chatbots will not have default access to the system agent (e.g., Siri) in any straightforward way.
Aside: this is why I suspect the OpenAI–Jony Ive device effort is about bypassing that constraint—potentially creating a hardware-level shortcut to become the default input layer (e.g., a microphone tethered to ChatGPT), or a path to “intercept” agent routing without OS-level permission.
2) The competing race: AI browsers
A second group is trying to win the UI surface through the browser itself.
We’re already seeing entrants like Perplexity’s Comet, The Browser Company’s Dia, and AI-browser-like products from Genspark and Fellou. OpenAI is also rumored to be working on an AI browser.
Aside: many “AI browsers” today feel closer to a repackaging of bookmarks and extensions than a truly independent browser stack. They may look like browsers in the current phase, but the harder question is whether these pseudo-browsers can actually hold the consumer surface once the real competition shifts into ecosystem capture—i.e., the AI app store battle. My read is that some players (e.g., Genspark) still look more like “agent platforms” than full super-app contenders, and it’s not obvious a startup can pursue both strategies at full intensity without stretching resources.
AI browsers can also build internal ecosystems—skills galleries, agent directories, app catalogs. Dia has already started; others are moving in a similar direction.
At the core, this is the same war as the chatbot war: who owns the consumer’s access path to the explosion of “AI apps” arriving as agents and prompt workflows.
This is still largely desktop-led today. Mobile versions will come. But browsers on mobile are constrained by OS-level limitations, which raises the question of whether an AI browser can realistically become the mobile super-app.
WeChat is the historical exception, but it required extraordinary, context-specific concessions from Apple early on due to China market dynamics. Given how uncomfortable that later made Apple—and how the power balance shifted—it’s hard to imagine that kind of exception happening again.
3) The mobile gatekeepers: Apple and Google
Apple hasn’t moved aggressively yet, but Google has. If you watch Google’s AI Mode demos, you can see deeper integration into the smartphone experience: beyond Lens, into live camera input, and into “what’s happening on the screen” across both desktop and mobile.
This matters because it makes two things possible for the gatekeeper:
Taking action on everything happening on the screen (search is the first obvious action; Apple will likely pursue a different mechanism).
Capturing that entire activity stream as data.
And that leads directly to the long game: attention control plus a proprietary personalization data layer—potentially fused with first-party data like Apple Health. In the long run, personalization data is one of the most durable moats in Consumer AI, and the OS gatekeeper is structurally best positioned to capture it.
The big variable is the one we started with: does the smartphone remain the primary consumer interface “until it isn’t”? If the surface shifts to new ambient devices (OpenAI–Ive, Meta AR glasses, or new startups), the outcome of this long game could change.
Multi-agent environments
Multi-agent systems are spreading fast in prototypes and experimentation—but consumers still have limited direct touchpoints.
Most multi-agent workflows will reach consumers through one of three routes: standalone AI apps; aggregator super-apps and their chatbot app stores; or AI browsers (desktop and mobile).
In that world, multi-agent apps are typically not the primary surface—they are the content that the surfaces fight over. That said, there is always a possibility that a breakout emerges and becomes the surface itself—WeChat-style—especially if it avoids being packaged as a traditional mobile app that requires App Store distribution.
Ambient Agents and “ambient intelligence”
Today’s agent ecosystem is still mostly workflow agents: agents that execute a defined task, sometimes using computer-use patterns or connectors to external apps and data sources.
Cloud-resident ambient agents (always-on, continuously context-aware) likely need more time. But the direction is clear: as agent-to-agent (A2A) systems evolve into multi-agent and autonomous systems, ambient agents become a plausible dominant delivery model.
If I force myself to imagine a five-year horizon: consumer-facing workflow agents (interacting through phone or ambient devices) and cloud-resident ambient agents may end up at roughly similar scale—within a factor of ~2.
If the shift to ambient agents becomes that large, then the “consumer interface” must change even more dramatically than the desktop-to-mobile transition did.
UI/UX in an Agent World
Assuming the world moves quickly toward A2A interaction, what does UI/UX look like?
A working taxonomy of agents
Agents likely segment into a handful of types:
Human Agents
Auto Agents
Workflow Agents
Ambient Agents
Virtual Human Agents
This isn’t a final taxonomy—new terms will emerge and settle (as “ambient agent” already has). Workflow and ambient agents generally assume a multi-agent environment.
You could also segment by interaction role—initiating agents, serving agents (MCP servers), navigators, autonomous agents—but it’s not yet clear that added granularity is practically useful.
UI/UX for “AI apps” and agents
I’ll use “AI app” broadly: anything that executes a function the user wants—mobile/web apps, agents, prompt workflows, etc.
What forms might AI apps take, and how do users reach them?
AI app as a visible surface
Standalone app or URL: a mobile/web app discovered through app stores or the browser.
Inside an AI super-app: accessed via something like /agent-name or @agent-name inside the chatbot.
Through an AI app store directory: GPTs, artifact directories, skills galleries, agent directories embedded in chatbots or AI browsers.
These routes likely cover most auto/workflow agents.
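To make the “@agent-name” route concrete, here is a minimal sketch of how a super-app might resolve an agent mention in a chat message into a call to a registered AI app. Everything here (AgentRegistry, handleMessage, the sample agent) is a hypothetical illustration, not any platform’s actual API.

```typescript
// Minimal sketch: routing "@agent-name" mentions inside a super-app chat.
// All names here are hypothetical, not any real platform's API.

interface AgentApp {
  name: string;                         // the @handle users type
  description: string;                  // shown in the app-store directory
  run: (prompt: string) => Promise<string>;
}

class AgentRegistry {
  private agents = new Map<string, AgentApp>();

  register(app: AgentApp): void {
    this.agents.set(app.name.toLowerCase(), app);
  }

  // "@travel-planner book me a flight" -> { agent, prompt }
  resolve(message: string): { agent: AgentApp; prompt: string } | null {
    const match = message.match(/^@([\w-]+)\s+(.*)$/s);
    if (!match) return null;
    const agent = this.agents.get(match[1].toLowerCase());
    return agent ? { agent, prompt: match[2] } : null;
  }
}

const registry = new AgentRegistry();
registry.register({
  name: "travel-planner",
  description: "Plans trips and drafts itineraries",
  run: async (prompt) => `Draft itinerary for: ${prompt}`,
});

async function handleMessage(message: string): Promise<string> {
  const routed = registry.resolve(message);
  if (routed) return routed.agent.run(routed.prompt); // dispatch to the AI app
  return "(fall through to the default chatbot model)";
}

handleMessage("@travel-planner 3 days in Kyoto in November").then(console.log);
```

The strategic point sits in that registry: whoever owns it owns discovery, ranking, and monetization for every AI app behind it.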
Headless execution
Daemon-like cloud agent: initiated server-side without a direct UI surface.
Initiated through a consumer-facing router agent: Siri/Alexa-like master agent, on-device or cloud-based.
This is the natural pattern for ambient agents.
But headless agents still require a human control surface: a dashboard, feed, confirmations, exception handling—possibly via messenger, a dedicated app, a master agent, or integration into existing super-apps.
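A minimal sketch of that control surface, assuming a simple risk split: the headless agent executes low-risk steps on its own and pushes high-risk ones to a confirmation surface such as a messenger card or dashboard item. All names here are hypothetical.

```typescript
// Minimal sketch: a headless agent that routes risky actions
// through a human confirmation surface. All names are hypothetical.

type Risk = "low" | "high";

interface AgentAction {
  description: string;
  risk: Risk;
  execute: () => Promise<void>;
}

// Stand-in for a real surface: messenger card, dashboard item, push notification.
async function requestConfirmation(action: AgentAction): Promise<boolean> {
  console.log(`[confirmation surface] Approve? ${action.description}`);
  return true; // pretend the user tapped "Approve"
}

async function runHeadless(actions: AgentAction[]): Promise<void> {
  for (const action of actions) {
    if (action.risk === "high") {
      const approved = await requestConfirmation(action);
      if (!approved) {
        console.log(`[agent] Skipped (declined): ${action.description}`);
        continue; // exception handling: surface it, don't act silently
      }
    }
    await action.execute();
    console.log(`[agent] Done: ${action.description}`);
  }
}

runHeadless([
  { description: "Draft reply to hotel email", risk: "low",
    execute: async () => {} },
  { description: "Pay $180 to confirm booking", risk: "high",
    execute: async () => {} },
]);
```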
Character-like interfaces
Some agents may live as “characters”—profile-pic-like identities, game-avatar metaphors, or persistent voice-mode embodiments. Virtual human agents will likely appear this way, embedded inside other apps and AI experiences.
And beyond these, we should expect many new interaction patterns to emerge.
Voice UX and ambient devices
Voice UX (especially headless + conversational) and ambient device UI (pins, AR glasses) will not behave like mobile/web apps. They may become the next primary consumer surface.
This is why big tech players that lack an OS-level consumer surface, Meta and OpenAI in particular, are betting hard here. A new tech wave creates new behavioral shifts, and those shifts put new UI surfaces up for grabs.
A few implications for voice and ambient UX:
The user’s ability to make explicit choices via a GUI becomes limited or disappears; voice can sustain even fewer options than an IVR phone menu.
Voice is temporal—interaction takes time—so instead of presenting many choices, the system will likely optimize for capturing intent and executing quickly.
The classic UX model—guiding users through fragmented GUI actions so they can “find and choose well”—loses power.
Key words here: personalization, defaults, intent.
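A toy sketch of what “defaults + intent” could look like in practice: rather than reading out options, the system fills missing slots from personalized defaults and asks for a single confirmation. The parser and defaults below are hypothetical stand-ins for what would really be an LLM plus a personalization layer.

```typescript
// Minimal sketch: intent-first voice UX. Instead of presenting choices,
// fill missing slots from personalized defaults, then confirm once.
// All names and the toy intent parser are hypothetical.

interface CoffeeOrder { drink?: string; size?: string; store?: string }

const personalDefaults: CoffeeOrder = {
  drink: "iced americano",  // learned from purchase history
  size: "tall",
  store: "the usual place near the office",
};

function parseUtterance(utterance: string): CoffeeOrder {
  // A real system would use an LLM here; this toy parser only spots one drink.
  return /latte/i.test(utterance) ? { drink: "latte" } : {};
}

function buildConfirmation(utterance: string): string {
  const heard = parseUtterance(utterance);
  const order = { ...personalDefaults, ...heard }; // defaults fill the gaps
  // One short confirmation instead of an IVR-style menu of options.
  return `Ordering a ${order.size} ${order.drink} from ${order.store}. OK?`;
}

console.log(buildConfirmation("get me a coffee"));
// -> "Ordering a tall iced americano from the usual place near the office. OK?"
```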
To make this work, we’ll need new UI/UX patterns—likely agent-native—and a deep personalization data foundation that can reconstruct user workflows end-to-end.
This will not resolve quickly. It’s a multi-year design and iteration problem.
The personalization data layer / agents
A central layer will emerge that either aggregates personal data or routes access to it.
Data sources include: calendars, email, messaging, social graphs, photo libraries, files, and behavioral logs (purchases, services, travel bookings, course consumption, etc.).
Forms could include reactive agents, ambient agents, MCP servers, or explicit data layers.
Key words: personalization (depth and type), privacy, trust, data ownership/sovereignty.
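As a rough sketch of the “routes access” variant, here is a toy data layer that gates agent queries behind per-source consent scopes. This is a simplified stand-in for what an MCP-style server might expose; none of these names are real APIs.

```typescript
// Minimal sketch: a personal data layer that routes agent queries to
// data sources behind per-source permission scopes. This is a simplified
// stand-in for an MCP-style server; all names are hypothetical.

type Scope = "calendar" | "email" | "purchases";

interface DataSource {
  scope: Scope;
  query: (q: string) => Promise<string[]>;
}

class PersonalDataLayer {
  private sources = new Map<Scope, DataSource>();
  private granted = new Set<Scope>();

  addSource(source: DataSource) { this.sources.set(source.scope, source); }
  grant(scope: Scope) { this.granted.add(scope); } // explicit user consent

  async query(scope: Scope, q: string): Promise<string[]> {
    if (!this.granted.has(scope)) {
      throw new Error(`Agent lacks consent for scope: ${scope}`);
    }
    const source = this.sources.get(scope);
    return source ? source.query(q) : [];
  }
}

const layer = new PersonalDataLayer();
layer.addSource({
  scope: "calendar",
  query: async () => ["Fri 19:00 dinner with Minji"],
});
layer.grant("calendar"); // user granted calendar, but not email or purchases

layer.query("calendar", "this week").then(console.log);                 // works
layer.query("email", "receipts").catch((e) => console.log(e.message)); // blocked
```

The consent set is where the trust and data-sovereignty questions live: whoever administers it effectively mediates every agent’s view of the user.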
On-device vs. cloud-based
Personalization will likely be hybrid.
On-device
Parts of “my agent” live locally.
On-device capture becomes critical: workflows, screen state changes, user choices, messaging data.
An on-device personalization model is also plausible.
Hybrid pipelines: capture locally, process/store selectively, personalize across layers.
Cloud-based
Backend processing and inference over captured data.
Handling cloud-resident personal data (email, messaging, photos, files) and proactive actions (drafting emails, requesting confirmations for purchases/tasks).
Natural home for ambient agents.
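A minimal sketch of the hybrid split described above, under the assumption that capture and redaction happen on-device and only a sanitized subset syncs to the cloud personalization backend. All names are hypothetical.

```typescript
// Minimal sketch of the hybrid split: capture everything on-device,
// redact locally, and sync only selected, sanitized events to the cloud
// personalization backend. All names are hypothetical.

interface CapturedEvent {
  kind: "screen" | "message" | "purchase";
  text: string;
  sensitive: boolean; // flagged by an on-device classifier in a real system
}

// On-device: strip content that should never leave the phone.
function redactLocally(event: CapturedEvent): CapturedEvent {
  return event.sensitive ? { ...event, text: "[redacted on device]" } : event;
}

// On-device policy: only some event kinds are worth syncing at all.
function shouldSync(event: CapturedEvent): boolean {
  return event.kind !== "message"; // e.g., keep raw messages local-only
}

// Cloud side: the backend sees only the sanitized, filtered stream.
function personalizeInCloud(events: CapturedEvent[]): string {
  return `Updated profile from ${events.length} sanitized events`;
}

const captured: CapturedEvent[] = [
  { kind: "purchase", text: "bought 2 flight tickets to Jeju", sensitive: false },
  { kind: "message", text: "private chat", sensitive: true },
  { kind: "screen", text: "viewed hotel booking page", sensitive: true },
];

const synced = captured.filter(shouldSync).map(redactLocally);
console.log(personalizeInCloud(synced)); // -> "Updated profile from 2 sanitized events"
```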
Examples and projects worth noting:
mem0, Context: tool-layer approaches that try to unify personal data.
Apple ReALM: Apple research on resolving references to on-screen entities, pointing toward a Siri that can use on-screen context and user activity as personalization data.
Scribe-style workflow recording: if workflow recording expands into default capture, then recording plus vision LLMs (VLMs) could become a powerful platform for understanding and personalizing individual workflows, though third parties lack the OS-level privilege Apple has.
Aside: if I let my imagination run: this “personalization data layer/agent” category feels like one of the few places a true “second-generation model”—Google-scale—could emerge. Today we still lack clear answers on data capture (OS-level first-party collection is ideal; third parties face constraints) and on how to translate raw data into durable personalization. But Web 1.0 also looked unclear before the two enabling breakthroughs—PageRank and keyword ads—made Google inevitable.
Key takeaways
Agents are spreading into B2C faster than most people in Korea currently feel on the ground. Today, consumer AI experiences still split between native web/apps and “browser-first” distribution (often as Chrome extensions). As AI browsers and chatbot-based agent ecosystems scale, we should expect a rapid migration toward those surfaces.
Native app/web AI apps and browser/chat-based AI apps have fundamentally different discoverability mechanics, which implies different GTM and marketing strategies. Early GPT stores remind me of the early App Store era, when trivial apps could explode because distribution was wide open. That said, as with the early App Store, not all of this is sustainable.
Beyond incumbents like Naver/Kakao, consumer AI players with meaningful DAU/MAU (e.g., WRTN, Liner) may want to explore the emerging “AI app store” opportunity and the personalization layer (agents/data layers). But there are still many open questions: how the market structure settles, how Korea vs. global dynamics evolve, and what becomes truly durable.
Two emerging “truths” seem directionally right so far:
AI apps don’t naturally have strong direct network effects.
Velocity/momentum can be a moat—until it isn’t.
Momentum is often time arbitrage, not inherent defensibility. Some leaders will convert momentum into data moats or indirect network effects, but momentum alone is not durable. Consumer apps currently winning on velocity (including Cluely, WRTN, Liner) need a deliberate strategy to translate that lead into sustainable moats.
Voice UX remains a large opportunity—with real difficulty. Whether Korean-language “locality” becomes a meaningful moat (as it sometimes did in Web 1.0) is still unclear. But there’s also a reverse argument: starting outside the U.S. can force multi-language excellence by default. ElevenLabs starting in Poland is a good reminder that “local constraints” can become global product strength.
In every tech shift, UI surfaces and UX patterns changed massively—and that created enormous opportunity (and many casualties). This cycle will be no different. For incumbents, it’s a threat. For startups, it’s a generational opening—especially in the personalization data layer/agent category.
Call for Startups
The purpose of sharing this thinking is straightforward. As an early-stage investor focused on Consumer + AI, I hope this series helps existing startups better leverage AI-driven shifts—and helps new founders reduce trial-and-error as they search for meaningful opportunities.
In that sense, this is Two Cents’ version of a Call for Startups.
If you are an early-stage founder or startup in Consumer + AI and believe you are onto something, my inbox is always open. Feel free to reach out via DM or email:
hur at hanriverpartners dot com

