Reimagining Search with Generative AI

Designing a conversational search experience that redefines how customers find and play audio.

my role

UX design lead for Alexa Audio

In collaboration with a voice designer, I led multimodal design for Alexa Audio’s conversational search across mobile, web, smart speakers, and TV. I shaped the early interaction model for conversational discovery and built a multimodal framework that bridged natural language, touch, and existing interfaces, increasing usability and engagement. I partnered closely with product and engineering teams to define and build the launch experience.

platforms

Mobile, Smart Speakers, TV

timeline

2 years • 2023-25

key contributions
  • UX Strategy & Design

  • User Research

  • Multimodal flows

  • Design-system management

  • Engineering Handoff

The Problem

"Alexa Speak," limited visuals, rising expectations

Customers had to use precise voice commands to find audio, with minimal visual feedback. LLMs like ChatGPT changed expectations: people wanted natural conversation. I needed to design a visual layer that complemented voice, enabling conversational audio discovery without slowing customers down.

“It’s really good when I know exactly what I want and what to say, but when I don’t, it’s terrible.”

Long-time Alexa Customer, Music & Podcast research study

The solution

Where Voice Meets Visual: Reimagined Audio Discovery

Visuals that complement the voice experience and help customers learn about artists and discover content seamlessly. I bridged natural language and touch, built a framework that unified domain and conversational surfaces, and introduced adaptive card designs optimized for long titles and varying metadata.

intelligent routing

Conversational interface that’s there when you need it

Conversational queries surface rich visual results that customers can browse as cards, driving a 20% engagement lift.

enabling quick actions

And fades into the background when you don’t

Direct playback when you know what you want. Conversation isn't the goal. This approach protected listening minutes while improving task success.

Design system additions

Scalable card system across all content types

Built flexible components that adapt to different content needs, from quick song selection to considered podcast browsing. Systematic design saved 52 dev hours while improving browsability.

engagement & discovery opportunities

From noise to value: a framework for effective related content

I created a framework to surface only hyper-relevant related results. When a partner team misapplied the model and overloaded results with low-value content, we collaborated to refine the rules and ensure quality. This change contributed to a 67% engagement increase for conversational recommendations.

Before: Limited Discovery

Search responses surfaced only the primary item, limiting exploration and discovery.

Misapplied: Too much noise

Partner team overloaded responses with tangential, low-value items, reducing trust.

Refined: Hyper-relevant content

Applied strict quality filters to surface fewer, higher-value items.

enhancing discovery & engagement

Smart suggestions that extend the conversation

Customers didn't know what to ask the new Alexa. Data-driven suggestion pills increased engagement by surfacing relevant follow-ups for both voice and tap interaction.

optimized across platforms

Unified experience across mobile, TV, and web

Adapted interaction patterns for each platform's strengths while maintaining consistency. The result: a 4x increase in conversational audio discovery on TV.

Impact

Engagement, Efficiency, & Recognition

The new experience increased listening engagement, improved perceived defect rating, and saved development time through reusable components. It was featured at the Alexa+ launch event and set the foundation for future multimodal generative experiences.

“Alexa knocked it out of the park. Connected me with a podcast I didn't know existed and is right up my interest spectrum. I started playing it on the spot!”

participant 6, in-person testing

Engagement

Increase in listening hours

Usability

Below CPDR target

Resource savings

Weeks of dev time saved with the design system

Showcase

Feature showcased at the Alexa+ launch event

Design Approach

Key insights, decisions, and trade-offs that shaped the final solution

early exploration

Exploring Agent Presentation Models

Agent as co-actor
Acts within and augments existing UI
Easiest adaptation and most feasible

Agent as guide
Chat UI in deterministic location

Agent is the UI
Everything occurs in "agent space"

I influenced and collaborated with horizontal teams to help deliver a model that balanced feasibility and customer familiarity while paving the way for richer generative interactions.

Synthesizing research

Mapping search mindsets

Transactional
Most common (~70% of cases)

Conversational
More common with Alexa+

Open & Conversational
Least common

I used primary and secondary research to understand customer mindsets, goals, perceptions of success, and effort tolerance while searching for audio content. The findings helped identify the core actions customers take on content.

early iterative collaboration

Principles that Shaped Early Exploration

Early wireframes were grounded in first principles for multimodal design and in primary research. I created vignettes to gain early alignment across product and engineering.

“I’d rather look through large sets of results on a search results page, but that page also is not super magical.”

participant 3, in-person interviews

Principle 1

Simplify the GUI to complement voice

Visuals clarify results and aid discovery. A summarized voice response and a limited set of personalized results minimize cognitive load.

Principle 2

Conversation is not the end goal

Visuals help customers complete goals faster, offer fallback control, and support quick, intuitive touch interactions like deep browsing.