Reimagining Search with Generative AI
Designing a conversational search experience that redefines how customers find and play audio.
The Problem
"Alexa Speak," limited visuals, rising expectations
Customers had to use precise voice commands to find audio, with minimal visual feedback. LLMs like ChatGPT changed expectations: people wanted natural conversation. I needed to design a visual layer that complemented voice, enabling conversational audio discovery without slowing customers down.
“It’s really good when I know exactly what I want and what to say, but when I don’t, it’s terrible.”
Long-time Alexa Customer, Music & Podcast research study
The Solution
Where Voice Meets Visual: Reimagined Audio Discovery
Visuals that complement the voice experience and help customers learn about artists and discover content seamlessly. I bridged natural language and touch, built a framework that unified domain and conversational surfaces, and introduced adaptive card designs optimized for long titles and varying metadata.
Design system additions
Scalable card system across all content types
Built flexible components that adapt to different content needs, from quick song selection to considered podcast browsing. Systematic design saved 52 dev hours while improving browsability.
Engagement & discovery opportunities
From noise to value: a framework for effective related content
I created a framework to surface only hyper-relevant related results. When a partner team misapplied the model and overloaded results with low-value content, we collaborated to refine the rules and ensure quality. This change contributed to a 67% engagement increase for conversational recommendations.
Enhancing discovery & engagement
Smart suggestions that extend the conversation
Customers didn't know what to ask the new Alexa. Data-driven suggestion pills increased engagement by surfacing relevant follow-ups for both voice and tap interaction.
Optimized across platforms
Unified experience across mobile, TV, and web
Adapted interaction patterns for each platform's strengths while maintaining consistency. The result: a 4x increase in conversational audio discovery on TV.
Impact
Engagement, Efficiency, & Recognition
The new experience increased listening engagement, improved the perceived defect rating, and saved development time through reusable components. It was featured at the Alexa+ launch event and set the foundation for future multimodal generative experiences.
“Alexa knocked it out of the park. Connected me with a podcast I didn't know existed and is right up my interest spectrum. I started playing it on the spot!”
Participant 6, in-person testing
Design Approach
Key insights, decisions, and trade-offs that shaped the final solution
Early exploration
Exploring Agent Presentation Models
Agent as Co-Actor
Acts within and augments existing UI
Easiest adaptation & most feasible
Agent as Guide
Chat UI in deterministic location
Agent is the UI
Everything occurs in "agent space"
Influenced and collaborated with horizontal teams to help deliver a model that balanced feasibility and customer familiarity while paving the way for richer generative interactions.
Synthesizing research
Mapping search mindsets
Transactional
Most common (~70% of cases)
Conversational
More common with Alexa+
Open & Conversational
Least common
Used primary and secondary research to understand customer mindsets, goals, perception of success, and effort tolerance while searching for audio content. The findings were then used to identify the core actions customers take on content.
Early iterative collaboration
Principles that Shaped Early Exploration
Early wireframes were grounded in first principles for multimodal design and in primary research. I created vignettes to gain early alignment across product and engineering.
“I’d rather look through large sets of results on a search results page, but that page also is not super magical.”
Participant 3, in-person interviews
Principle 1
Simplify the GUI to complement voice
Visuals clarify results and aid discovery. A summarized voice response and a limited set of personalized results minimize cognitive load.
Principle 2
Conversation is not the end goal
Visuals help customers complete goals faster, offer fallback control, and support quick, intuitive touch interactions like deep browsing.