Can an AI Interviewer do Your Concept Testing?


PUBLISHED Feb 6, 2026 · READ ON HERBIG.CO

Dear Reader,

What happens when you let an AI do the concept testing of design variations? I used Reforge's AI Concept testing* to find that out.

As a prototype, I chose an app idea that lets you document tasting notes from brewing specialty coffee. And I need your help exploring how AI-led interviews feel for participants (and improving the app design ☕️). I mostly went with Reforge's defaults, which looked good enough for what I wanted to learn from this experiment.

While I wait for your interview results to roll in (I'll share them, along with a link to the full prototype, next week), today I'm gonna cover more fundamental questions about using AI interviewers for tasks such as generative research and evaluative testing.

There are a few AI-based interviewer tools out there. You might have read Teresa Torres's take on participating in an AI-led user interview from Anthropic.

Without having participated in the interview myself, I agree with Teresa's take on the quality and nature of some of the questions she received, which made me even more curious when Reforge announced an expansion of their existing AI interviewing tool to concept testing.


In my coaching practice and workshops, I often discuss the pros and cons of augmenting existing discovery workflows with more efficient, and sometimes even more effective, AI tools. I believe AI interviewers can help scale your research efforts up to a point, as long as they are based on high-quality inputs and improve over time. Where I think the jury is still out: What is the impact on the quality of human responses when people know they are talking to a machine?

But putting that aside for a moment, I think how Reforge markets this capability points to an important distinction that remains valid even for AI-scaled practices: when to use which method or tool to answer which questions in the generative OR evaluative parts of your discovery efforts.

Here's how I view Reforge's positioning of concept testing:

  1. Choosing between strategic directions: I'm not sure I would agree with the term strategic here; I believe you can choose between design directions. I'm a believer that You can't A/B test your way to product-market fit, and maybe you don't want to bet all your chips on a design or value-prop communication based on (scaled) interview feedback. But you WILL learn which one confuses or delights people more, what they associate it with, or simply which design resonates and why.
  2. Testing messaging variations before design investment: Absolutely. This looks like a slam-dunk opportunity, as long as the tool randomizes the order in which the concept variations are shown (to avoid comparison bias). Take the interview and tell me whether the variant order was randomized.
  3. Validating problem-solution fit before prototypes: "Would users actually adopt it?" is a red flag in interview-style research, regardless of scale, because humans suck at predicting their future behavior - especially when it comes to trading time or money for a new solution. Concept testing gives you insight into fundamental customer sentiment about a visual design and into gaps in understanding. But it won't help you predict what will get adopted or bought. To get strong evidence about feature adoption or willingness to pay, you need to turn to behavioral methods, not attitudinal methods like interviews (no matter whether a human or an AI conducts them).

To be clear, this is by no means a dunk on Reforge's work. I appreciate them pushing what's possible. I use this merely to explore and share which views and principles are still true and which need updating in the face of more AI-enabled workflows.

I'm excited about the questions, conversations, and possibilities these tools enable for product management practices across the board. I will see you next week for the verdict on how AI-led concept testing feels for interviewees.

*I have no affiliation with Reforge

Thank you for Practicing Product,

Tim

PS: I chatted with David Pereira about Why Frameworks Fail.

Free Webinar: Feb 11, 5:00 PM CET | 11:00 AM EST | 8:00 AM PT

Go From "We Need a Strategy" to "Here's Exactly What We're Doing and Why" — In 60 Minutes

Walk away with a pragmatic system to create, facilitate, and communicate a Product Strategy that actually drives decisions across your entire organization.

If you consume one thing this week, make it this...

Synthesizing qual, quant, and strategy with Claude Code + PostHog MCP by Else van der Berg

Two things separate PMs who stare at dashboards from PMs who act:

  1. Interpreting what you see. “Is this normal?” “What does this mean?”
  2. Connecting quant to qual: what users do vs. what they say

Your quantitative data lives in PostHog (or Amplitude, or Mixpanel). Your qualitative data lives in Notion, Dovetail, or scattered Google Docs. Your deployment history is in Linear. Your knowledge of seasonality patterns lives in your head.

When product analytics tools started launching their in-app AI assistants, I was genuinely excited about how they could help solve the first problem. The latest iterations in particular (I'm looking at you, PostHog AI!) are genuinely good at natural language queries.

But none of these in-app AI agents can touch the second problem, because they only see what's inside their own tool.

This article is about solving both at once: plugging a product analytics MCP into Claude Code, where it can query your data and access your qualitative context in the same conversation.

Who is Tim Herbig?

As a Product Management Coach, I guide Product Teams to measure the real progress of their evidence-informed decisions.

I focus on better practices to connect the dots of Product Strategy, Product OKRs, and Product Discovery.

Product Practice Newsletter

1 tip & 3 resources per week to improve your Strategy, OKRs, and Discovery practices in less than 5 minutes. Explore my new book on realprogressbook.com
