Can an AI Interviewer do Your Concept Testing?



PUBLISHED Feb 6, 2026 · READ ON HERBIG.CO

Dear Reader,

What happens when you let an AI run the concept testing of design variations? I used Reforge's AI Concept Testing* to find out.

As a prototype, I chose an app idea that lets you document taste notes from brewing specialty coffee. And I need your help exploring how AI-led interviews feel for participants (and improving the app design ☕️). I mostly went with Reforge's defaults, which looked good enough for what I wanted to learn from this experiment.

While I wait for your interview results to roll in and share them (and a link to the full prototype) next week, I'm gonna cover more fundamental questions about using AI interviewers for tasks such as generative research and evaluative testing today.

There are a few AI-based interviewer tools out there. You might have read Teresa Torres's take on participating in an AI-led user interview from Anthropic.

Without having participated in the interview myself, I agree with Teresa's take on the quality and nature of some of the questions she received. Which made me even more curious when Reforge announced an expansion of their existing AI interviewing tool to concept testing.


In my coaching practice and workshops, I often discuss the pros and cons of augmenting existing discovery workflows with more efficient, and sometimes even more effective, AI tools. I believe AI interviewers can help scale your research efforts to a certain point when they are based on high-quality inputs and improve over time. What I think the jury is still out on is: What is the impact on the quality of human responses when they know that they are talking to a machine?

But putting that aside for a moment, I think how Reforge markets this capability points to an important distinction that is still valid even for AI-scaled practices: When to use a method or tool to answer which questions for which part of your generative OR evaluative Discovery efforts.

Here's how I view Reforge's positioning of concept testing:

  1. Choosing between strategic directions: Not sure I would agree with the term strategic here. I believe you can choose between design directions. I'm a believer that You can't A/B test your way to product-market fit. And maybe you don't want to bet all your chips on that design or value-prop communication based on (scaled) interview feedback. But you WILL learn which one confuses or delights people more, what they associate it with, or simply which design resonates for what reason.
  2. Testing messaging variations before design investment: Absolutely. Looks like a slam-dunk opportunity, as long as the tool randomizes the order in which the concept variations are shown (to avoid comparison bias). Take the interview and tell me whether the variant order was randomized for you.
  3. Validating problem-solution fit before prototypes: "Would users actually adopt it?" is a red flag in interview-style research, regardless of scale. Because humans suck at predicting future behavior - especially when it comes to trading time or money for a new solution. Concept testing gives you insights into the fundamental customer sentiments about a visual design or gaps in understanding. But they won't help you predict what will get adopted or bought. To get strong evidence about feature adoption or willingness-to-pay, you need to turn to behavioral methods, not attitudinal methods like interviews (no matter if a human or an AI does them).
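The order randomization mentioned in point 2 is cheap to get right. Here is a minimal sketch of one way to do it; the function name and the per-participant seeding approach are my own illustration, not anything Reforge has documented:

```python
import random

def variant_order(participant_id: str, variants: list[str]) -> list[str]:
    """Return a randomized presentation order for one participant.

    Seeding the RNG with the participant ID keeps the order stable for
    that person across sessions, while still varying it across the panel,
    so no single concept systematically benefits from being seen first.
    """
    rng = random.Random(participant_id)  # deterministic per participant
    order = variants.copy()
    rng.shuffle(order)
    return order

# Each participant gets their own ordering of the same three concepts.
print(variant_order("participant-42", ["Concept A", "Concept B", "Concept C"]))
```

Checking afterwards that first-position exposure is roughly even across participants is a quick sanity test that the randomization actually worked.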

To be clear, this is by no means a dunk on Reforge's work. I appreciate them pushing what's possible. I use this merely to explore and share which views and principles are still true and which need updating in the face of more AI-enabled workflows.

I'm excited about the questions, conversations, and possibilities these tools enable for product management practices across the board. I will see you next week for the verdict on how AI-led concept testing feels for interviewees.

*I have no affiliation with Reforge

Thank you for Practicing Product,

Tim

PS: I chatted with David Pereira about Why Frameworks Fail

Free Webinar: Feb 11 5:00 PM CET, 11:00 AM EST, 8:00 AM PT

Go From "We Need a Strategy" to "Here's Exactly What We're Doing and Why" — In 60 Minutes

Walk away with a pragmatic system to create, facilitate, and communicate a Product Strategy that actually drives decisions across your entire organization.

If you consume one thing this week, make it this...

Synthesizing qual, quant, and strategy with Claude Code + PostHog MCP by Else van der Berg

Two things separate PMs who stare at dashboards from PMs who act:

  1. Interpreting what you see: “Is this normal?” “What does this mean?”
  2. Connecting quant to qual: what users do vs. what they say

Your quantitative data lives in PostHog (or Amplitude, or Mixpanel). Your qualitative data lives in Notion, Dovetail, or scattered Google Docs. Your deployment history is in Linear. Your knowledge of seasonality patterns lives in your head.

When product analytics tools started launching their in-app AI assistants, I was genuinely excited about how they could help solve the first problem. The latest iterations especially (I’m looking at you, PostHog AI!) are genuinely good at natural-language queries.

But none of these in-app AI agents can touch the second problem, because they only see what’s inside their own tool.

This article is about solving both at once: plugging a product analytics MCP into Claude Code, where it can query your data and access your qualitative context in the same conversation.
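For orientation, the plumbing the article describes can be as small as a project-level `.mcp.json` file that Claude Code picks up. The exact config shape and the PostHog endpoint below are assumptions for illustration; check the current Claude Code and PostHog MCP docs before copying:

```json
{
  "mcpServers": {
    "posthog": {
      "type": "http",
      "url": "https://mcp.posthog.com/mcp",
      "headers": {
        "Authorization": "Bearer ${POSTHOG_API_KEY}"
      }
    }
  }
}
```

With something like this in place, the analytics server's query tools and your qualitative notes (markdown exports, interview summaries) sit in the same Claude Code conversation, which is the whole point of the article.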

Who is Tim Herbig?

As a Product Management Coach, I guide Product Teams to measure the real progress of their evidence-informed decisions.

I focus on better practices to connect the dots of Product Strategy, Product OKRs, and Product Discovery.

Product Practice Newsletter

1 tip & 3 resources per week to improve your Strategy, OKRs, and Discovery practices in less than 5 minutes. Explore my new book on realprogressbook.com
