Can an AI Interviewer do Your Concept Testing?



PUBLISHED Feb 6, 2026 · READ ON HERBIG.CO

Dear Reader,

What happens when you let an AI do the concept testing of design variations? I used Reforge's AI concept testing* to find out.

As a prototype, I chose an app idea that lets you document tasting notes from brewing specialty coffee. And I need your help exploring how AI-led interviews feel for participants (and improving the app design ☕️). I mostly went with Reforge's defaults, which looked good enough for what I wanted to learn from this experiment.

While I wait for your interview results to roll in (I'll share them, plus a link to the full prototype, next week), today I'm gonna cover more fundamental questions about using AI interviewers for tasks such as generative research and evaluative testing.

There are a few AI-based interviewer tools out there. You might have read Teresa Torres's take on participating in an AI-led user interview from Anthropic.

Without having participated in that interview myself, I agree with Teresa's take on the quality and nature of some of the questions she received, which made me even more curious when Reforge announced an expansion of their existing AI interviewing tool to concept testing.


In my coaching practice and workshops, I often discuss the pros and cons of leveraging existing discovery workflows with more efficient and sometimes even more effective AI tools. I believe AI interviewers can help scale your research efforts to a certain point when they are based on high-quality inputs and improve over time. What I think the jury is still out on is: What is the impact on the quality of human responses when they know that they are talking to a machine?

But putting that aside for a moment, I think how Reforge markets this capability points to an important distinction that is still valid even for AI-scaled practices: When to use a method or tool to answer which questions for which part of your generative OR evaluative Discovery efforts.

Here's how I view Reforge's positioning of concept testing:

  1. Choosing between strategic directions: Not sure I would agree with the term strategic here. I believe you can choose between design directions. I'm a believer that You can't A/B test your way to product-market fit. And maybe you don't want to bet all your chips on that design or value-prop communication based on (scaled) interview feedback. But you WILL learn which one confuses or delights people more, what they associate it with, or simply which design resonates for what reason.
  2. Testing messaging variations before design investment: Absolutely. Looks like a slam dunk opportunity - as long as the tool randomizes the order in which the concept variations are shown (to avoid comparison bias). Take the interview and tell me whether the variant order was randomized!
  3. Validating problem-solution fit before prototypes: "Would users actually adopt it?" is a red flag in interview-style research, regardless of scale. Because humans suck at predicting future behavior - especially when it comes to trading time or money for a new solution. Concept testing gives you insights into fundamental customer sentiment about a visual design or gaps in understanding. But it won't help you predict what will get adopted or bought. To get strong evidence about feature adoption or willingness-to-pay, you need to turn to behavioral methods, not attitudinal methods like interviews (no matter whether a human or an AI conducts them).

To be clear, this is by no means a dunk on Reforge's work. I appreciate them pushing what's possible. I use this merely to explore and share which views and principles are still true and which need updating in the face of more AI-enabled workflows.

I'm excited about the questions, conversations, and possibilities these tools enable for product management practices across the board. I will see you next week for the verdict on how AI-led concept testing feels for interviewees.

*I have no affiliation with Reforge

Thank you for Practicing Product,

Tim

P.S.: I chatted with David Pereira about Why Frameworks Fail

Free Webinar: Feb 11 5:00 PM CET, 11:00 AM EST, 8:00 AM PT

Go From "We Need a Strategy" to "Here's Exactly What We're Doing and Why" — In 60 Minutes

Walk away with a pragmatic system to create, facilitate, and communicate a Product Strategy that actually drives decisions across your entire organization.

If you consume one thing this week, make it this...

Synthesizing qual, quant, and strategy with Claude Code + PostHog MCP by Else van der Berg

Two things separate PMs who stare at dashboards from PMs who act:

  1. Interpreting what you see: "Is this normal?" "What does this mean?"
  2. Connecting quant to qual: what users do vs. what they say

Your quantitative data lives in PostHog (or Amplitude, or Mixpanel). Your qualitative data lives in Notion, Dovetail, or scattered Google Docs. Your deployment history is in Linear. Your knowledge of seasonality patterns lives in your head.

When product analytics tools started launching their in-app AI assistants, I was genuinely excited about how they could help solve the first problem. Especially the latest iterations (I'm looking at you, PostHog AI!) are genuinely good at natural language queries.

But none of these in-app AI agents can touch the second problem, because they only see what's inside their own tool.

This article is about solving both at once: plugging a product analytics MCP into Claude Code, where it can query your data and access your qualitative context in the same conversation.
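For readers who want to picture what that setup looks like: Claude Code can pick up MCP servers from a project-level `.mcp.json` file. The sketch below is a minimal, hypothetical example of registering a PostHog MCP server that way; the endpoint URL, config keys, and environment variable name are assumptions on my part, so check the current PostHog and Claude Code MCP docs before copying it.

```shell
# Hypothetical sketch: register a remote PostHog MCP server for Claude Code
# via a project-level .mcp.json. URL and env var name are assumptions; verify
# them against PostHog's MCP documentation before use.
cat > .mcp.json <<'EOF'
{
  "mcpServers": {
    "posthog": {
      "type": "http",
      "url": "https://mcp.posthog.com/mcp",
      "headers": {
        "Authorization": "Bearer ${POSTHOG_API_KEY}"
      }
    }
  }
}
EOF
# Start Claude Code in the same directory; it reads .mcp.json and can then
# query your analytics data in the same conversation as your qualitative
# notes (e.g., interview summaries living in the repo).
```

The appeal of this layout is exactly what the article argues: the analytics MCP and your qualitative context end up in one conversation instead of two separate tools.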

Who is Tim Herbig?

As a Product Management Coach, I guide Product Teams to measure the real progress of their evidence-informed decisions.

I focus on better practices to connect the dots of Product Strategy, Product OKRs, and Product Discovery.

Product Practice Newsletter

1 tip & 3 resources per week to improve your Strategy, OKRs, and Discovery practices in less than 5 minutes. Explore my new book on realprogressbook.com
