Can an AI Interviewer do Your Concept Testing?



PUBLISHED Feb 6, 2026 · READ ON HERBIG.CO

Dear Reader,

What happens when you let an AI do the concept testing of design variations? I used Reforge's AI Concept Testing* to find out.

As a prototype, I chose an app idea that lets you document taste notes from brewing specialty coffee. And I need your help exploring how AI-led interviews feel for participants (and improving the app design ☕️). I mostly went with their defaults, which looked good enough for what I wanted to learn from this experiment:

While I wait for your interview results to roll in and share them (and a link to the full prototype) next week, I'm gonna cover more fundamental questions about using AI interviewers for tasks such as generative research and evaluative testing today.

There are a few AI-based interviewer tools out there. You might have read Teresa Torres's take on participating in an AI-led user interview from Anthropic:

Without having participated in that interview myself, I agree with Teresa's take on the quality and nature of some of the questions she received, which made me even more curious when Reforge announced an expansion of their existing AI interviewing tool to concept testing:


In my coaching practice and workshops, I often discuss the pros and cons of augmenting existing discovery workflows with more efficient, and sometimes even more effective, AI tools. I believe AI interviewers can help scale your research efforts up to a point, provided they are based on high-quality inputs and improve over time. Where I think the jury is still out: What is the impact on the quality of human responses when people know they are talking to a machine?

But putting that aside for a moment, I think how Reforge markets this capability points to an important distinction that is still valid even for AI-scaled practices: When to use a method or tool to answer which questions for which part of your generative OR evaluative Discovery efforts.

Here's how I view Reforge's positioning of concept testing:

  1. Choosing between strategic directions: Not sure I would agree with the term strategic here. I believe you can choose between design directions. I'm a believer that You can't A/B test your way to product-market fit. And maybe you don't want to bet all your chips on that design or value-prop communication based on (scaled) interview feedback. But you WILL learn which one confuses or delights people more, what they associate this with, or simply which design resonates for what reason.
  2. Testing messaging variations before design investment: Absolutely. This looks like a slam-dunk opportunity, as long as the tool randomizes the order in which the concept variations are shown (to avoid comparison bias). Take the interview and tell me whether the variant order was randomized!
  3. Validating problem-solution fit before prototypes: "Would users actually adopt it?" is a red flag in interview-style research, regardless of scale. Because humans suck at predicting future behavior - especially when it comes to trading time or money for a new solution. Concept testing gives you insights into the fundamental customer sentiments about a visual design or gaps in understanding. But they won't help you predict what will get adopted or bought. To get strong evidence about feature adoption or willingness-to-pay, you need to turn to behavioral methods, not attitudinal methods like interviews (no matter if a human or an AI does them).
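As an aside on the randomization point above: here's a minimal Python sketch of how a per-participant presentation order could be derived. The variant names and the seeding scheme are my own illustration, not how Reforge's tool works.

```python
import random

# Hypothetical concept variants being tested
VARIANTS = ["concept_a", "concept_b", "concept_c"]

def variant_order(participant_id: str) -> list[str]:
    """Return a randomized presentation order for one participant.

    Seeding the RNG with the participant ID keeps the order
    reproducible for that participant (useful when resuming a
    session) while still varying it across participants, which
    counters order/comparison bias in aggregate.
    """
    rng = random.Random(participant_id)
    order = VARIANTS.copy()
    rng.shuffle(order)
    return order
```

The key property is that no single variant is systematically seen first across the whole panel.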

To be clear, this is by no means a dunk on Reforge's work. I appreciate them pushing what's possible. I use this merely to explore and share which views and principles are still true and which need updating in the face of more AI-enabled workflows.

I'm excited about the questions, conversations, and possibilities these tools enable for product management practices across the board. I will see you next week for the verdict on how AI-led concept testing feels for interviewees.

*I have no affiliation with Reforge

Thank you for Practicing Product,

Tim

PS: I chatted with David Pereira about Why Frameworks Fail

Free Webinar: Feb 11 5:00 PM CET, 11:00 AM EST, 8:00 AM PT

Go From "We Need a Strategy" to "Here's Exactly What We're Doing and Why" — In 60 Minutes

Walk away with a pragmatic system to create, facilitate, and communicate a Product Strategy that actually drives decisions across your entire organization.

If you consume one thing this week, make it this...

Synthesizing qual, quant, and strategy with Claude Code + PostHog MCP by Else van der Berg

Two things separate PMs who stare at dashboards from PMs who act:

  1. Interpreting what you see: "Is this normal?" "What does this mean?"
  2. Connecting quant to qual: what users do vs. what they say

Your quantitative data lives in PostHog (or Amplitude, or Mixpanel). Your qualitative data lives in Notion, Dovetail, or scattered Google Docs. Your deployment history is in Linear. Your knowledge of seasonality patterns lives in your head.

When product analytics tools started launching their in-app AI assistants, I was genuinely excited about how they could help solve the first problem. Especially the latest iterations (I'm looking at you, PostHog AI!) are impressively good at natural language queries.

But none of these in-app AI agents can touch the second problem, because they only see what's inside their own tool.

This article is about solving both at once: plugging a product analytics MCP into Claude Code, where it can query your data and access your qualitative context in the same conversation.
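For orientation, wiring an MCP server into Claude Code typically means adding an entry to a `.mcp.json` file in your project root. Here's a rough sketch of the shape such a config takes; the package name and environment variable below are placeholders, so check PostHog's and Anthropic's documentation for the actual server and credentials:

```json
{
  "mcpServers": {
    "posthog": {
      "command": "npx",
      "args": ["-y", "posthog-mcp-server"],
      "env": {
        "POSTHOG_API_KEY": "phx_your_key_here"
      }
    }
  }
}
```

Once registered, Claude Code can call the server's tools (e.g., running analytics queries) in the same conversation where it reads your qualitative notes.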

Who is Tim Herbig?

As a Product Management Coach, I guide Product Teams to measure the real progress of their evidence-informed decisions.

I focus on better practices to connect the dots of Product Strategy, Product OKRs, and Product Discovery.

Product Practice Newsletter

1 tip & 3 resources per week to improve your Strategy, OKRs, and Discovery practices in less than 5 minutes. Explore my new book on realprogressbook.com
