I’ve seen this play out so many times.

Teams new to building AI features push back hard when they’re told they need to hand-label data or manually inspect errors.

They drag their feet and try every other way to improve the product first (tweak the model, rewrite the prompts, add more data), anything but that.

It feels backwards: isn’t the whole point of AI to avoid tedious manual work?

The truth is, these are some of the highest-ROI activities you can do, and yes, they are manual! They’re how you learn where your system is failing, where your definitions are fuzzy, and what “good” actually means.

And yet, teams still resist it:

Comeback #1: “We don’t have time for manual work.” Truth: Without this step, the non-manual work is largely wasted. Looking at the data is how you find and fix your product’s failure modes.

Comeback #2: “We can automate labeling.” Truth: You can only automate after you’ve defined a ground truth. Otherwise, you just scale your confusion.

Comeback #3: “It’s just tedious grunt work.” Truth: It’s where your product understanding deepens. Manual review is where the team’s mental model of “good” gets built.
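
If that still sounds abstract, here’s roughly what “getting your hands dirty” looks like in practice. This is a minimal sketch, not a prescription: the file names and fields (“traces.jsonl”, “input”, “output”) are made up, so adapt them to whatever your system actually logs. It samples a few dozen outputs and walks a reviewer through labeling each one, with a note on *why* it passed or failed.

```python
import csv
import json
import random

# Hypothetical input: one JSON object per line with "input" and "output" fields.
# The file name and field names are assumptions for this sketch.
SAMPLE_SIZE = 30

with open("traces.jsonl") as f:
    traces = [json.loads(line) for line in f]

random.seed(0)  # fixed seed so every reviewer sees the same sample
sample = random.sample(traces, min(SAMPLE_SIZE, len(traces)))

rows = []
for i, trace in enumerate(sample, start=1):
    print(f"\n--- {i}/{len(sample)} ---")
    print("INPUT: ", trace["input"])
    print("OUTPUT:", trace["output"])

    label = ""
    while label not in {"pass", "fail"}:
        label = input("Label (pass/fail): ").strip().lower()
    note = input("Why? ").strip()  # the notes are where your definition of "good" gets sharpened

    rows.append({"input": trace["input"], "output": trace["output"],
                 "label": label, "note": note})

with open("labels.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["input", "output", "label", "note"])
    writer.writeheader()
    writer.writerows(rows)

print(f"\nWrote {len(rows)} hand labels to labels.csv")
```

Once a file like labels.csv exists, automation starts to make sense: you can, for example, compare an automated judge’s labels against your hand labels and only trust it at scale once the two mostly agree. That’s the ordering Comeback #2 gets backwards.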

I’ve yet to see a great AI product come together without someone, at some point, getting their hands dirty with the data: labeling, inspecting, noticing.

You have to define what you’re after first, and that’s not something you can automate.