The hardest decision in AI product work
Killing a feature is never fun. Killing an AI feature is uniquely painful. There's been more investment in it than the sticker price suggests — months of model selection, prompt iteration, eval suite construction, infrastructure work, and the kind of organizational learning that doesn't show up on a dashboard. There's usually a champion on the team who genuinely believes it's almost there. And there's the lurking suspicion that the next model release might be the one that finally makes it work.
I've been in too many of these conversations. The pattern is consistent: teams keep AI features alive long past the point where the data suggests they should let them go. Sometimes the feature does eventually start working. More often, the team spends another quarter on it and ends up exactly where they would have been if they'd shut it down six months earlier — except now they've also spent six more months of engineering capacity on it.
The teams that handle this well have a clear framework for making the call. Here's what I've seen actually work.
The four warning signs that earn an honest review
No single one of these means a feature is dying. But if you see two or three at once, it's time to schedule a real conversation about whether the project should continue.
Sign 1 — The eval scores plateaued months ago
Every AI feature improves rapidly at first as you iterate on prompts, retrieval, and the eval suite itself. Then improvement slows, and at some point it stops. If your evals have been flat for two or three months despite continued effort, you're not in the early-iteration phase anymore — you've found the natural ceiling for your current approach. The remaining gap to "good enough" is unlikely to close without a fundamentally different approach.
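The plateau check doesn't require anything fancy. A minimal sketch, assuming you log one aggregate eval score per month; the window size and gain threshold here are illustrative, not recommendations, and you should pick numbers that match your own release cadence:

```python
def has_plateaued(monthly_scores, window=3, min_gain=0.01):
    """Return True if the aggregate eval score gained less than
    `min_gain` over the last `window` months."""
    if len(monthly_scores) < window + 1:
        return False  # not enough history to judge yet
    recent = monthly_scores[-(window + 1):]
    return (recent[-1] - recent[0]) < min_gain

# Rapid early gains, then flat for the last three months:
scores = [0.52, 0.68, 0.74, 0.76, 0.762, 0.761, 0.763]
has_plateaued(scores)  # True: you've hit the ceiling for this approach
```

The point isn't the arithmetic. It's that once this check is automated, "we've been flat for months" becomes a fact on a dashboard rather than an argument in a meeting.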
Sign 2 — Users aren't using it the way you expected
Every successful AI feature has a moment where you look at actual user behavior and say "oh, that's what they're doing with it." Sometimes it matches what you designed for. Sometimes it doesn't, but reveals a better use case. Sometimes — and this is the warning sign — usage is just thin. Users tried it, found it didn't fit their workflow, and stopped engaging. No amount of model improvement fixes a feature that doesn't match a real workflow.
Sign 3 — The cost-per-successful-outcome isn't trending in the right direction
If you're spending $X per successful interaction today and you were spending $X six months ago, you have a problem that won't fix itself. The feature might be stable in absolute terms, but every other AI feature in the world is getting cheaper as models get better and infrastructure improves. Standing still is falling behind.
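It's worth being strict about the denominator here. A quick sketch, with hypothetical numbers, of the metric as I mean it: spend divided by interactions that actually succeeded, not by raw request volume, which flatters the feature:

```python
def cost_per_success(total_spend, successful_interactions):
    """Dollars spent per interaction that actually succeeded.
    Dividing by raw request count hides failures and retries."""
    if successful_interactions == 0:
        return float("inf")  # all spend, no outcomes
    return total_spend / successful_interactions

# Illustrative numbers: volume grew, but the unit economics didn't move.
six_months_ago = cost_per_success(12_000, 8_000)   # $1.50
today          = cost_per_success(15_000, 10_000)  # $1.50, flat: a warning sign
```

If that number hasn't moved in six months while model prices have dropped around you, the feature is quietly losing ground.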
Sign 4 — The team has stopped having new ideas
This one is subtle but reliable. Healthy AI projects generate a constant stream of "what if we tried..." conversations. Stuck projects don't. When the team meeting about the feature has become routine maintenance — same prompt tweaks, same eval re-runs, same incremental fine-tuning — without anyone proposing meaningfully new approaches, the creative phase is over. What's left is grinding, and grinding rarely produces breakthroughs.
The structured decision process
Once you've decided a feature needs an honest review, do it deliberately. Skipping the structure leads to either premature kills (driven by frustration) or indefinite life support (driven by sunk cost). Neither serves the business.
Step 1 — Restate the original goal in writing
Pull up whatever artifact you wrote when you started the feature. What was it supposed to do? Who was it for? What was the success criterion? Write it down at the top of a doc. Most teams find this step illuminating — the original goal often bears little resemblance to what the feature has become, and that gap is itself useful information.
Step 2 — Score the current state honestly
For each of the four warning signs above, give the feature a clear status: green, yellow, red. Use data, not feelings. If you can't get data on one of them, that's also a finding — it means you don't have the observability you need to judge the feature.
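One way to make the scoring concrete is to treat the scorecard as data, with "unknown" as a first-class status rather than an excuse to skip a row. A minimal sketch; the sign names and the two-or-more trigger are my framing of the section above, not a standard:

```python
SIGNS = [
    "eval_plateau",
    "thin_usage",
    "flat_cost_per_success",
    "no_new_ideas",
]

# "unknown" is a legitimate answer: it flags an observability gap.
VALID = {"green", "yellow", "red", "unknown"}

def review_verdict(scorecard):
    """Force an explicit status for every sign, then count how many
    are yellow or red. Two or more triggers the honest review."""
    for sign in SIGNS:
        if scorecard.get(sign) not in VALID:
            raise ValueError(f"missing or invalid status for {sign}")
    flagged = sum(1 for s in SIGNS if scorecard[s] in {"yellow", "red"})
    return "schedule review" if flagged >= 2 else "continue monitoring"

review_verdict({
    "eval_plateau": "red",
    "thin_usage": "yellow",
    "flat_cost_per_success": "green",
    "no_new_ideas": "unknown",
})  # "schedule review"
```

Forcing every row to have an explicit value is the whole trick: it converts "we don't really know how usage is trending" from a shrug into a finding.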
Step 3 — Estimate the realistic next phase
If you continue the feature, what does the next quarter actually look like? Not the optimistic version. The realistic version: which engineers, how many weeks, what specific changes, what expected impact based on how well your last few changes worked. If you can't sketch that plan concretely, you don't have a plan — you have hope.
Step 4 — Compare to the alternative
What else could those engineers be working on? This is the question that breaks the sunk-cost paralysis. The decision isn't "kill this feature or save it." It's "spend the next quarter on this feature or spend it on the next-best alternative." Naming that alternative — concretely, with its own expected impact — usually clarifies the decision in a way no amount of analyzing the original feature can.
Step 5 — Set a kill criterion if you continue
If the decision is to continue, write down what would have to be true at the end of the next quarter for you to keep going past that point. Specific numbers, specific dates. If those criteria aren't met, you commit in advance to making the kill call without another debate. This is the single most important part of the process. Without it, you're guaranteed to be having the same conversation in three months.
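"Specific numbers, specific dates" is easiest to honor if the criteria live in code next to the metrics they gate. A sketch under hypothetical numbers and field names; the thresholds are illustrative, and the important part is that they were written down before the quarter started:

```python
from datetime import date

# Hypothetical kill criteria, committed at the start of the quarter.
# All names and thresholds are illustrative.
KILL_CRITERIA = {
    "review_date": date(2025, 12, 31),
    "min_eval_score": 0.85,
    "min_weekly_active_users": 500,
    "max_cost_per_success": 0.40,
}

def keep_going(metrics, criteria=KILL_CRITERIA):
    """Return True only if every pre-committed criterion is met.
    Any miss means the kill call, with no fresh debate."""
    return (
        metrics["eval_score"] >= criteria["min_eval_score"]
        and metrics["weekly_active_users"] >= criteria["min_weekly_active_users"]
        and metrics["cost_per_success"] <= criteria["max_cost_per_success"]
    )

keep_going({
    "eval_score": 0.81,
    "weekly_active_users": 620,
    "cost_per_success": 0.35,
})  # False: one miss is enough
```

The deliberately unforgiving all-or-nothing logic is the design choice. The moment the check allows "close enough," you're back to relitigating the decision every quarter.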
The hardest part of killing an AI feature isn't the decision. It's separating the decision from the identity of the team that built it. Frame it as "we learned this approach didn't fit," not "this team failed." The skills they built carry forward. The feature doesn't have to.
Where the conversation usually goes wrong
Two failure modes show up over and over:
The slow death. The feature isn't killed, but it stops getting investment. Engineers move to other things. Bugs accumulate. Users notice the quality declining. Eventually it gets killed anyway, but in the worst possible way — silently, with no learnings extracted, leaving users worse off than if it had been retired cleanly. If you're going to kill a feature, kill it cleanly and tell users what they should use instead.
The forever pivot. The feature is "saved" by repeatedly redefining what it's supposed to do. Each new framing is plausible, each one delays the harder conversation. After a year, the feature has been three different things and worked as none of them. Watch for this pattern. It's the most common form of decision avoidance in AI product teams.
The reframe that helps
The most useful mental shift I've seen is this: killing an AI feature is not an admission that AI doesn't work. It's an admission that this particular feature, in its current form, isn't earning its keep. The team's investment in eval infrastructure, in prompt engineering, in model selection — all of that knowledge transfers. The next AI feature you build will be faster and better because of what you learned on this one. Killing the feature lets you collect that dividend on something that has a real chance.
The teams that kill features cleanly tend to ship the most successful AI products over time, because they keep their best engineers working on what's actually working, and they don't drown in zombies. The teams that can't kill features end up too tangled to start anything new.
Make the call. If it's the right one, the relief will arrive within a week.