Blind Test Reveals User Preference Isn't What OpenAI Thought
9
What is the Viqus Verdict?
We evaluate each news story based on its real impact versus its media hype to offer a clear and objective perspective.
AI Analysis:
The intense user reaction, driven by a preference for a ‘softer’ AI, points to a shift in how people approach AI interaction. For many users that preference outweighed GPT-5’s technical improvements, which makes the launch a high-impact and widely discussed event.
Article Summary
OpenAI’s launch of GPT-5 met immediate and considerable backlash, far exceeding typical product-launch criticism. GPT-5 outperforms its predecessor on technical benchmarks, with significantly higher accuracy on mathematical and coding tasks and fewer factual hallucinations, yet many users prefer the warmer, more conversational style of GPT-4o. An anonymous developer has built a simple blind testing tool (gptblindvoting.vercel.app) that removes brand bias by showing responses from the two models to the same prompt without attribution. Its results, a split with a majority favoring GPT-4o, are fueling a broader debate about the role of empathy and personality in AI design. The preference is not only about creative writing style: users who turn to AI for emotional support, companionship, and creative collaboration show a pronounced preference for GPT-4o’s perceived ‘friendliness.’ The controversy also underscores the growing ‘sycophancy crisis’ in the AI industry, in which chatbots’ overly accommodating responses foster a psychological dependence among users that borders on the problematic. It highlights the need for responsible AI development that considers human psychological needs as well as technical capability.
Key Points
- GPT-5 surpasses GPT-4o on technical benchmarks, with higher accuracy, stronger coding performance, and fewer hallucinations.
- Despite GPT-5’s stronger technical performance, a majority of blind-test voters prefer GPT-4o, which points to the appeal of its warmer, more conversational style.
- Because the blind test strips away model labels, votes reflect the responses themselves, showing that user preference extends beyond technical metrics to perceived personality and emotional engagement.
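The blind-voting idea described above can be illustrated with a minimal sketch. This is not the code behind gptblindvoting.vercel.app, whose implementation is not described in the article; the function names `make_blind_trial` and `tally`, the slot labels "A" and "B", and the placeholder model names are all assumptions made for illustration.

```python
import random
from collections import Counter


def make_blind_trial(prompt, responses, rng):
    """Hide model names behind neutral slots in a random order.

    `responses` maps a model name to its reply for `prompt` (hypothetical data).
    """
    models = list(responses)
    rng.shuffle(models)                      # random order removes position bias
    key = dict(zip(["A", "B"], models))      # slot -> model, never shown to the voter
    shown = {slot: responses[model] for slot, model in key.items()}
    return {"prompt": prompt, "shown": shown, "key": key}


def tally(trials, votes):
    """Translate each vote (a slot letter) back to a model and count wins."""
    counts = Counter()
    for trial, vote in zip(trials, votes):
        counts[trial["key"][vote]] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so the sketch is repeatable
    replies = {"model_x": "a warm, chatty reply", "model_y": "a terse, precise reply"}
    trial = make_blind_trial("How was your day?", replies, rng)
    print(trial["shown"])            # the voter sees only slots A and B
    print(tally([trial], ["A"]))     # the vote is mapped back to a model afterwards
```

The voter only ever sees the `shown` mapping; the `key` is consulted after the vote, so any brand preference cannot influence the choice.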

