
OpenAI’s gpt-realtime: Enhanced Voice AI Drives Enterprise Adoption


August 28, 2025

Source: VentureBeat AI

Voice Evolution, Not Revolution

Media Hype 8/10

Real Impact 7/10

What is the Viqus Verdict?

We evaluate each news story based on its real impact versus its media hype to offer a clear and objective perspective.

AI Analysis:

While the technical improvements are noteworthy, the intense competitive landscape and the shift towards practical implementation suggest sustained, medium-term impact rather than a revolutionary shift.

Article Summary

OpenAI’s latest voice model, gpt-realtime, is aimed squarely at enterprise voice AI applications. The model emphasizes improved instruction following, scoring 30.5% on the MultiChallenge audio benchmark, a significant increase over previous models. Key advancements include ‘more natural and expressive’ voices, the ability to follow complex instructions such as speaking with a specific accent, and seamless integration with the Realtime API. OpenAI has also broadened the API's capabilities, adding image inputs and support for Session Initiation Protocol (SIP), the latter aimed at contact center use cases. The model further incorporates enhanced function calling, which gives it access to external tools and mirrors recent advancements in LLMs. The launch includes two new voices, Cedar and Marin, and a 20% price reduction, bringing the cost to $32 per million audio input tokens and $64 per million audio output tokens. Competition is intensifying, with providers such as ElevenLabs and Hume also offering advanced voice models. The focus on practical, real-world scenarios, as evidenced by demonstrations with T-Mobile and Zillow, highlights the industry’s shift towards tangible applications.
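To put the pricing in concrete terms, the short Python sketch below estimates the audio cost of a single session and works out the price implied before the cut. It is an illustration only: the per-million-token prices come from the summary above, while the function names, the example token counts, and the assumption that the 20% reduction applies equally to input and output prices are ours, not OpenAI's.

```python
# Prices taken from the summary above (USD per one million audio tokens).
AUDIO_INPUT_PER_MILLION = 32.0
AUDIO_OUTPUT_PER_MILLION = 64.0

# Assumption: the 20% reduction applies to both input and output prices.
REDUCTION = 0.20


def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated audio cost in USD for one session (hypothetical helper)."""
    return (
        input_tokens * AUDIO_INPUT_PER_MILLION
        + output_tokens * AUDIO_OUTPUT_PER_MILLION
    ) / 1_000_000


def price_before_reduction(current_price: float, reduction: float = REDUCTION) -> float:
    """Price implied before a fractional reduction (0.20 means 20%)."""
    return current_price / (1.0 - reduction)


if __name__ == "__main__":
    # Hypothetical workload: 50,000 input and 20,000 output audio tokens.
    print(f"Session cost: ${session_cost(50_000, 20_000):.2f}")
    print(f"Implied prior input price: ${price_before_reduction(AUDIO_INPUT_PER_MILLION):.2f}")
    print(f"Implied prior output price: ${price_before_reduction(AUDIO_OUTPUT_PER_MILLION):.2f}")
```

Under these assumptions, output audio costs twice as much per token as input audio, so the length of the model's spoken responses is likely to dominate the bill for most conversational workloads.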

Key Points

OpenAI’s gpt-realtime model prioritizes improved instruction-following capabilities, boosting accuracy and control.
The model offers ‘more natural and expressive’ voices, reflecting advances in AI voice generation technology.
Updates to the Realtime API, including support for SIP and image inputs, expand the model’s applicability across diverse enterprise workflows.

Why It Matters

The release of gpt-realtime marks a critical step in the maturation of voice AI for business. The demonstrations are promising, but the core value lies in the API’s expanded functionality, particularly SIP support and the ability to call external tools. These features will ultimately determine whether such models move from impressive prototypes to integral parts of real-world operational systems. The launch also underscores growing investment and competition in the voice AI market, and suggests that sophisticated, adaptable voice solutions are rapidly becoming a necessity for businesses looking to streamline customer interactions and automate workflows. For professional AI stakeholders, this signals a move beyond simple voice assistants to integrated AI-powered platforms, and it demands a shift in focus towards efficiency, security, and practical implementation.
