Here’s a bold prediction that might make you rethink the future of AI: the CEO of ElevenLabs believes AI audio models will eventually become commoditized, much as smartphone hardware has. But here’s the surprising part: why would a company built on these models openly admit that its edge will erode over time? Let’s dive in.
During his keynote at the TechCrunch Disrupt 2025 conference, ElevenLabs co-founder and CEO Mati Staniszewski shared a surprisingly candid outlook on the AI audio landscape. While his company is currently at the forefront of developing cutting-edge models, Staniszewski predicts that these innovations will become standardized within the next few years. This isn’t just a casual observation—it’s a strategic acknowledgment that the AI audio space is evolving faster than most realize.
But why focus on building models if they’re destined to become commoditized? Staniszewski argues that, for now, the models are still the differentiator. Think about it: if an AI voice sounds robotic or unnatural, it’s a deal-breaker for users. ElevenLabs is tackling this problem head-on by refining its model architectures, an area where its researchers have made significant strides. However, Staniszewski admits that in the long run, other players will likely catch up, shrinking that competitive advantage.
And this is the part most people miss: even as commoditization looms, Staniszewski believes there will still be a need for specialized models tailored to specific use cases. For instance, a voice assistant for customer service might require a different model than one used for audiobook narration. The key, he suggests, lies in multi-modal approaches—combining audio with video or large language models (LLMs) to create seamless, conversational experiences. Google’s Veo 3 is a prime example of what’s possible when models are fused together.
ElevenLabs isn’t sitting idle, though. The company plans to forge partnerships and explore open-source technologies to merge its audio expertise with advancements from other models. Staniszewski’s vision is clear: focus on both model-building and applications to create lasting value. He draws a parallel to Apple’s success, stating, “The same way software and hardware were the magic for Apple, we think the product and AI will be the magic for the next generation of use cases.”
Here’s the open question: if AI audio models are destined to become commoditized, does that mean innovation will stall, or will it democratize access to advanced technology? Staniszewski seems to be betting on the latter. Will commoditization level the playing field, or will it stifle creativity?
About the author: Sarah has been a TechCrunch reporter since 2011, bringing her I.T. expertise from banking, retail, and software to the tech journalism world. Reach her at sarahp@techcrunch.com or via encrypted message at sarahperez.01 on Signal.