Late in 2019, audio/video editing platform Descript acquired the Lyrebird text-to-speech (TTS) startup, depriving Lyrebird users of the service unless they were willing to sign up for a full Descript plan. Because Descript’s core audience was vloggers and podcasters, video game developers in need of TTS began to look for Lyrebird alternatives.
This occurred at a time when game developers were relying on TTS more than ever. Game engine plug-ins allow designers to prototype scenes with zero latency, streamlining development. These TTS plug-ins also add key accessibility features that broaden audiences considerably. Some studios are even developing AI non-player characters, which rely on runtime TTS for dynamically generated speech.
Together, these events have sent many game developers on a quest for TTS providers beyond Lyrebird. And while Google, Amazon, and Microsoft each have their own TTS divisions, Lyrebird users seem to prefer synthetic-speech specialists over large-scale enterprises. With that in mind, here’s a look at today’s top independent TTS providers for game developers.
Looking for dynamic, runtime text to speech for PC or mobile games? Contact ReadSpeaker to learn about our TTS game engine plug-ins.
4 Lyrebird Alternatives for TTS in Video Games
Sonantic approaches game developers and filmmakers with a strong offer: custom TTS voices with highly variable emotional expression. Like all the TTS providers on this list, Sonantic creates neural TTS, leveraging the computational power of artificial intelligence to push TTS voices closer and closer to the real thing. However, as of this publication date, it is unclear whether Sonantic’s platform is commercially available; its website doesn’t offer a trial or subscription, instead offering a contact form that promises more information.
Choose any of Replica Studios’ roughly 50 pre-built character voices to turn your dialog into game-ready speech. Replica’s advertised voices are largely designed for common video game genres like fantasy and science fiction, and the platform allows users to adjust speaking styles to match the scene. Custom voices are available, although the emphasis seems to be on the existing voice library, and custom voices are offered only on enterprise plans. Other users can access the Replica voice library for $24 per four hours of credit. These features add up to make Replica Studios a favorite among indie producers and startup studios in English-speaking nations. (English appears to be the only language available for Replica TTS voices.)
Resemble bills itself as a voice cloning service first and foremost. It creates custom TTS voices with four options for providing original voice data: You can record your voice directly on its web platform, upload audio files, go through the Resemble API, or choose voice actors from an established marketplace. Resemble offers TTS voices in multiple languages and includes a TTS web editor with emotional speaking controls. “Entry” level pricing is $30 per month for up to 100,000 characters of speech with a pre-built voice. You may need to upgrade to a “Build” or “Enterprise” plan to get access to the core voice cloning product.
The old way to add neural TTS to video games was to produce audio files through a TTS engine, then upload those files to the game directory. With ReadSpeaker AI’s game engine plug-ins, developers can skip a few steps. Embedding the TTS engine within the game files greatly reduces the developer’s time on task by eliminating audio file management. Instead, the ReadSpeaker AI software generates dynamic synthetic speech on the player’s device.
As we publish, ReadSpeaker AI TTS game engine plug-ins are available for the Unity and Unreal game engines; contact us to discuss implementations for other platforms. The advantage of a game engine plug-in solution is that it’s instant; when you connect a TTS engine to your game through an API, latency becomes an issue—and even a single second is too long in a fast-paced gaming environment.
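To put that one-second figure in perspective, here is a minimal Python sketch that converts a delay into the number of frames a game renders while the player waits. The function name `frames_elapsed` and the frame rates shown are illustrative assumptions, not measurements of any particular TTS service or game.

```python
def frames_elapsed(latency_seconds: float, frames_per_second: int) -> int:
    """Return how many whole frames are rendered during the given delay."""
    return int(latency_seconds * frames_per_second)


if __name__ == "__main__":
    # Frame rates below are assumed examples of common targets.
    for fps in (30, 60, 120):
        frames = frames_elapsed(1.0, fps)
        print(f"{fps} fps: {frames} frames pass during 1 second of latency")
```

At an assumed 60 frames per second, a one-second wait for speech spans 60 rendered frames, which is why a round trip to a remote API can feel slow in a fast-paced scene.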
DIY Voice Cloning Tools from Lyrebird Alternatives
Replica Studios and Resemble.ai offer do-it-yourself voice cloning tools, which allow users to generate new TTS voices with their own audio recordings. ReadSpeaker AI produces all-original custom voices in partnership with users, but doesn’t go the DIY route. Instead, our Speech Scientists build each TTS voice to deliver near-perfect pronunciation.
Any DIY voice cloning tool inevitably relies on limited speech data for training voice models. That often sounds fine for simple utterances, but it can lead to pronunciation problems when the engine is confronted with long, oddly spelled, or jargon-heavy words. ReadSpeaker provides custom pronunciation dictionaries and ongoing support from our team of computational linguists to ensure perfect pronunciation every time, for custom voices or ready-made offerings in more than 30 languages. Sound interesting? Learn more here.