Jul 6, 2023

Speech to Speech Model Enhancements Are Live

We’re thrilled to announce an upgrade to our Speech-to-Speech engine. This technology from Resemble AI has always brought human-like vocalization to gaming, film, conversational AI, and beyond. Now we’ve made it better.

Our recent Speech-to-Speech model enhancement addresses three key areas to provide you with a more seamless, accurate, and immersive voice conversion experience. These are:

  1. Audio Quality: We’ve significantly enhanced the audio quality of our engine. Voices are now rendered at a full 48 kHz and sound clearer and more engaging, so every line of dialogue and narration captivates the listener and stays true to the emotion behind every word.
  2. Accuracy: Every word counts, and getting them right is critical. Our newly improved Speech-to-Speech engine now boasts even better accuracy, so every subtlety of the original speech is captured and reproduced without missing a beat.
  3. Intonation: The tone of voice can change the meaning of a sentence. We’ve further refined our engine to better understand and mimic human intonation. Whether it’s the inflection at the end of a question, the emphasis on a particular word, or the rhythm of a sentence, our upgraded engine gets it spot-on.
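
If you would like to confirm that a downloaded clip really is 48 kHz, a minimal sketch using only Python's standard-library wave module might look like the following. It assumes the clip was saved as an uncompressed WAV file; the helper names wav_sample_rate and make_silent_wav are our own illustrations and are not part of any Resemble AI tooling.

```python
import io
import wave


def wav_sample_rate(source) -> int:
    """Return the sample rate in Hz stored in a WAV file's header.

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav_file:
        return wav_file.getframerate()


def make_silent_wav(sample_rate: int, seconds: float = 0.1) -> io.BytesIO:
    """Build a short silent mono 16-bit WAV in memory (demo input only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)           # mono
        wav_file.setsampwidth(2)           # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * int(sample_rate * seconds))
    buffer.seek(0)                         # rewind so the buffer can be read
    return buffer


if __name__ == "__main__":
    # Stand-in for a real clip; replace with a path to your own WAV file.
    demo_clip = make_silent_wav(48_000)
    rate = wav_sample_rate(demo_clip)
    print(f"Sample rate: {rate} Hz")
```

To check a real file, pass its path to wav_sample_rate instead of the in-memory demo clip. Other formats such as MP3 would need a different tool, since the wave module reads only WAV.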

The upgrade works seamlessly with our Text-to-Speech system, allowing you to create unique human-like vocalizations without compromising automation, quality, or speed.


Where Speech to Speech Makes An Impact

With over 3 million minutes of audio generated every month, we’ve watched our users create amazing experiences with Speech-to-Speech. Here are some examples:

Game Dialogue: Crafting immersive game dialogue is now easier and more engaging. The enhanced accuracy and intonation offer more believable character voices, making your game experience even more realistic.

Advertisements: Personalized audio ads can now be created with an even better touch of human-like nuances, from accents to emotional inflections. This means your ads can resonate more deeply with your audience, making them more effective.

Film Dialogue: Be it a documentary narration, voiceovers, or ADR, the enhanced engine ensures that every line is delivered with accuracy and emotional depth. Your audience will be hooked on every word.

Real-time Speech-to-Speech via API: Developers, fear not! Our updated Speech-to-Speech engine is ready to use in your applications, and you can rapidly build production-ready integrations with our modern tools for an immersive, real-time, low-latency voice conversion experience.

We understand that with the power of AI voice generation comes the responsibility of ensuring data security and authenticity. As the landscape of AI-generated voices grows, we recognize the potential risks associated with deepfake voices and audio manipulation. To address these risks and to further ensure the safety and authenticity of our customers’ voice AI data, we continue to upgrade and refine our deep neural network watermarker, PerTh.

We’re not just delivering a better voice conversion experience; we’re ensuring that this experience is safe, secure, and trustworthy. We remain committed to maintaining the integrity of voice AI-generated content, and we invite you to experience this blend of innovation, quality, and security that our updated Speech-to-Speech engine offers. Sign up today to get started.

