This past month, the team at Resemble has continued to develop new products and upgrade existing ones. We continue to push the boundaries on voice naturalness and quality while remaining vigilant in our development of AI security products. Over the course of July, we’ve added new voices and revamped our Voice Marketplace, released an upgraded Speech-to-Speech model, launched our Deepfake Detector, and made improvements to our API. Let’s explore each of these updates in more detail and how they provide value to our customers.
Resemble Voice Marketplace Updates: New AI Voices Added
In an ongoing effort to make natural-sounding AI voices readily accessible to our customer base, we’ve added new voices to our Voice Marketplace. The team focused on three primary variables during this update: diversity of voices, voice quality, and consistency. The marketplace currently supports more than 40 voices, catering to various projects, from gaming to interactive voice response (IVR) systems. To further diversify the AI voices, we’ve added marketplace voices with accents. The voices can also be localized in up to 62 languages. Each AI voice is highly natural, and its AI-generated content stays consistent across multiple projects. Customers can also generate audio content through both Text-to-Speech (TTS) and Speech-to-Speech (STS) synthesis to accommodate any project.
Speech-to-Speech: AI Voices That Capture Human Emotion
The realism of our Speech-to-Speech generated content has reached another high point with our newest model. The model is able to accurately capture the input speech during training to significantly enhance speech synthesis. The recent update addresses four key areas:
1. Audio Quality: Our STS engine now offers significantly improved audio quality, with voices that are clearer, more engaging, and at a full 48kHz, ensuring captivating dialogues and narrations.
2. Accuracy: The enhanced model accurately captures the subtleties of the input speech, allowing the synthesized content to convey the true emotions of human speech.
3. Intonation: Understanding and replicating human intonation is crucial, and we’ve refined our engine to achieve just that. Our new model accurately inflects questions, emphasizes specific words, and follows natural rhythms.
4. Integration with Text-to-Speech: Our Speech-to-Speech system seamlessly integrates with our Text-to-Speech system, enabling you to create unique, human-like vocalizations without compromising automation, quality, or speed.
As noted above, our Voice Marketplace voices are enabled for both STS and TTS synthesis. Developers can also use our real-time Speech-to-Speech API to create low-latency voice conversion experiences. Consider cloning a new voice to test out the new model.
AI Security: Resemble Detect, Deepfake Detector
The proliferation of deepfake content has highlighted the need for data security in the generative AI space. This has led our team to develop Resemble Detect, a neural network deepfake detector that we released earlier this month. Detect is designed to protect your voice content from AI fraud. It works by analyzing the time-frequency data of audio clips, differentiating between real and fake audio files with up to 98% accuracy. By leveraging cutting-edge neural networks and deep learning models, Detect identifies the most subtle sound artifacts that distinguish real from manipulated audio.
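For readers unfamiliar with the term, "time-frequency data" usually means a spectrogram: the audio is cut into short overlapping frames, and the frequency content of each frame is measured. The sketch below illustrates that general idea only. It is not Resemble Detect's implementation, and the function name, frame sizes, and test tone are all assumptions chosen for illustration.

```python
import cmath
import math


def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Return a list of frames, each a list of DFT magnitudes.

    Illustrative only: a naive short-time Fourier transform using a
    Hann window. Real systems use an optimized FFT library.
    """
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")

    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    num_bins = frame_len // 2 + 1  # non-negative frequencies only

    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(num_bins):
            # Direct DFT sum for frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc))
        spectrogram.append(row)
    return spectrogram


# Hypothetical input: a pure 1 kHz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
FRAME_LEN = 64
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(1024)]

spec = magnitude_spectrogram(tone, frame_len=FRAME_LEN, hop=32)

# Average each frequency bin over time and find the strongest one.
avg = [sum(row[k] for row in spec) / len(spec) for k in range(len(spec[0]))]
peak_bin = max(range(len(avg)), key=avg.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
print(f"{len(spec)} frames x {len(spec[0])} bins, strongest bin near {peak_hz} Hz")
```

A detector would feed a representation like this into a trained model rather than inspect it by hand; the sketch only shows what the model's input might look like.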
The potential applications of Detect are vast, from combatting deepfake songs in the music industry to preventing unauthorized deepfake content in political campaigns. We’ve made integration easy with our APIs, enabling developers to seamlessly incorporate Detect for robust authentication and protection against AI fraud.
We’re committed to developing AI ethically and will continue to update Detect to defend against the latest speech synthesis models, providing real-time protection for our customers’ audio content libraries.
API Updates: More Metrics and Account Transparency
We’re constantly striving to enhance the user experience, and our latest API upgrades offer more insight and transparency. Users can now access audio metrics associated with recordings, giving them a clearer view of their audio content. We’ve also made account management and multi-team usage more convenient with the addition of the Account endpoint. This empowers customers to query account information at the user and/or team level, enabling a programmatic breakdown of their account usage. With these API updates, we aim to provide seamless integration and comprehensive data analysis, allowing users to optimize their audio content.
Voice AI On The Horizon
As we build new products and enhance existing ones, we will keep you informed of our progress. August should be another month full of new and valuable updates to enhance your content creation. We look forward to sharing our next product release!
If you’re interested in kicking off your next project with generative voice AI, please click the button below to schedule time to speak with a team member.