Resemble ‘Detect’: Antivirus For AI

Jul 12, 2023

Over the first half of 2023, AI has continued to captivate media attention by revolutionizing industries. From Levi’s pursuit of inclusive clothing to combating forest fires, AI is becoming embedded in the fabric of society. Yet it hasn’t all been rosy. The rise of deepfakes poses a daunting new challenge that has sparked concerns about AI ethics. Deepfakes have reared their head in the music industry and found a home in election smear campaigns.

At Resemble AI, the security of our enterprise customers’ AI-generated voice content is paramount. We’ve built products such as Resemblyzer and our Neural Speech Watermarker to safeguard that data. Today, we are pleased to introduce ‘Detect,’ the antivirus for AI. This groundbreaking deepfake detector is designed to protect your voice content. Let’s take a deeper look at how the tool detects deepfakes and how it can help combat voice AI fraud.

An Eye for Deepfake Detection: How ‘Detect’ Works
In this era of generative AI, determining the authenticity of content, particularly voice, is vital to safeguarding IP. Real and fake audio clips may differ in subtle ways that are inaudible to the human ear. For instance, the unauthorized Drake and The Weeknd deepfake song released in the spring sounded authentic to listeners. The sonic byproducts of editing or manipulating sound, known as artifacts, live within the audio data and are typically inaudible to humans.

Most of us are familiar with an audio clip’s waveform, represented by sharp peaks and valleys. However, this doesn’t provide the entire picture. Visualizing these audio artifacts begins with understanding how humans perceive sound. We hear a mixture of frequencies that oscillate in loudness and weave in and out over time. To magnify these intricacies, a tool known as a spectrogram is employed. Unlike a waveform, which only shows amplitude over time, a spectrogram reveals the unique frequencies that make up the sound at any given point in time.
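
To make this concrete, here is a minimal sketch of turning a waveform into a mel-spectrogram. It uses the open-source librosa library and a placeholder file name, clip.wav; these are illustrative choices, not a description of Resemble’s own tooling.

```python
# Minimal sketch: waveform -> mel-spectrogram with librosa (illustrative, not Resemble's stack).
import librosa
import numpy as np

# Load the clip as a mono waveform sampled at 16 kHz.
waveform, sample_rate = librosa.load("clip.wav", sr=16000, mono=True)

# Short-time Fourier transform -> mel-scaled spectrogram (frequency bands x time frames).
mel = librosa.feature.melspectrogram(
    y=waveform, sr=sample_rate, n_fft=1024, hop_length=256, n_mels=80
)

# Convert power to decibels so quiet artifacts stand out in the representation.
mel_db = librosa.power_to_db(mel, ref=np.max)

print(mel_db.shape)  # (80 mel bands, number of time frames)
```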

Specific sections of the spectrogram are where the artifacts that distinguish a real audio clip from a fake one are often contained. Again, while inaudible to humans, these portions of the audio data can be learned by an AI model. Detect harnesses this technique but advances it further. Our deepfake detector employs a cutting-edge neural network that learns its own version of a spectrogram. It doesn’t focus solely on frequency but incorporates temporal data into its analysis. This method creates a time-frequency embedding, supplying the network with the time and frequency information it needs to make optimal predictions between real and fake audio files.

The audio data is then run through a deep learning model that processes the time and frequency information from the waveform, and the result is evaluated by a classifier. The classifier outputs a probability between 0 and 1, where 1 indicates a high probability that the audio is fake. Resemble Detect validates the authenticity of audio data to expose deepfake audio in real time with up to 98% accuracy.
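
To illustrate the shape of such a pipeline, here is a minimal PyTorch sketch: a learned time-frequency front-end feeding a classifier that outputs a 0–1 fake probability. The architecture, layer sizes, and names are hypothetical and far simpler than Detect’s actual model; this is only a sketch of the idea described above.

```python
# Illustrative sketch (PyTorch), not Resemble Detect's actual architecture.
import torch
import torch.nn as nn

class DeepfakeDetectorSketch(nn.Module):
    def __init__(self, n_bands: int = 80):
        super().__init__()
        # Learned front-end: strided 1D convolutions act like a trainable spectrogram,
        # mapping the raw waveform to a (bands x frames) time-frequency embedding.
        self.frontend = nn.Sequential(
            nn.Conv1d(1, n_bands, kernel_size=1024, stride=256, padding=512),
            nn.ReLU(),
        )
        # Classifier: 2D convolutions over the time-frequency embedding, then a
        # sigmoid that squashes the score into a probability between 0 and 1.
        self.classifier = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(16, 1),
            nn.Sigmoid(),
        )

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        # waveform: (batch, samples)
        tf = self.frontend(waveform.unsqueeze(1))  # (batch, bands, frames)
        prob = self.classifier(tf.unsqueeze(1))    # (batch, 1); 1.0 = likely fake
        return prob.squeeze(1)

model = DeepfakeDetectorSketch()
fake_prob = model(torch.randn(1, 16000))  # one second of 16 kHz audio
print(float(fake_prob))                   # an untrained model prints an arbitrary value near 0.5
```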

Diagram: Detect’s deep learning model at work.

To provide real-life context, below is an audio clip of a Resemble AI Marketplace voice, Tarkos. It consists of the voice actor’s real audio (the first five seconds) followed by synthesized audio (0:05 to 0:11). The chart below it is a visual representation of the audio data that Detect analyzes to determine whether the audio is real or fake.


The model’s analysis of Tarkos’ audio data detects an increase in the probability that the audio is fake around the six-second mark.
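
A chart like the one above can be produced by scoring a clip chunk by chunk. The sketch below assumes any model along the lines of the hypothetical one sketched earlier (a callable mapping a batch of waveform samples to a fake probability); the one-second chunk length and the helper name score_chunks are illustrative choices, not Detect’s actual interface.

```python
# Sketch: slide a fixed-size window over the clip and record a fake probability per chunk,
# so the probabilities can be plotted against time as in the chart above.
import torch

def score_chunks(model, waveform: torch.Tensor, sample_rate: int = 16000, chunk_seconds: float = 1.0):
    """Score a 1-D waveform chunk by chunk; returns (time_in_seconds, fake_probability) pairs."""
    chunk = int(sample_rate * chunk_seconds)
    scores = []
    with torch.no_grad():
        for start in range(0, waveform.shape[-1] - chunk + 1, chunk):
            window = waveform[start:start + chunk].unsqueeze(0)  # batch of one: (1, samples)
            scores.append((start / sample_rate, float(model(window))))
    return scores

# Usage with the hypothetical model above and 11 seconds of placeholder audio:
# per_chunk = score_chunks(DeepfakeDetectorSketch(), torch.randn(11 * 16000))
```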

How Detect Can Outsmart Everyday AI Fraud

Consider the incident of the unauthorized Drake and The Weeknd song. AI tools were used to create the track without the artists’ explicit consent, and the result was nearly indistinguishable from their authentic voices. Similarly, in the film industry, there is potential for unauthorized deepfake content to be produced from popular franchises such as Universal Pictures’ Fast & Furious or Disney’s Moana. Last but not least, the political sphere has become a prime breeding ground for misinformation through generative AI. The voices of Joe Biden, Connecticut Democrat Richard Blumenthal, and candidates in the Turkish presidential election have been cloned and synthesized into deepfake audio. Detect is capable of scrutinizing all of these types of audio content and identifying whether it is genuine or fabricated.


API Deepfake Detection Simplified

Detect is a market-ready solution for enterprise customers, designed to provide enhanced security for their audio content libraries against all contemporary generative AI speech synthesis solutions. It boasts strong performance at high volumes with low complexity for easy integration. We’ve made this technology simple to integrate via our APIs, empowering developers and anyone in need of robust authentication. Upon deployment, customers’ audio will be labeled ‘real’ or ‘fake,’ and they will be able to see a chunk-by-chunk analysis of their audio data to identify the exact location of any manipulated audio.
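
As an illustration of what such an integration might look like, the sketch below posts an audio file to a detection endpoint and reads back an overall label plus chunk-by-chunk scores. The URL, field names, and response shape are placeholders chosen for this example; they are not taken from Resemble’s API documentation.

```python
# Hypothetical integration sketch; endpoint and response fields are placeholders, not Resemble's API.
import requests

API_TOKEN = "YOUR_API_TOKEN"  # placeholder credential

with open("clip.wav", "rb") as audio_file:
    response = requests.post(
        "https://app.example.com/api/detect",          # placeholder endpoint
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        files={"audio": audio_file},
        timeout=60,
    )
response.raise_for_status()
result = response.json()

print(result["label"])  # e.g. "real" or "fake" (illustrative field name)
for chunk in result.get("chunks", []):
    # Each chunk might carry a start time and a fake probability (illustrative fields).
    print(chunk["start"], chunk["fake_probability"])
```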

Detect has been tested against major cloud vendors as well as open-source implementations. As the field of AI safety progresses, we’ll routinely update the detection models to provide our customers’ audio libraries with state-of-the-art protection. Like antivirus software, Detect, the antivirus for AI, will be updated automatically to provide deepfake detection against the latest speech synthesis models.

Get Started By Integrating Detect Today
By prioritizing safety in generative AI, we’re making strides toward ethical AI practice (Ethics Statement). Our newly introduced Detect deepfake detection tool, coupled with our recently enhanced Neural Speech Watermarker, forms a formidable line of defense against voice AI fraud. Quoting one of our engineers who spearheaded the Detect project: “Our Neural Speech Watermarker secures our clients’ voices; Detect has the potential to secure everyone’s voice.”

Outsmart AI fraud by leveraging the power of Resemble Detect. Click the button below to schedule a demo with one of our experts to discuss how our AI safety stack can protect your IP.
