AI Security: Protect Your Content Library From Copyright Infringement With Our Neural Speech AI Watermarker

Jun 15, 2023

Another Step In Our Commitment To AI Safety

As part of our ongoing commitment to developing cutting-edge AI security solutions that keep our customers' voice data safe, we're proud to announce significant updates to PerTh, our deep neural network AI watermarker. These upgrades vastly improve the watermark's embedding, persistence, and detection capabilities to combat deepfake AI voice and audio data manipulation. Our machine learning model continues to train alongside the watermarker, allowing PerTh to be embedded automatically into every generated voice AI audio file. Most remarkably, our AI watermark can now survive being trained on by another speech synthesis model, remaining persistent through the training process. We've also enhanced our ability to efficiently pinpoint where the AI watermark is embedded in the audio data. Let's take a closer look at the updates and the motivation behind their development.

Nuances of The Neural Speech AI Watermarker 

Our neural AI watermarker serves a purpose similar to the visual watermark used in a customer's video content library. The watermark pictured below indicates that the video is owned and created by Resemble AI and belongs to our content library. With audio, however, we don't want to alter or degrade quality by embedding intrusive data. PerTh is an inaudible neural watermarker: it embeds its watermark data into the audio file in a way that is imperceptible to listeners and difficult for third parties to detect. The intention is to identify whether the audio data has been manipulated, protecting a customer's IP catalog from deepfake AI voice content, while the Resemble-generated AI audio file maintains its original quality. For a technical background on PerTh's functionality, please read our blog post here.

Video Watermark Example

Watermarkers are used to identify the content creator.
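To make the idea concrete, here is a minimal, illustrative sketch of how an inaudible spectral watermark can work in principle: a bit pattern is hidden by slightly nudging frequency bins that sit next to much louder neighbors, changes that psychoacoustic masking renders inaudible. This is not PerTh's actual implementation; every constant and the bin-selection rule below are assumptions for illustration only.

```python
# A minimal, illustrative sketch of an inaudible spectral watermark.
# NOT PerTh's implementation: all constants are assumptions.
import numpy as np
from scipy.signal import stft, istft

FS = 22050      # sample rate in Hz (assumed)
NPERSEG = 1024  # STFT window length (assumed)
DELTA = 0.02    # relative magnitude nudge that encodes one bit (assumed)

def embed_bits(audio: np.ndarray, bits: list[int]) -> np.ndarray:
    """Hide `bits` by slightly nudging a frequency bin right next to the
    loudest bin of a frame; the loud neighbor perceptually masks the change."""
    f, t, Z = stft(audio, fs=FS, nperseg=NPERSEG)
    mag, phase = np.abs(Z), np.angle(Z)
    for i, bit in enumerate(bits):
        frame = i % mag.shape[1]
        peak = int(np.argmax(mag[:, frame]))      # the "masker": loudest bin
        target = min(peak + 2, mag.shape[0] - 1)  # a nearby bin it masks
        # Raise the masked bin a little for a 1, lower it a little for a 0.
        mag[target, frame] *= (1 + DELTA) if bit else (1 - DELTA)
    _, out = istft(mag * np.exp(1j * phase), fs=FS, nperseg=NPERSEG)
    return out[: len(audio)]

# Usage: watermark one second of a synthetic tone with the payload 1011.
tone = np.sin(2 * np.pi * 440 * np.arange(FS) / FS).astype(np.float32)
marked = embed_bits(tone, [1, 0, 1, 1])
```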

AI Security: Lagging Enterprise Adoption of Generative AI

Built on the principles of psychoacoustics, the study of human sound perception, the PerTh watermarker was originally developed by our team to tag Resemble-generated AI voice content for internal purposes. With enterprises reluctant to adopt generative AI because of AI data privacy and IP catalog copyright infringement concerns, productizing the AI watermarker became a civic obligation of sorts. Being so closely connected with the engineering community, we see that engineering leaders are uncomfortable with their teams deploying OpenAI's ChatGPT, and even GitHub Copilot, at scale until their AI data privacy concerns are addressed.

EU AI Act: AI Safety Regulation

In parallel to our neural watermarker's efforts around IP catalog copyright infringement, the EU AI Act will require companies leveraging AI to share their data sources. This affects creators of LLMs (large language models) such as Google (Bard), OpenAI (ChatGPT), and Microsoft (Bing), who will have to assess and mitigate various AI data privacy risks before their AI tools can be made publicly available. The most impactful requirement of this AI regulation is that these companies disclose "the use of training data protected under copyright law". At this juncture, many of these companies are not sharing this information for fear of legal action over IP copyright infringement, which has been stirring up a debate on the ethics of AI.

AI Ethics: AI Safety and Data Privacy Concerns

In complementary efforts, individual companies are proactively speaking out about data privacy, raising concerns about AI ethics. Stack Overflow wants to protect its community's forum data, which is used to answer technical questions. Policing the internet is a very difficult task, and data attribution is virtually impossible: there's currently no way for Stack Overflow to prove that an AI model used its data during training. In the case of audio data, we invite enterprises to protect their IP by leveraging our Neural Speech Watermarker solution for AI fraud detection. Ironically, AI for fraud detection is a positive use case of artificial intelligence that circumvents AI misuse; AI safety tools like PerTh fight AI fraud with AI security. Similar to Apple's data privacy efforts, it's our firm belief that enterprises should have control over who uses their voice data. Without legal consent, AI models such as LLMs shouldn't be able to train on company data.

AI Fraud Detection: Improved AI Watermark Detection & Persistence

The significance of these concerns has spurred the recent improvements to our AI watermarker, which has become a complete solution to thwart copyright infringement against our customers' content libraries. The recent updates allow us to identify precisely where the watermark has been embedded within an audio file, giving us more clarity on the origin of an audio clip and letting us efficiently verify whether it has been tampered with. A greater understanding of the watermark's location within the audio data provides an additional layer of AI fraud prevention and ensures that any modification to the audio can be detected quickly and efficiently.
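As a rough illustration of what localization buys you, the hedged sketch below extends the toy embedder from earlier: because the detector knows which STFT frames carry payload bits, it can report a result per frame and flag frames whose masked bin no longer matches either expected magnitude, i.e., likely tampering. This toy detector is non-blind (it needs the unwatermarked original) and is an assumption for illustration, not PerTh's detection method.

```python
# A hedged sketch of watermark localization, pairing with the toy embedder
# above and reusing its constants. Non-blind and illustrative only.
import numpy as np
from scipy.signal import stft

FS, NPERSEG, DELTA = 22050, 1024, 0.02  # same constants as the embedding sketch

def locate_watermark(original: np.ndarray, suspect: np.ndarray,
                     n_bits: int) -> list[tuple[int, int, bool]]:
    """Return (frame, decoded_bit, tampered) for each payload position."""
    _, _, Zo = stft(original, fs=FS, nperseg=NPERSEG)
    _, _, Zs = stft(suspect, fs=FS, nperseg=NPERSEG)
    mo, ms = np.abs(Zo), np.abs(Zs)
    results = []
    for i in range(n_bits):
        frame = i % mo.shape[1]
        peak = int(np.argmax(mo[:, frame]))       # same bin choice as embedder
        target = min(peak + 2, mo.shape[0] - 1)
        ratio = ms[target, frame] / (mo[target, frame] + 1e-9)
        bit = 1 if ratio >= 1.0 else 0            # raised bin => 1, lowered => 0
        expected = (1 + DELTA) if bit else (1 - DELTA)
        # A bin that matches neither expected magnitude suggests the clip
        # was modified at this frame after watermarking.
        tampered = abs(ratio - expected) > DELTA
        results.append((frame, bit, tampered))
    return results

# Usage with `tone` and `marked` from the embedding sketch:
# locate_watermark(tone, marked, 4)
```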

To further cement the robustness of our AI watermark, we've taken another giant step forward by fortifying its stability: the watermark now persists through training. Simply put, another speech synthesis model can be trained on Resemble AI watermarked audio, and the watermark remains persistent in the audio data. This new capability not only reinforces the resilience of our AI watermark but also facilitates better tracking and AI fraud detection, and it enhances our ability to mitigate the risk of deepfake AI voice content. Our machine learning model is becoming increasingly adept at distinguishing the nuanced differences between real and deepfake AI voice. These developments build on the AI security and reliability of our platform.

How Does The Neural Speech AI Watermarker Help Customers?  

What does this mean for our enterprise customers? Not only do we tag all AI voice content generated in the Resemble AI app, but we can also tag any piece of audio content regardless of its origin. This means audio files from competing generative voice AI companies can be tagged and tracked. Once an audio file is tagged with our PerTh neural watermarker, our customers have the AI security and assurance of control over their audio data.

In the event a nefarious actor were to scrape audio data from Audible.com to train an AI model on Audible's audiobook data, Audible wouldn't be able to tell that its data was being used unethically. With our AI watermarker solution, we encourage Audible to watermark its entire audio content library. If it had concerns that a third party was infringing on its IP, we would be able to analyze the audio data and identify whether our watermarker was present, validating the authenticity of the audio file. Below is a simplified visual representation of the PerTh watermarker's persistence after being trained on by another speech synthesis model.

PerTh Neural Speech Watermarker Remains Persistent Through Training

PerTh Neural Speech Watermarker in action. Please note that this is just for visualization purposes. 
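For a sense of how the Audible scenario might look operationally, here is a hypothetical end-to-end sketch built on the toy embed_bits() and locate_watermark() helpers from the earlier sketches: batch-watermark a library, then audit a suspect clip against the owner's original. The soundfile library is real, but OWNER_BITS, the file layout, and both functions below are illustrative assumptions, not Resemble AI's SDK.

```python
# A hypothetical workflow sketch; assumes mono WAV files and the toy
# embed_bits()/locate_watermark() helpers defined in the earlier sketches.
import soundfile as sf
from pathlib import Path

OWNER_BITS = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical per-owner payload

def watermark_library(src_dir: str, dst_dir: str) -> None:
    """Embed the owner's payload into every WAV file in a content library."""
    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    for path in Path(src_dir).glob("*.wav"):
        audio, fs = sf.read(str(path))
        sf.write(str(Path(dst_dir) / path.name),
                 embed_bits(audio, OWNER_BITS), fs)

def audit_suspect_clip(original_path: str, suspect_path: str) -> bool:
    """Check whether a third-party clip still carries the owner's payload."""
    original, _ = sf.read(original_path)
    suspect, _ = sf.read(suspect_path)
    results = locate_watermark(original, suspect, len(OWNER_BITS))
    # The clip passes only if every decoded bit matches the payload and no
    # payload frame looks tampered with.
    return all(bit == want and not tampered
               for (_, bit, tampered), want in zip(results, OWNER_BITS))
```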

PerTh: Evolving For Advanced AI Security

We believe that the evolution of our PerTh Watermarker is vital to maintaining the integrity of voice AI-generated content and preventing the proliferation of deepfake AI voice. With the updates to precision watermark embedding and the watermark's persistence through training, we're further bolstering our deepfake detection against AI misuse. Not only will all future Resemble-generated voice AI content include the PerTh watermarker, but we can also embed it into your existing audio content library. If you're interested in a walkthrough of PerTh by a Resemble team member, please click the button below to schedule a demo.
