As part of our ongoing commitment to cutting-edge AI security solutions that keep our customers’ voice data safe, we’re proud to announce significant updates to PerTh, our deep neural network AI watermarker. These upgrades vastly improve the watermark’s embedding, persistence, and detection capabilities to combat deepfake AI voice and audio data manipulation. Our machine learning model continues to train alongside the watermarker, allowing PerTh to be embedded automatically into the voice AI audio files we generate. Most remarkably, our watermark can now survive another speech synthesis model training on watermarked audio, remaining persistent through the training process. We’ve also enhanced our ability to pinpoint exactly where the watermark is embedded in the audio data. Let’s take a closer look at the updates and the motivation behind them.
Nuances of The Neural Speech AI Watermark
Our neural AI watermarker serves a similar function to the visual watermark used in a customer’s video content library. The watermark pictured below indicates that the video is owned and created by Resemble AI and belongs to our content library. With audio, however, we don’t want to alter or degrade quality by embedding intrusive data. PerTh is an inaudible neural watermarker: it embeds its watermark data into the audio file in a way that is imperceptible to listeners and difficult to detect. The intention is to identify whether audio data has been manipulated, in order to protect a customer’s IP catalog from deepfake AI voice content. Because the watermark is unnoticeable to the user, Resemble-generated AI audio files maintain their original quality. For a technical background on PerTh’s functionality, please read our blog post here.
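PerTh itself is a learned neural embedder guided by psychoacoustic masking, and its internals aren’t reproduced here. As a rough intuition for how an inaudible watermark can work, the toy sketch below (all names and parameters are hypothetical) uses a classic spread-spectrum approach: a key-derived pseudorandom carrier is added at an amplitude far below the host signal’s loudness, so the perturbation is negligible to a listener but recoverable by anyone who knows the key.

```python
import numpy as np

def embed_watermark(audio: np.ndarray, key: int, strength: float = 0.001) -> np.ndarray:
    """Toy spread-spectrum watermark (illustrative only, not PerTh).

    A pseudorandom +/-1 carrier derived from `key` is scaled relative to
    the host signal's RMS loudness and added to the audio. With a small
    `strength`, the change is far below audibility for typical content.
    """
    rng = np.random.default_rng(key)
    carrier = rng.choice([-1.0, 1.0], size=audio.shape)   # key-derived pattern
    scale = strength * np.sqrt(np.mean(audio ** 2) + 1e-12)  # tie amplitude to host loudness
    return audio + scale * carrier
```

A real system would shape the carrier’s energy per frequency band using a psychoacoustic masking model rather than a flat global scale; this sketch only illustrates the “inaudibly small, key-recoverable perturbation” idea.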
Watermarkers are used to identify the content creator.
AI Security: Lagging Enterprise Adoption of Generative AI
Built on the principles of psychoacoustics, the study of human sound perception, the PerTh watermarker was originally developed by our team to tag Resemble-generated AI voice content for internal purposes. With enterprise adoption of generative AI lagging due to concerns over AI data privacy and IP catalog copyright infringement, productizing the watermarker became a civic obligation of sorts. Being closely connected with the engineering community, we see that engineering leaders are uncomfortable with their teams deploying OpenAI’s ChatGPT, or even GitHub Copilot, at scale until their AI data privacy concerns are addressed.
EU AI Act: AI Safety Regulation
In parallel to our neural watermarker’s efforts around IP catalog copyright infringement, the EU AI Act will require that companies leveraging AI share their data sources. This would affect creators of LLMs (large language models) such as Google (Bard), OpenAI (ChatGPT), and Microsoft (Bing), who would have to assess and mitigate various AI data privacy risks before their AI tools could be made publicly available. The most impactful requirement of this regulation is that these companies disclose “the use of training data protected under copyright law”. At this juncture, many of these companies are not sharing this information for fear of legal action over IP copyright infringement, which has been stirring up a debate on the ethics of AI.
In complementary efforts, individual companies are proactively speaking out about data privacy and raising concerns about AI ethics. Stack Overflow wants to protect its community’s forum data, which is used to answer technical questions. Policing the internet is a very difficult task, and data attribution is virtually impossible: there is currently no way for Stack Overflow to prove that an AI model used its data during training. In the case of audio data, we invite enterprises to protect their IP by leveraging our Neural Speech Watermarker solution for AI fraud detection. Ironically, AI fraud detection is itself a positive use case of artificial intelligence to circumvent AI misuse; AI safety tools like PerTh fight AI fraud with AI security. Similar to Apple’s data privacy efforts, it’s our firm belief that enterprises should have control over who uses their voice data. Without legal consent, AI models such as LLMs shouldn’t be able to train on company data.
AI Fraud Detection: Improved AI Watermark Detection & Persistence
The significance of these concerns has spurred the recent improvements to our AI watermarker. Today it becomes a complete solution to thwart copyright infringement against our customers’ content libraries. The recent updates allow us to identify precisely where the watermark has been embedded within an audio file. This provides more clarity on the origin of an audio clip and efficiently verifies whether it has been tampered with. A greater understanding of the watermark’s location within the audio data provides an additional layer of AI fraud prevention and ensures that any modification to the audio can be detected quickly and efficiently.
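PerTh’s detector is a trained neural network and its method of localizing the watermark isn’t public. To make the localization idea concrete, here is a hypothetical sketch that continues the toy spread-spectrum example above: the clip is scanned frame by frame, and each frame is correlated against the key-derived carrier for that position, so watermarked regions stand out while untouched or tampered regions do not.

```python
import numpy as np

def locate_watermark(audio: np.ndarray, key: int,
                     frame: int = 16000, threshold: float = 0.05):
    """Toy frame-wise watermark localization (illustrative only, not PerTh).

    Regenerates the key-derived +/-1 carrier, then computes the normalized
    correlation between each audio frame and the matching carrier slice.
    Frames whose correlation exceeds `threshold` are flagged as watermarked.
    Returns a list of (frame_start, is_watermarked, correlation) tuples.
    """
    rng = np.random.default_rng(key)
    carrier = rng.choice([-1.0, 1.0], size=audio.shape[0])
    results = []
    for start in range(0, len(audio) - frame + 1, frame):
        a = audio[start:start + frame]
        c = carrier[start:start + frame]
        corr = float(a @ c) / (np.linalg.norm(a) * np.linalg.norm(c) + 1e-12)
        results.append((start, corr > threshold, corr))
    return results
```

If an attacker replaces or re-records a section of a watermarked file, the correlation collapses in exactly those frames, which is one simple way “where has this been tampered with?” can fall out of “where is the watermark?”.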
To further cement the robustness of our AI watermark, we’ve taken another giant step forward: the watermark now persists through training. Simply put, audio carrying Resemble AI’s watermark can be used to train another speech synthesis model, and the watermark remains present in the resulting audio data. This new capability not only reinforces the durability of our watermark but also facilitates better tracking and AI fraud detection, and it enhances our ability to mitigate the risk of deepfake AI voice content. Our machine learning model is becoming increasingly adept at distinguishing the nuanced differences between real and deepfake AI voices. These developments build on the AI security and reliability of our platform.
How Does The Neural Speech Watermarker Help Customers?
What does this mean for our enterprise customers? Not only do we tag all AI voice content generated in the Resemble AI app, but we can also tag any piece of audio content, regardless of its origin. This means audio files from competing generative voice AI companies can be tagged and tracked. Once an audio file is tagged with our PerTh neural watermarker, our customers have the assurance that they retain control over their audio data.
If a nefarious actor were to scrape audio data from Audible.com to train an AI model on Audible’s audiobook catalog, Audible would have no way to tell that its data was being used unethically. With our AI watermarker solution, we encourage Audible to watermark its entire audio content library. If they suspected a third party was infringing on their IP, we could analyze the audio data and identify whether our watermark was present, validating the authenticity of the audio file. Below is a simplified visual representation of the PerTh watermark’s persistence after another speech synthesis model is trained on watermarked audio.
PerTh Neural Speech Watermarker in action. Please note that this is just for visualization purposes.
PerTh: Evolving For Advanced AI Security Against AI Misuse
We believe that the evolution of our PerTh watermarker is vital to maintaining the integrity of AI-generated voice content and preventing the proliferation of deepfake AI voice. With the updates to precision watermark embedding and the watermark’s persistence through training, we’re further bolstering our deepfake detection against AI misuse. Not only will all future Resemble-generated voice AI content include the PerTh watermarker, but we’re also able to embed it into your existing audio content library. If you’re interested in a walkthrough of PerTh by a Resemble team member, please click the button below to schedule a demo.