Leading Voice AI, Watermarking and Multimodal Deepfake Detection Models

RESEMBLE AI MODELS

Our model portfolio

Production-grade models for watermarking and deepfake detection, plus additional voice research models. Available via API or on-premises deployment.



Verify — Watermarking

PerTh

VERIFY

A deep neural network that embeds imperceptible, psychoacoustically masked watermarks into audio at the point of creation. The watermark survives MP3 compression, audio editing, noise, and codec transforms. Embedded in every Chatterbox output by default. MIT-licensed and available as open source.

99.9% decode accuracy

Survives compression

Audio-only watermarking

Open source and free
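To make the embed-and-decode idea concrete, here is a minimal sketch of a classic spread-spectrum watermark. This is not PerTh's algorithm, which is a neural network with psychoacoustic masking; the function names, the key seed, the strength value, and the use of Gaussian noise as stand-in audio are all assumptions chosen for illustration.

```python
import random


def make_key(length, seed):
    """Pseudo-random +/-1 sequence shared by the embedder and the decoder."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed_bit(samples, key, bit, strength=0.01):
    """Add the key (bit=1) or its negation (bit=0) at low amplitude."""
    sign = 1.0 if bit else -1.0
    return [s + sign * strength * k for s, k in zip(samples, key)]


def decode_bit(samples, key):
    """Correlate with the key; the sign of the correlation gives the bit."""
    correlation = sum(s * k for s, k in zip(samples, key))
    return 1 if correlation > 0 else 0


# Stand-in "audio": Gaussian noise rather than a real recording (assumption).
rng = random.Random(0)
host = [rng.gauss(0.0, 0.1) for _ in range(20000)]
key = make_key(len(host), seed=42)

marked = embed_bit(host, key, bit=1)
# Simulate a lossy channel by adding further noise to the marked signal.
degraded = [s + rng.gauss(0.0, 0.05) for s in marked]
recovered = decode_bit(degraded, key)
```

The per-sample change is small relative to the host signal, yet the correlation over many samples still recovers the bit after noise is added. That is the general intuition behind watermarks that survive compression and editing, even though production systems use far more sophisticated methods.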


PerTh Multimodal

VERIFY

Enterprise and compliance-oriented multimodal model, extending PerTh's imperceptible, tamper-resistant watermarking across audio, video, image, and text in a single API. Each modality uses a purpose-built algorithm, with robustness tuned against real-world transformations. Supports explicit marks (custom encodes) for audio, image, and video.

Audio, image, video and text

Real-time encode and decode

API available

EU AI Act Article 50 oriented


Resemblyzer

OPEN SOURCE

Deep learning voice encoder that derives a high-dimensional speaker representation from a few seconds of audio. Used for voice authentication, speaker diarization, and similarity scoring.



Speaker embedding



Voice similarity



Diarization
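Similarity scoring on speaker embeddings is commonly done with cosine similarity. The sketch below shows that calculation on toy vectors. The four-dimensional embeddings, the `same_speaker` helper, and the 0.75 threshold are hypothetical; real embeddings would come from the encoder and a real threshold would be tuned on labeled data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(a, b, threshold=0.75):
    """Hypothetical decision rule: accept when similarity clears a threshold."""
    return cosine_similarity(a, b) >= threshold


# Toy embeddings (assumption): two clips close together, one far away.
clip_1 = [0.9, 0.1, 0.3, 0.2]
clip_2 = [0.8, 0.2, 0.35, 0.15]
clip_3 = [-0.2, 0.9, -0.1, 0.4]

print(same_speaker(clip_1, clip_2))
print(same_speaker(clip_1, clip_3))
```

The same score can drive voice authentication (compare against an enrolled embedding) or diarization (cluster segment embeddings by similarity).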




Detect — Deepfake detection

DETECT-3B Omni

NEW

3-billion-parameter multimodal detection model. The only deepfake detector covering audio, image, and video in a single unified architecture. Zero-day coverage for new generative models in under an hour.



#1 benchmarked detection



<300ms detection



3B parameters



160+ models



51 languages



On-prem


Resemble Intelligence

EXPLAINABILITY LAYER

Know why, not just what. Intelligence enhances DETECT-3B Omni's output with human-readable forensic explanations in real time — surfacing which artifacts triggered a flag and why. Built for compliance teams, legal review, and trust & safety workflows that need evidence, not just a score.



Human-readable reports



Real-time



Audit-ready



Built for compliance


DEEPFAKE DETECTION COVERAGE

Models we detect

DETECT-3B Omni is battle-tested against 160+ generative AI models. Coverage spans every major audio, image, and video generator — with zero-day updates when new models launch.



Audio

Model             Details               Status
ElevenLabs        Text-to-speech        Covered
OpenAI TTS        TTS-1 / TTS-1 HD      Covered
Azure TTS         Microsoft Neural TTS  Covered
Google TTS        WaveNet / Neural2     Covered
AWS Polly         Neural / Standard     Covered
Suno              AI music generation   Covered
Udio              AI music generation   Covered
PlayHT            Text-to-speech        Covered
Murf              Text-to-speech        Covered
Descript          Overdub / TTS         Covered

Plus: Bark, VALL-E, VALL-E2, YourTTS, Tortoise TTS, XTTS, StyleTTS 2, MetaVoice, Kokoro, VoiceBox, NaturalSpeech 3, HierSpeech++, and over 100 more.

Image

Model             Details               Status
DALL-E 3          OpenAI                98% accuracy
Midjourney        v6 / v7               98% accuracy
Stable Diffusion  SDXL / SD 3.5         94% accuracy
Nano Banana       Image generation      Covered
Flux              Black Forest Labs v2  99% accuracy
Gemini            Imagen 3 / 2.0 Flash  99% accuracy
GPT-4o            OpenAI image gen      99% accuracy
StyleGAN          v2 / v3               >99% accuracy
Ideogram          v2 / v3               Covered
Leonardo AI       Image generation      Covered

Plus: Stable Diffusion XL, Latent Diffusion, eDiff-I, GLIDE, Kandinsky, DeepFloyd IF, Playground v3, Grok Aurora, Recraft v3, Lumina Image, Emu (Meta), and more.

Video

Model             Details               Status
Sora              OpenAI                Covered
Veo               Google DeepMind       >99% accuracy
Runway            Gen-3 Alpha           Covered
HeyGen            Avatar video          Covered
Pika              Video generation      Covered
Kling             Kuaishou              Covered
Seedance          ByteDance             Covered
Synthesia         AI avatar video       Covered
D-ID              Talking avatar        Covered
Stable Diffusion  Overdub / TTS         Covered

Plus: CogVideoX, Wan 2.1, Open-Sora, ModelScope, VideoCrafter, AnimateDiff, VideoPoet, Make-A-Video, Gen-2, Morph Studio, Mochi 1, HunyuanVideo.



Voice AI research

Chatterbox

MIT OPEN SOURCE

Production-grade TTS with zero-shot voice cloning. Outperforms ElevenLabs in blind evaluations. First open-source model with emotion exaggeration control.



Sub-200ms



Voice cloning



Emotion control



PerTh watermarked


Chatterbox Pro

ENTERPRISE

Enterprise tier with custom fine-tuning on brand vocabulary, SLAs, guaranteed uptime, sub-200ms streaming, and advanced watermarking and detection.



Custom fine-tuning



SLA



On-prem



23 languages


Chatterbox Turbo

NEW

350M-parameter architecture optimized for voice agents, with a 1-step decoder and native paralinguistic tags: [cough], [laugh], [chuckle]. Built for latency-critical production.



Real-time



350M params



Paralinguistic



ONNX available
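When scripts for a voice agent are assembled programmatically, it can help to check bracketed tags before sending text for synthesis. The sketch below is a hypothetical pre-flight check, not part of any Chatterbox API: `KNOWN_TAGS` holds only the three tags named above, and the model may accept others.

```python
import re

# Only the tags named in the description; the full supported set is an assumption.
KNOWN_TAGS = {"cough", "laugh", "chuckle"}
TAG_PATTERN = re.compile(r"\[([a-z]+)\]")


def find_tags(text):
    """Return bracketed lowercase tags in the order they appear."""
    return [match.group(1) for match in TAG_PATTERN.finditer(text)]


def unknown_tags(text):
    """Return tags that are not in KNOWN_TAGS, so a caller can warn or reject."""
    return [tag for tag in find_tags(text) if tag not in KNOWN_TAGS]


script = "Sorry about that [cough] where was I? Oh right [chuckle] the invoice."
print(find_tags(script))
print(unknown_tags("Bless you [sneeze]"))
```

Catching an unrecognized tag before synthesis avoids having it read aloud as literal text or silently dropped, whichever the model happens to do.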


Chatterbox Multilingual

23-language TTS with voice cloning across Arabic, Chinese, German, French, Hindi, Japanese, Korean, Spanish, and 15 more. MIT licensed.



23 languages



Voice cloning



MIT licensed


DramaBox

MIT OPEN SOURCE

Describe the speaker, the scene, and the delivery, and DramaBox generates the performance. Every expressive output is watermarked for provenance.



Natural language prompts



Secure by design



Super expressive output



Watermarked at creation


AVAILABLE ON-PREM

Run every model inside your perimeter — or in the cloud

Resemble AI models for watermarking and deepfake detection are available for fully air-gapped on-premises deployment. No telemetry. No external API calls. Your data stays where it should.



Model access

Chatterbox and DETECT-3B Omni with full model weights on your GPUs.



Real-time analysis

Real-time audio, image, and video analysis behind your firewall.



Zero dependencies

Kubernetes and local Python packages with no cloud dependencies.



Compliance ready

Meets EU AI Act, HIPAA, SOC 2, and financial services data residency requirements.

Secure with open foundational models
