Mastering the Art of Emotional Dubbing: A Deep Dive into EmoDubber
Have you ever watched a dubbed movie where the voice acting felt completely detached from the actor’s face? Perhaps the lips stopped moving, but the voice kept going, or the character on screen was screaming in rage while the dubbed voice sounded mildly annoyed. This disconnect breaks immersion instantly. This challenge falls under the domain of Visual Voice Cloning (V2C). The goal is to take a text script, a video clip of a speaker, and a reference audio track, and then generate speech that matches the video’s lip movements while cloning the reference speaker’s voice. ...