SadTalker

⭐ 13.9k · Apache-2.0 · Python · 3.0.0

Classic digital human generation: one photo + audio = a realistic talking video, with 3D motion coefficient learning

📋 Info

GitHub Stars: ⭐ 13.9k
License: Apache-2.0
Language: Python
Version: 3.0.0
Updated: 2026-03-20

📖 Overview

SadTalker is a widely used open-source project for AI digital humans, with 13.9k GitHub stars. It creates a realistic talking video from a single face photo and an audio clip. Its key innovation is 3D motion coefficient learning: it first generates 3D facial motion parameters (head pose, expression, and blinking) and then renders them into 2D video, which produces natural-looking motion rather than a stiff animation. It also supports a full-body mode. A Gradio WebUI makes it easy to use, so it suits creators who need to produce talking-avatar videos quickly.

✨ Features

  • One photo + audio = a talking video
  • 3D motion coefficient learning – natural results
  • Fully automatic head pose, expression, and blinking control
  • Full-body mode (head + upper body)
  • Gradio WebUI for easy use

🚀 Quick Start

git clone https://github.com/OpenTalker/SadTalker.git
cd SadTalker
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -r requirements.txt
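
After installation (and after downloading the pretrained checkpoints as described in the project README), a video is typically generated by running the repository's inference.py on one photo and one audio file. The sketch below only assembles that command line so the options are visible in one place. The flag names (--source_image, --driven_audio, --result_dir, --still, --preprocess full, --enhancer) are taken from the project README as recalled here and should be checked against the version you cloned. The helper name build_sadtalker_command and the file names face.png and speech.wav are hypothetical.

```python
import shlex
import sys
from pathlib import Path


def build_sadtalker_command(source_image, driven_audio, result_dir="results",
                            full_body=False, enhancer=None):
    """Return the argv list for SadTalker's inference.py (nothing is executed)."""
    cmd = [
        sys.executable, "inference.py",
        "--source_image", str(Path(source_image)),   # the single face photo
        "--driven_audio", str(Path(driven_audio)),   # the speech audio clip
        "--result_dir", str(Path(result_dir)),       # where the video is written
    ]
    if full_body:
        # Assumed flags for full-image output: keep the original head pose
        # and paste the animated face back into the uncropped picture.
        cmd += ["--still", "--preprocess", "full"]
    if enhancer:
        # Optional face enhancer, e.g. "gfpgan" (assumed to be installed).
        cmd += ["--enhancer", enhancer]
    return cmd


if __name__ == "__main__":
    cmd = build_sadtalker_command("face.png", "speech.wav",
                                  full_body=True, enhancer="gfpgan")
    # Print the command for review instead of running it.
    print(shlex.join(cmd))
    # To run it for real, from the root of the cloned repository:
    #   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids quoting problems when file paths contain spaces, and printing it first makes it easy to confirm the options before starting a long GPU job.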

🔗 Related Tools