GPT-SoVITS

⭐ 58.5k MIT Python 3.2.0

A favorite of the Chinese TTS community: clone a voice from a 5-second sample, with commercial use permitted under the MIT license

📋 Info

GitHub Stars: ⭐ 58.5k
License: MIT
Language: Python
Version: 3.2.0
Updated: 2026-06-01

📖 Overview

GPT-SoVITS is one of the most active projects in the Chinese AI voice cloning community, with about 58.5k GitHub stars. It performs zero-shot voice cloning from a 5-second audio sample, and fine-tuning on about 1 minute of audio noticeably improves voice similarity. It supports five languages: Chinese, English, Japanese, Korean, and Cantonese. The one-click WebUI package bundles tools for vocal separation, audio slicing, ASR-based annotation, and more. Inference requires only 4 GB of GPU memory. The MIT license explicitly permits commercial use. It suits individual creators, VTubers, audiobook narrators, and anyone who needs voice dubbing for their content.
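Since cloning starts from a 5-second sample, it can be handy to confirm a reference clip's length before using it. The following is a minimal sketch using only Python's standard `wave` module. It assumes an uncompressed PCM WAV file; the function names and the 5-second default are illustrative and are not part of GPT-SoVITS.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration = number of audio frames / frames per second.
        return wav.getnframes() / wav.getframerate()


def is_long_enough(path, min_seconds=5.0):
    """Hypothetical helper: True if the clip lasts at least min_seconds."""
    return wav_duration_seconds(path) >= min_seconds
```

For example, `is_long_enough("reference.wav")` would report whether a clip named `reference.wav` (a placeholder filename) reaches the 5-second mark.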

✨ Features

  • Zero-shot voice cloning from a 5-second audio clip
  • 5 languages: Chinese, English, Japanese, Korean, and Cantonese
  • One-click installer package (beginner-friendly)
  • Inference requires only 4 GB of GPU memory
  • MIT License: explicitly permits commercial use


🚀 Quick Start

pip install -r extra-req.txt --no-deps
pip install -r requirements.txt
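
Before installing, it may be worth checking whether the local GPU meets the 4 GB inference figure quoted in this listing. The sketch below is one possible way to do that from Python's standard library by reading `nvidia-smi` output. It assumes an NVIDIA GPU with `nvidia-smi` on the PATH and purely numeric output lines; the helper names and the 4096 MiB threshold are illustrative and are not part of GPT-SoVITS.

```python
import shutil
import subprocess

# Assumption: treat the "4 GB" figure from this listing as 4096 MiB.
REQUIRED_MIB = 4 * 1024


def parse_memory_mib(smi_output):
    """Parse one total-memory value (in MiB) per non-empty output line."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def gpus_meeting_requirement(smi_output, required_mib=REQUIRED_MIB):
    """Return the indices of GPUs whose total memory is at least required_mib."""
    return [
        index
        for index, mib in enumerate(parse_memory_mib(smi_output))
        if mib >= required_mib
    ]


def query_nvidia_smi():
    """Return nvidia-smi's total-memory report, or None if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    output = query_nvidia_smi()
    if output is None:
        print("nvidia-smi not found or failed; cannot check GPU memory")
    else:
        print("GPUs with at least 4 GB:", gpus_meeting_requirement(output))
```

Total memory is only a rough guide: other processes may already occupy part of it, and some cards report slightly less than their nominal capacity, so a borderline result deserves a manual look.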

🔗 Related Tools