Kokoro简介

Kokoro TTS 简介
Kokoro TTS是由 hexgrad 团队开发的一款轻量级、高性能的开源文本转语音(TTS)模型,以其卓越的效率和出色的语音合成质量在业界脱颖而出。特别适合在仅有CPU的环境下部署

安装CPU版本的PyTorch

pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cpu

手动安装espeak-ng依赖

下载地址: https://github.com/espeak-ng/espeak-ng/releases
在这里插入图片描述
安装完成之后需要配置一下环境变量, 否则启动的时候会报找不到espeak

$env:PHONEMIZER_ESPEAK_LIBRARY = "c:\Program Files\eSpeak NG\libespeak-ng.dll"
$env:PHONEMIZER_ESPEAK_PATH = "c:\Program Files\eSpeak NG"
setx PHONEMIZER_ESPEAK_LIBRARY "c:\Program Files\eSpeak NG\libespeak-ng.dll"
setx PHONEMIZER_ESPEAK_PATH "c:\Program Files\eSpeak NG"

其他依赖安装

pip install kokoro
pip install ordered-set
pip install cn2an
pip install pypinyin_dict

模型下载

# 设置镜像地址(可选,加速下载)
export HF_ENDPOINT=https://hf-mirror.com

# 下载模型(只需要v1.1版本)
huggingface-cli download --resume-download hexgrad/Kokoro-82M-v1.1-zh --local-dir ./ckpts/kokoro-v1.1

测试代码

import torch
import time
from kokoro import KPipeline, KModel
import soundfile as sf

# 设置设备为CPU
device = 'cpu'

# 加载声音文件
voice_zf = "zf_001"
voice_zf_tensor = torch.load(f'ckpts/kokoro-v1.1/voices/{voice_zf}.pt', weights_only=True)

# 模型路径配置
repo_id = 'hexgrad/Kokoro-82M-v1.1-zh'
model_path = 'ckpts/kokoro-v1.1/kokoro-v1_1-zh.pth'
config_path = 'ckpts/kokoro-v1.1/config.json'

# 加载模型到CPU
model = KModel(model=model_path, config=config_path, repo_id=repo_id).to(device).eval()

def speed_callable(len_ps):
    speed = 0.8
    if len_ps <= 83:
        speed = 1
    elif len_ps < 183:
        speed = 1 - (len_ps - 83) / 500
    return speed * 1.1

# 创建管道
zh_pipeline = KPipeline(lang_code='z', repo_id=repo_id, model=model)

# 测试文本
sentence = '你好,这是一个语音合成测试。'

# 语音合成
start_time = time.time()
generator = zh_pipeline(sentence, voice=voice_zf_tensor, speed=speed_callable)
result = next(generator)
wav = result.audio

# 计算处理时间
speech_len = len(wav) / 24000
print('生成语音长度: {}秒, 实时因子: {}'.format(speech_len, (time.time() - start_time) / speech_len))

# 保存音频文件
sf.write('output_cpu.wav', wav, 24000)
print('语音文件已保存为: output_cpu.wav')
Logo

智能硬件社区聚焦AI智能硬件技术生态,汇聚嵌入式AI、物联网硬件开发者,打造交流分享平台,同步全国赛事资讯、开展 OPC 核心人才招募,助力技术落地与开发者成长。

更多推荐