darwintree / better-n46-whisper

README

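better-n46-whisper is a tool project built on whisperX that provides preprocessing support for audio and video work such as cross-language subtitle production. The project aims to be a superset of the no-longer-maintained N46Whisper, adding the following features on top of its functionality: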

  • run in a local environment or in Colab
  • batch translation
    • better context
    • lower token consumption
  • multiple LLM translation backends
    • Remote: ChatGLM
    • Local: ChatGLM, Baichuan
  • speaker tag support
    • better context for LLM input
    • different subtitle output styles for different users
  • GUI (low priority)
  • async translation requests
  • LLM-based translation alignment
  • karaoke support
    • raw: uvr5 preprocess
    • lrc
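
As a rough illustration of the batch translation and speaker tag items above, the sketch below groups subtitle segments into fixed-size batches and builds one prompt per batch, so the instruction text is sent once per batch and the LLM sees neighbouring lines as context. The `Segment` class, `batched`, and `build_prompt` are hypothetical names used only for this example and are not taken from this project or from whisperX.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Segment:
    """Hypothetical subtitle segment; not the project's actual data model."""
    start: float   # start time in seconds
    end: float     # end time in seconds
    speaker: str   # speaker tag, e.g. "A"
    text: str      # transcribed source text


def batched(segments: list[Segment], size: int) -> Iterator[list[Segment]]:
    """Yield consecutive groups of at most `size` segments."""
    for i in range(0, len(segments), size):
        yield segments[i:i + size]


def build_prompt(batch: list[Segment], target_lang: str = "Chinese") -> str:
    """Build a single prompt for a whole batch.

    Each line is numbered and carries its speaker tag, so the reply can be
    mapped back to the segments by number.
    """
    header = (
        f"Translate each numbered line into {target_lang}. "
        "Keep the numbering and the speaker tags."
    )
    lines = [
        f"{i}. [{seg.speaker}] {seg.text}"
        for i, seg in enumerate(batch, start=1)
    ]
    return "\n".join([header, *lines])
```

Calling `build_prompt` on each batch from `batched(segments, size)` gives one request per batch instead of one per line; the batch size is an assumption to tune against the context limit of whichever backend is used.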

This project is in early development.

TODOs other than desired features

  • better speaker split
  • better line width style
  • contribute code back to whisperX
  • language support other than Japanese

Initial Roadmap

  1. WhisperX and LLM support with practical documentation, runnable in a local environment or in Colab
  2. Port the features of the N46Whisper ipynb notebook
  3. async LLM translation
  4. more remote LLM support
  5. local LLM support
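
For roadmap item 3, async LLM translation could look roughly like the sketch below, which sends several translation requests concurrently with `asyncio` and caps the number in flight with a semaphore. `translate_all`, `fake_translate`, and `max_concurrency` are hypothetical names for illustration; `fake_translate` only simulates a backend and does not call ChatGLM or any other real service.

```python
import asyncio
from typing import Awaitable, Callable

# Any coroutine that maps a prompt to a translated string (assumed interface).
TranslateFn = Callable[[str], Awaitable[str]]


async def translate_all(
    prompts: list[str],
    translate: TranslateFn,
    max_concurrency: int = 4,
) -> list[str]:
    """Translate all prompts concurrently, returning results in input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(prompt: str) -> str:
        # At most `max_concurrency` requests run at the same time.
        async with semaphore:
            return await translate(prompt)

    # gather preserves the order of its arguments in the result list.
    return await asyncio.gather(*(worker(p) for p in prompts))


async def fake_translate(prompt: str) -> str:
    """Placeholder backend: waits briefly, then echoes the prompt."""
    await asyncio.sleep(0.01)
    return f"<translated:{prompt}>"


if __name__ == "__main__":
    results = asyncio.run(translate_all(["a", "b", "c"], fake_translate, 2))
    print(results)
```

Keeping the backend behind a single coroutine type means a remote or local LLM client could be swapped in without touching the scheduling code, assuming that client exposes an awaitable call.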

Languages

Python 65.9%, Jupyter Notebook 34.1%