AlexandaJerry

Alexanda's repositories

Voice-Recognition-to-Text-Tool-

Voice Recognition to Text Tool: a local speech-to-text service that runs offline and outputs JSON, timestamped SRT subtitles, and plain text.

Language: Python, License: GPL-3.0

auto_labeling_for_BERT_VITS2

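This project handles data preprocessing. The first step processes the collected audio, using FunASR timestamps to remove empty background segments. It also covers the labeling done before the data is fed to BERT.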

Automatic_Speech_Annotator

Automatic speech annotator that processes speech with voice activity detection, overlapping speech detection, speaker diarization, and automatic speech recognition.

bulk_transcribe_youtube_videos_from_playlist

Easily take an entire YouTube playlist and turn it into high quality transcripts using Whisper.

License: MIT

ChatPaper

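Use ChatGPT to summarize arXiv papers. Speeds up the whole research workflow by using ChatGPT for full-paper summarization, professional translation, polishing, peer review, and review responses.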

Language: Python, License: NOASSERTION

Chenyme-AAVT-

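A fully automatic (audio) video translation project. It uses Whisper to recognize speech, a large AI model to translate the subtitles, and then merges the subtitles into the video to produce the translated video.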

Language: Python, License: MIT

ctc-forced-aligner

Text to speech alignment using CTC forced alignment

Dataset_Generator_For_VITS

A VITS dataset generation tool that converts video into short audio clips, based on DAMO Academy video cutting technology.

Language: Shell, License: MIT

ears_dataset

Expressive Anechoic Recordings of Speech (EARS)

License: NOASSERTION

emotion2vec

Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

License: Apache-2.0

faster-whisper-GUI

faster_whisper GUI with PySide6

Language: Python, License: AGPL-3.0

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python, License: MIT

label-studio

Label Studio is a multi-type data labeling and annotation tool with standardized output format

License: Apache-2.0

leedl-tutorial

Hung-yi Lee Deep Learning Tutorial (recommended by Prof. Hung-yi Lee 👍). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases

License: NOASSERTION

MakeDiffSinger

Pipelines and tools to build your own DiffSinger dataset.

License: BSD-3-Clause

MediaCrawler

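Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, and Weibo posts and comments.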

License: Apache-2.0

ParaClipper

An automated video clipping tool based on FunASR, a high-accuracy open-source speech recognition model.

License: MIT

Pink-Trombone

A programmable version of Neil Thapen's Pink Trombone

License: GPL-3.0

PyQt-Fluent-Widgets

A fluent design widgets library based on C++ Qt/PyQt/PySide. Make Qt Great Again.

License: GPL-3.0

pyvideotrans

Translate a video from one language to another and add dubbing.

License: GPL-3.0

SpeechTasks

This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent speech tool development, and speech applications.

TTS-for-GPT-soVITS

A simple TTS backend project based on https://github.com/RVC-Boss/GPT-SoVITS that provides some inference optimization features.

Language: Python

upload_only

whisper-web

ML-powered speech recognition directly in your browser

Whisper-WebUI

A Web UI for easy subtitle generation using the Whisper model.

License: Apache-2.0

X-AnyLabeling

Effortless data labeling with AI support from Segment Anything and other awesome models.

License: GPL-3.0

Voice-Recognition-to-Text-Tool-

auto_labeling_for_BERT_VITS2

Automatic_Speech_Annotator

bulk_transcribe_youtube_videos_from_playlist

CapsWriter-Offline

ChatPaper

Chenyme-AAVT-

ctc-forced-aligner

Dataset_Generator_For_VITS

ears_dataset

emotion2vec

EmotiVoice

faster-whisper-GUI

Fricative_analysis

Galgame-Engine-Collect

GPT-SoVITS

label-studio

leedl-tutorial

MakeDiffSinger

MediaCrawler

ParaClipper

Pink-Trombone

PyQt-Fluent-Widgets

pyvideotrans

SpeechTasks

TTS-for-GPT-soVITS

upload_only

whisper-web

Whisper-WebUI

X-AnyLabeling