WhiteFu

WhiteFu's repositories

llm-paper-daily

Daily updated LLM papers. Subscriptions are welcome 👏 If you like it, please give it a star 🌟

ai-audio-startups

Community list of startups working with AI in audio and music technology

Apache-2.0

awesome-audio-plaza

Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, and audio generation

MIT

Awesome-LLMs-Datasets

A summary of existing representative text datasets for LLMs.

Apache-2.0

Awesome-Open-AI-Sora

Sora AI Awesome List – Your go-to resource hub for all things Sora AI, OpenAI's groundbreaking model for crafting realistic scenes from text. Explore a curated collection of articles, videos, podcasts, and news about Sora's capabilities, advancements, and more.

Apache-2.0

Bunny

A family of lightweight multimodal models.

Apache-2.0

ConsistI2V

ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation

MIT

FRESCO

[CVPR 2024] FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation

NOASSERTION

GenTranslate

Code for paper "GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators"

Apache-2.0

i-Code

MIT

LangSegment

A powerful multilingual (97 languages) tool for TTS that automatically recognizes and segments mixed-language text content.

Language: Python

languagecodec

Official code repository of Language-Codec

MIT

lina-speech

lina-speech: linear attention based text-to-speech

NOASSERTION

llava-phi

M2UGen

This is the official repository for M2UGen

MIT

MahaTTS

Language: Python

Apache-2.0

metavoice-src

Foundational model for human-like, expressive TTS

Apache-2.0

Open-Sora

Building your own video generation model like OpenAI's Sora

Apache-2.0

pyannote-whisper

pytorch-speech-features

NOASSERTION

pyvideotrans

Translate a video from one language to another and add dubbing.

GPL-3.0

snac

Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate

MIT

so-vits-models

A collection of models and applications related to so-vits-svc, TTS, SD, and LLMs, as well as models for text, audio, images, and video.

MIT

SoraReview

The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".

transcribe-anything

Input a local file or URL and this service will transcribe it using Whisper AI. Completely private and free 🤯🤯🤯

Language: Python

MIT

tts-qa

VoiceCraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

NOASSERTION

youtube-transcript-api

This is a Python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions.

MIT

audio-pipeline

AudioEditingCode