shiyuzh2007's repositories

GPT-SoVITS

One minute of voice data is enough to train a good TTS model (few-shot voice cloning).

Language: Python | License: MIT | Stargazers: 1 | Issues: 0

self-llm

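"A Guide to Using Open-Source Large Models": quickly deploy open-source large models on AutoDL; a deployment tutorial better suited to ** beginners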

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1 | Issues: 0

3D-Speaker

A repository for single- and multi-modal speaker verification, speaker recognition and speaker diarization.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Stargazers: 0 | Issues: 0

audioldm_eval

This toolbox aims to unify audio generation model evaluation for easier comparison.

License: MIT | Stargazers: 0 | Issues: 0

AutoGPT

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

License: MIT | Stargazers: 0 | Issues: 0

AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

License: MIT | Stargazers: 0 | Issues: 0

Awesome-Multimodal-Large-Language-Models

Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.

Stargazers: 0 | Issues: 0

Bert-VITS2

VITS2 backbone with BERT

License: AGPL-3.0 | Stargazers: 0 | Issues: 0

FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models.

License: NOASSERTION | Stargazers: 0 | Issues: 0

langchain

⚡ Building applications with LLMs through composability ⚡

License: MIT | Stargazers: 0 | Issues: 0

langflow

⛓️ Langflow is a UI for LangChain, designed with react-flow to provide an effortless way to experiment and prototype flows.

License: MIT | Stargazers: 0 | Issues: 0

llama

Inference code for LLaMA models

License: NOASSERTION | Stargazers: 0 | Issues: 0

magvit

Official JAX implementation of MAGVIT: Masked Generative Video Transformer

License: Apache-2.0 | Stargazers: 0 | Issues: 0

OOTDiffusion

Official implementation of OOTDiffusion

License: NOASSERTION | Stargazers: 0 | Issues: 0

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

License: Apache-2.0 | Stargazers: 0 | Issues: 0

PALM-E

Implementation of "PaLM-E: An Embodied Multimodal Language Model"

License: Apache-2.0 | Stargazers: 0 | Issues: 0

ParroT

The ParroT framework enhances and regulates translation abilities during chat, based on open-source LLMs (e.g., LLaMA-7b, Bloomz-7b1-mt) and human-written translation and evaluation data.

Stargazers: 0 | Issues: 0

Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

License: NOASSERTION | Stargazers: 0 | Issues: 0

safe-rlhf

Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

License: Apache-2.0 | Stargazers: 0 | Issues: 0

SALMONN

SALMONN: Speech Audio Language Music Open Neural Network

License: Apache-2.0 | Stargazers: 0 | Issues: 0

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

License: NOASSERTION | Stargazers: 0 | Issues: 0

stable-diffusion-webui

Stable Diffusion web UI

License: AGPL-3.0 | Stargazers: 0 | Issues: 0

unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

vall-e

PyTorch implementation of VALL-E (zero-shot text-to-speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Video-LLaVA

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

License: Apache-2.0 | Stargazers: 0 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

License: Apache-2.0 | Stargazers: 0 | Issues: 0