AliceShen122's starred repositories

SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

Language: Python | MIT | 118200

Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.

Apache-2.0 | 476900

SuperPrompt

SuperPrompt is an attempt to engineer prompts that might help us understand AI agents.

461000

mini-omni

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Language: Python | MIT | 294600

VideoSys

VideoSys: An easy and efficient system for video generation

Language: Python | Apache-2.0 | 171700

VideoCrafter

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Language: Python | NOASSERTION | 452200

text-to-video-synthesis-colab

Text To Video Synthesis Colab

Language: Jupyter Notebook | Unlicense | 145200

Hotshot-XL

✨ Hotshot-XL: State-of-the-art AI text-to-GIF model trained to work alongside Stable Diffusion XL

Language: Python | Apache-2.0 | 105100

Show-1

Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation

Language: Python | NOASSERTION | 109900

generative-models

Generative Models by Stability AI

Language: Python | MIT | 2440700

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python | MIT | 1141000

Text2Video-Zero

[ICCV 2023 Oral] Text-to-Image Diffusion Models are Zero-Shot Video Generators

Language: Python | NOASSERTION | 401700

CogVideo

Text-and-image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Language: Python | Apache-2.0 | 815800

athena

An open-source implementation of a sequence-to-sequence based speech processing engine

Language: C++ | Apache-2.0 | 95300

SenseVoice

Multilingual Voice Understanding Model

Language: Python | NOASSERTION | 309800

ChatTTS

A generative speech model for daily dialogue.

Language: Python | AGPL-3.0 | 3165700

fish-speech

Brand new TTS solution

Language: Python | NOASSERTION | 1347700

ailia-models

The collection of pre-trained, state-of-the-art AI models for ailia SDK

Language: Python | 201900

GPT-SoVITS

Just 1 minute of voice data is enough to train a good TTS model (few-shot voice cloning).

Language: Python | MIT | 3440000

encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Language: Python | MIT | 347400

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Language: Python | MIT | 116700

jailbreak_llms

[CCS'24] A dataset consisting of 15,140 ChatGPT prompts from Reddit, Discord, websites, and open-source datasets (including 1,405 jailbreak prompts).

Language: Jupyter Notebook | MIT | 262400

VPTQ

VPTQ: a flexible, extreme low-bit quantization algorithm

Language: Python | MIT | 42800

MemGPT

Letta (fka MemGPT) is a framework for creating stateful LLM services.

Language: Python | Apache-2.0 | 1199600

self-llm

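"A Practical Guide to Open-Source Large Models": quickly deploy open-source large models in a Linux environment; a deployment tutorial better suited to ** beginners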

Language: Jupyter Notebook | Apache-2.0 | 851200

RAGChecker

RAGChecker: A Fine-grained Framework For Diagnosing RAG

Language: Python | Apache-2.0 | 45000

label-studio

Label Studio is a multi-type data labeling and annotation tool with a standardized output format

Language: JavaScript | Apache-2.0 | 1891500

ragas

Supercharge Your LLM Application Evaluations 🚀

Language: Python | Apache-2.0 | 696300

MMA-Diffusion

[CVPR2024] MMA-Diffusion: MultiModal Attack on Diffusion Models

Language: Python | NOASSERTION | 14000

LAION-SAFETY

An open toolbox for NSFW & toxicity detection

Language: Jupyter Notebook | 4900