Beast code in Giters

Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.

Language: Jupyter Notebook | License: MIT | 20513 860 155

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | 19791 156 1505

deepmind-research

This repository contains implementations and illustrative code to accompany DeepMind publications

Language: Jupyter Notebook | License: Apache-2.0 | 13148 325 321

Awesome-Multimodal-Large-Language-Models

Latest Advances on Multimodal Large Language Models

12251 271 115

unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

Language: HTML | License: Apache-2.0 | 8836 57 1117

speechbrain

A PyTorch-based Speech Toolkit

Language: Python | License: Apache-2.0 | 8742 134 1093

LWM

Large World Model With 1M Context

Language: Python | License: Apache-2.0 | 7115 66 71

Baichuan-7B

A large-scale 7B pretraining language model developed by BaiChuan-Inc.

Language: Python | License: Apache-2.0 | 5669 67 128

mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

Language: Python | License: NOASSERTION | 5489 114 656

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | 4937 49 442

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | 4529 59 156

ms-swift

Use PEFT or Full-parameter to finetune 350+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)

Language: Python | License: Apache-2.0 | 3891 22 1179

NExT-GPT

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

Language: Python | License: BSD-3-Clause | 3254 57 101

MAE-pytorch

Unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners

Language: Python | 2593 24 96

i-Code

Language: Jupyter Notebook | License: MIT | 1670 40 74

DI-star

An artificial intelligence platform for StarCraft II with large-scale distributed training and grand-master agents.

Language: Python | License: Apache-2.0 | 1221 18 26

SALMONN

SALMONN: Speech Audio Language Music Open Neural Network

Language: Python | License: Apache-2.0 | 1018 26 49

Whisper-Finetune

Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment

Language: C | License: Apache-2.0 | 849 8 90

shiyuzh2007

shiyuzh2007's starred repositories

stable-diffusion-webui

pytorch

whisper

stable-diffusion

llama

bark

Open-Sora

reinforcement-learning