mazicwong

PKU

China

Organizations

jnutxdy

Zhiqi Huang's starred repositories

OSC8-Adoption

List of terminal emulators that support hyperlinks (OSC 8 escape sequences).

Language: Markdown, License: CC0-1.0, Stars: 89

LLaVA-HR

LLaVA-HR: High-Resolution Large Language-Vision Assistant

Language: Python, License: Apache-2.0, Stars: 177

mmc4

MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text.

Language: Python, License: MIT, Stars: 876

efficientvit

EfficientViT is a new family of vision models for efficient high-resolution vision.

Language: Python, License: Apache-2.0, Stars: 1528

TinyLLaVA_Factory

A Framework of Small-scale Large Multimodal Models

Language: Python, License: Apache-2.0, Stars: 415

pillow-simd

The friendly PIL fork

Language: Python, License: NOASSERTION, Stars: 2102

LLaVA-JP

LLaVA-JP is a Japanese VLM trained with the LLaVA method.

Language: Python, License: Apache-2.0, Stars: 36

Long-CLIP

Language: Python, Stars: 366

awesome-audio-plaza

Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, and audio generation.

License: MIT, Stars: 211

rho

Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.

License: MIT, Stars: 267

RapidLaTeXOCR

Formula recognition based on LaTeX-OCR and ONNXRuntime.

Language: Python, License: MIT, Stars: 238

moviepy

Video editing with Python

Language: Python, License: MIT, Stars: 11985

EasyContext

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Language: Python, License: Apache-2.0, Stars: 493

shap

A game theoretic approach to explain the output of any machine learning model.

Language: Jupyter Notebook, License: MIT, Stars: 21937

aider

aider is AI pair programming in your terminal

Language: Python, License: Apache-2.0, Stars: 11445

Awesome-Pruning

A curated list of neural network pruning resources.

attorch

A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.

Language: Python, License: MIT, Stars: 411

GitHub-Chinese-Top-Charts

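:cn: GitHub Chinese Top Charts: separate "Software | Resources" rankings for each language, pinpointing good Chinese-language projects. Take what you need and learn efficiently.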

Language: Java, License: NOASSERTION, Stars: 92330

ALLaVA

Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model

Language: Python, License: Apache-2.0, Stars: 196

InternVideo2

License: MIT, Stars: 170

grok-1

Grok open release

Language: Python, License: Apache-2.0, Stars: 49011

Large-Audio-Models

Keep track of big models in the audio domain, including speech, singing, music, etc.

RepCodec

Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization

Language: Python, License: NOASSERTION, Stars: 106

OpenWebMath

Language: XSLT, License: Apache-2.0, Stars: 87

kenlm

KenLM: Faster and Smaller Language Model Queries

Language: C++, License: NOASSERTION, Stars: 2427

transformer-debugger

Language: Python, License: MIT, Stars: 3932

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python, License: MIT, Stars: 19972

Languagecodec

Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models

Language: Python, License: MIT, Stars: 171

Phi2-mini-Chinese

Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG).

Language: Jupyter Notebook, License: Apache-2.0, Stars: 409

flash-linear-attention

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Language: Python, License: MIT, Stars: 605