hzhang57's starred repositories

LLaMA-Factory

Unify Efficient Fine-Tuning of 100+ LLMs

Language: Python · License: Apache-2.0 · Stargazers: 23,730 · Issues: 161 · Issues: 3,715

ChatPaper

Use ChatGPT to summarize arXiv papers. Accelerates the entire research workflow with ChatGPT: full-paper summarization, professional translation, polishing, peer review, and review responses.

Language: Python · License: NOASSERTION · Stargazers: 17,824 · Issues: 88 · Issues: 214

peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Language: Python · License: Apache-2.0 · Stargazers: 14,487 · Issues: 107 · Issues: 920
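PEFT methods such as LoRA freeze the pretrained weights and learn only a small low-rank update. A minimal NumPy sketch of the LoRA idea (illustrative only, not the peft library's API; the hidden size, rank, and scaling below are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 768, 8                       # hidden size, low rank (r << d)

W = rng.normal(size=(d, d))         # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                # trainable up-projection, zero-initialized
alpha = 16                          # LoRA scaling hyperparameter

def lora_forward(x):
    # base path plus low-rank update: x @ (W + (alpha / r) * B @ A).T
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.normal(size=(1, d))
# with B zero-initialized, the adapted layer matches the frozen layer exactly
assert np.allclose(lora_forward(x), x @ W.T)

# only 2*r*d parameters are trained instead of d*d
print(A.size + B.size, "trainable vs", W.size, "frozen")
```

Zero-initializing B means training starts from the unmodified pretrained model, which is why LoRA fine-tuning is stable from step one.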

ImageBind

ImageBind One Embedding Space to Bind Them All

Language: Python · License: NOASSERTION · Stargazers: 7,976 · Issues: 100 · Issues: 83

openchat

OpenChat: Advancing Open-source Language Models with Imperfect Data

Language: Python · License: Apache-2.0 · Stargazers: 5,070 · Issues: 51 · Issues: 179

Augmentor

Image augmentation library in Python for machine learning.

Language: Python · License: MIT · Stargazers: 5,038 · Issues: 125 · Issues: 194
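Augmentation libraries like this one typically chain operations that each fire with a configured probability. A self-contained sketch of that pipeline pattern (illustrative only, not Augmentor's actual API; images here are plain nested lists):

```python
import random

def flip_horizontal(img):
    # img is a row-major grid; reverse each row
    return [row[::-1] for row in img]

def rotate_90(img):
    # rotate a row-major grid 90 degrees clockwise
    return [list(row) for row in zip(*img[::-1])]

class Pipeline:
    """Chain of augmentations, each applied with some probability."""

    def __init__(self, seed=None):
        self.ops = []
        self.rng = random.Random(seed)

    def add(self, op, probability):
        self.ops.append((op, probability))

    def sample(self, img):
        for op, p in self.ops:
            if self.rng.random() < p:
                img = op(img)
        return img

pipe = Pipeline(seed=42)
pipe.add(flip_horizontal, probability=0.5)
pipe.add(rotate_90, probability=0.5)
print(pipe.sample([[1, 2], [3, 4]]))
```

Calling `sample` repeatedly yields a different random combination of augmentations each time, which is the point: one source image fans out into many training variants.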

ChatGLM-Efficient-Tuning

Fine-tuning ChatGLM-6B with PEFT | Efficient ChatGLM fine-tuning based on PEFT

Language: Python · License: Apache-2.0 · Stargazers: 3,616 · Issues: 32 · Issues: 374

opencompass

OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama 3, Mistral, InternLM2, GPT-4, LLaMA 2, Qwen, GLM, Claude, etc.) across 100+ datasets.

Language: Python · License: Apache-2.0 · Stargazers: 2,905 · Issues: 22 · Issues: 370

chain-of-thought-hub

Benchmarking large language models' complex reasoning ability with chain-of-thought prompting

Language: Jupyter Notebook · License: MIT · Stargazers: 2,409 · Issues: 38 · Issues: 34
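Chain-of-thought prompting elicits step-by-step reasoning by prepending worked examples whose answers spell out intermediate steps. A minimal sketch of few-shot prompt assembly (the exemplar and the exact prompt format are illustrative assumptions, not this hub's actual evaluation harness):

```python
# Few-shot chain-of-thought prompt assembly (illustrative).
EXEMPLARS = [
    ("Roger has 5 balls. He buys 2 cans of 3 balls each. How many balls now?",
     "Roger started with 5 balls. 2 cans of 3 balls is 6 balls. 5 + 6 = 11. The answer is 11."),
]

def build_cot_prompt(question, exemplars=EXEMPLARS):
    parts = []
    for q, a in exemplars:
        # each exemplar shows the reasoning chain, not just the final answer
        parts.append(f"Q: {q}\nA: {a}")
    # the model is expected to continue with its own step-by-step answer
    parts.append(f"Q: {question}\nA: Let's think step by step.")
    return "\n\n".join(parts)

prompt = build_cot_prompt("A farm has 3 pens of 4 pigs. How many pigs?")
print(prompt)
```

Benchmarks in this style then parse the model's continuation for a final answer (e.g. the number after "The answer is") and score it against the reference.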

Caption-Anything

Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything

Language: Python · License: BSD-3-Clause · Stargazers: 1,617 · Issues: 15 · Issues: 21

efficientvit

EfficientViT is a new family of vision models for efficient high-resolution vision tasks.

Language: Python · License: Apache-2.0 · Stargazers: 1,517 · Issues: 31 · Issues: 109

GPT-vup

GPT-vup: an AI virtual streamer (VTuber) for Bilibili and Douyin.

Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

Language: Python · License: CC-BY-4.0 · Stargazers: 1,005 · Issues: 14 · Issues: 102

multimodal-maestro

Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA or CogVLM. 🔥

Language: Python · License: MIT · Stargazers: 971 · Issues: 14 · Issues: 7

llm-reasoners

A library for advanced large language model reasoning

Language: Python · License: Apache-2.0 · Stargazers: 774 · Issues: 14 · Issues: 19

GenossGPT

One API for all LLMs, private or public (Anthropic, Llama V2, GPT-3.5/4, Vertex, GPT4All, Hugging Face, ...) 🌈🐂 Replace OpenAI GPT with any LLM in your app with one line.

docker-llama2-chat

Play with LLaMA2 (official / Chinese version / INT4 / llama2.cpp) together! Only 3 steps! (no GPU / 5 GB VRAM / 8-14 GB VRAM)

Language: Python · License: Apache-2.0 · Stargazers: 523 · Issues: 6 · Issues: 22

lightning-sam

Fine-tune Segment-Anything Model with Lightning Fabric.

Language: Python · License: Apache-2.0 · Stargazers: 439 · Issues: 12 · Issues: 47

Multi-Modality-Arena

Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

Language: Python · License: NOASSERTION · Stargazers: 347 · Issues: 21 · Issues: 10

lens

This is the official repository for the LENS (Large Language Models Enhanced to See) system.

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 340 · Issues: 8 · Issues: 3

LoRA-ViT

Low rank adaptation for Vision Transformer

Language: Python · License: GPL-3.0 · Stargazers: 305 · Issues: 4 · Issues: 5

LaCLIP

[NeurIPS 2023] Text data, code and pre-trained models for paper "Improving CLIP Training with Language Rewrites"

Language: Python · License: BSD-2-Clause · Stargazers: 236 · Issues: 8 · Issues: 8

SeViLA

[NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering

Language: Python · License: BSD-3-Clause · Stargazers: 165 · Issues: 3 · Issues: 24

ImageBind-LoRA

Fine-tuning "ImageBind One Embedding Space to Bind Them All" with LoRA

Language: Python · License: NOASSERTION · Stargazers: 160 · Issues: 3 · Issues: 9

VL-CheckList

Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations.

Language: Python · License: Apache-2.0 · Stargazers: 116 · Issues: 5 · Issues: 11

FunQA

FunQA benchmarks funny, creative, and magic videos for challenging tasks including timestamp localization, video description, reasoning, and beyond.

Language: Python · License: MIT · Stargazers: 91 · Issues: 3 · Issues: 7

Paper-Implementation-Template

A simple reproducible template to implement AI research papers

License: MIT · Stargazers: 21 · Issues: 2 · Issues: 0