Shen Meng's starred repositories
Multi-Modality-Arena
Chatbot Arena meets multi-modality! Multi-Modality Arena lets you benchmark vision-language models side by side with images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!
ft-pali-gemma
Notebooks for fine-tuning PaliGemma
data_management_LLM
Collection of training data management explorations for large language models
fast-detect-gpt
Code base for "Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature".
HallE_Control
HallE-Control: Controlling Object Hallucination in LMMs
Multimodal-AND-Large-Language-Models
Paper list on multimodal and large language models, used only to record papers I read from the daily arXiv for personal reference.
prismatic-vlms
A flexible and efficient codebase for training visually-conditioned language models (VLMs)
vlm-evaluation
VLM Evaluation: Benchmark for VLMs, spanning text generation tasks from VQA to Captioning
InternLM-XComposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
GPT-SoVITS
Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
sas-data-efficient-contrastive-learning
Official repository for SAS: Data-Efficient Contrastive Learning (ICML '23)