coder4nlp's starred repositories
chinese-llm-benchmark
Chinese LLM capability leaderboard: currently covers 106 large models, spanning commercial models such as ChatGPT, GPT-4o, Baidu ERNIE Bot (Wenxin Yiyan), Alibaba Tongyi Qianwen, iFLYTEK Spark, SenseTime SenseChat, and MiniMax, as well as open-source models such as Baichuan, Qwen2, GLM-4, Yi, InternLM2, and Llama 3, evaluated across multiple capability dimensions. Provides not only capability score leaderboards but also the raw outputs of every model!
Table-LLaVA
Dataset and code for our ACL 2024 paper "Multimodal Table Understanding". We propose the first large-scale multimodal instruction-tuning (IFT) and pre-training dataset for table understanding and develop a generalist tabular MLLM named Table-LLaVA.
Open-LLaVA-NeXT
An open-source implementation for training LLaVA-NeXT.
Make-A-Scene
PyTorch implementation of Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors
torchtitan
A native PyTorch library for large model training
LiveSum-TTT
Code and datasets for the paper "Text-Tuple-Table: Towards Information Integration in Text-to-Table Generation via Global Tuple Extraction"
Vary-tiny-600k
The Vary-tiny codebase built upon LAVIS (for training from scratch), plus a PDF image-text pair dataset (about 600k pairs, English and Chinese)
prismatic-vlms
A flexible and efficient codebase for training visually-conditioned language models (VLMs)
AdvancedLiterateMachinery
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Visual-CoT
Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning
InternLM-XComposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Awesome-Chinese-LLM
A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed at low training cost, covering base models, domain-specific fine-tuned models and applications, datasets, and tutorials.
DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
WSDM-Cup-2024
1st-place solution for the Conversational Multi-Doc QA Workshop & International Challenge @ WSDM'24 - Xiaohongshu Inc.