Hao (TissueC)



Company: CoAI of Tsinghua University (@thu-coai)

Location: Beijing

Home Page: sunhao.site



Organizations
thu-coai

Hao's starred repositories

PaperMemory

Your browser's reference manager: automatic paper detection (arXiv, OpenReview & more), publication-venue matching, and code-repository discovery! Also enhances arXiv: BibTeX citation, Markdown link, direct download, and more!

Language: JavaScript · License: MIT · Stars: 478 · Issues: 0

GLM-4

GLM-4 series: open-source multilingual, multimodal chat LMs

Language: Python · License: Apache-2.0 · Stars: 3899 · Issues: 0

lighteval

LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally, alongside its LLM data-processing library datatrove and LLM training library nanotron.

Language: Python · License: MIT · Stars: 495 · Issues: 0

elasticsearch

Free and Open, Distributed, RESTful Search Engine

Language: Java · License: NOASSERTION · Stars: 68711 · Issues: 0

LLM-Extrapolation

Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment"

Language: Python · Stars: 56 · Issues: 0

DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Language: Python · License: MIT · Stars: 1900 · Issues: 0

mup

maximal update parametrization (µP)

Language: Jupyter Notebook · License: MIT · Stars: 1255 · Issues: 0

fm-cheatsheet

Website for hosting the Open Foundation Models Cheat Sheet.

Language: JavaScript · Stars: 251 · Issues: 0

Qwen2

Qwen2 is the large language model series developed by the Qwen team at Alibaba Cloud.

Language: Shell · Stars: 6498 · Issues: 0

outlines

Structured Text Generation

Language: Python · License: Apache-2.0 · Stars: 7325 · Issues: 0

OLMo

Modeling, training, eval, and inference code for OLMo

Language: Python · License: Apache-2.0 · Stars: 4248 · Issues: 0

MiniCPM

MiniCPM-2B: an on-device LLM that outperforms Llama2-13B.

Language: Python · License: Apache-2.0 · Stars: 4460 · Issues: 0

llama-moe

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training

Language: Python · License: Apache-2.0 · Stars: 810 · Issues: 0

DeepSeek-MoE

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Language: Python · License: MIT · Stars: 931 · Issues: 0

FollowBench

Code for "FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models (ACL 2024)"

Language: Python · License: Apache-2.0 · Stars: 62 · Issues: 0

Pai-Megatron-Patch

The official repository of Pai-Megatron-Patch, a toolkit for large-scale LLM & VLM training developed by Alibaba Cloud.

Language: Python · License: Apache-2.0 · Stars: 579 · Issues: 0

mergekit

Tools for merging pretrained large language models.

Language: Python · License: LGPL-3.0 · Stars: 4172 · Issues: 0

Yuan-2.0

Yuan 2.0 Large Language Model

Language: Python · License: NOASSERTION · Stars: 672 · Issues: 0

tensor_parallel

Automatically split your PyTorch models across multiple GPUs for training & inference

Language: Python · License: MIT · Stars: 604 · Issues: 0

opencompass

OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama 3, Mistral, InternLM2, GPT-4, LLaMA 2, Qwen, GLM, Claude, etc.) across 100+ datasets.

Language: Python · License: Apache-2.0 · Stars: 3392 · Issues: 0

LongQLoRA

LongQLoRA: Extend the Context Length of LLMs Efficiently

Language: Python · Stars: 152 · Issues: 0

DeepSeek-Coder

DeepSeek Coder: Let the Code Write Itself

Language: Python · License: MIT · Stars: 6170 · Issues: 0

Yi

A series of large language models trained from scratch by the developers at @01-ai

Language: Python · License: Apache-2.0 · Stars: 7506 · Issues: 0

ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Language: Python · License: Apache-2.0 · Stars: 32232 · Issues: 0

FasterTransformer

Transformer-related optimizations, including BERT and GPT

Language: C++ · License: Apache-2.0 · Stars: 5681 · Issues: 0