SparkJiao

NTU-NLP & I2R, A*STAR, Singapore

Singapore

jiaofangkai.com

Fangkai Jiao's starred repositories

grok-1

Grok open release

Language: Python | Apache-2.0 | 49195 561 202

OpenDevin

🐚 OpenDevin: Code Less, Make More

Language: Python | MIT | 28872 276 1189

dspy

DSPy: The framework for programming—not prompting—foundation models

Language: Python | MIT | 14762 129 611

Qwen2

Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.

Language: Shell | 6484 40 678

gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Language: Python | BSD-3-Clause | 5386 64 96

gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Language: Python | Apache-2.0 | 5189 38 37

OLMo

Modeling, training, eval, and inference code for OLMo

Language: Python | Apache-2.0 | 4246 42 175

mergekit

Tools for merging pretrained large language models.

Language: Python | LGPL-3.0 | 4164 47 261

transformer-debugger

Language: Python | MIT | 3983 26 14

code_contests

Language: C++ | Apache-2.0 | 2037 39 35

DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Language: Python | MIT | 1896 18 43

SWE-bench

[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world Github Issues?

Language: Python | MIT | 1523 22 115

GaLore

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Language: Python | Apache-2.0 | 1280 17 46

nanotron

Minimalistic large language model 3D-parallelism training

Language: Python | Apache-2.0 | 995 40 66

DeepSeek-MoE

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Language: Python | MIT | 930 15 35

ao

Custom data types and layouts for training and inference

Language: Python | BSD-3-Clause | 434 25 89

apps

APPS: Automated Programming Progress Standard (NeurIPS 2021)

Language: Python | MIT | 377 13 27

ChunkLlama

[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"

Language: Python | Apache-2.0 | 301 7 20

zero-bubble-pipeline-parallelism

Zero Bubble Pipeline Parallelism

Language: Python | NOASSERTION | 231 6 23

HallusionBench

[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

Language: Python | BSD-3-Clause | 205 4 11

DenseSSM

A repository for DenseSSMs

Language: Python | 85 3 2

RefGPT

Language: Python | Apache-2.0 | 84 2 3

Inflection-Benchmarks

Public Inflection Benchmarks

MIT | 67 6 4

llm-planning-eval

Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"

Language: Python | 43 4 2

LLMSanitize

An open-source library for contamination detection in NLP datasets and Large Language Models (LLMs).

Language: Python | 36 3 2

dpo-trajectory-reasoning

Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".

Language: Python | 19 20

SeaEval

NAACL 2024: SeaEval for Multilingual Foundation Models: From Cross-Lingual Alignment to Cultural Reasoning

Language: Python | NOASSERTION | 1802

SimulateBench

GPT as Human

Language: Python | 17 2 4

RLMEC

The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"

Language: Python | 8 2 4

UNK-VQA

A VQA dataset that includes unanswerable questions.

Apache-2.0 | 1 10