Fangkai Jiao (SparkJiao)

Company: NTU-NLP & I2R, A*STAR, Singapore

Location: Singapore

Home Page: jiaofangkai.com

Fangkai Jiao's starred repositories

grok-1

Grok open release

Language: Python | License: Apache-2.0 | Stargazers: 49195 | Issues: 561 | Issues: 202

OpenDevin

šŸš OpenDevin: Code Less, Make More

Language: Python | License: MIT | Stargazers: 28872 | Issues: 276 | Issues: 1189

dspy

DSPy: The framework for programming, not prompting, foundation models

Language: Python | License: MIT | Stargazers: 14762 | Issues: 129 | Issues: 611

Qwen2

Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.

gpt-fast

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Language: Python | License: BSD-3-Clause | Stargazers: 5386 | Issues: 64 | Issues: 96

gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Language: Python | License: Apache-2.0 | Stargazers: 5189 | Issues: 38 | Issues: 37

OLMo

Modeling, training, eval, and inference code for OLMo

Language: Python | License: Apache-2.0 | Stargazers: 4246 | Issues: 42 | Issues: 175

mergekit

Tools for merging pretrained large language models.

Language: Python | License: LGPL-3.0 | Stargazers: 4164 | Issues: 47 | Issues: 261

DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Language: Python | License: MIT | Stargazers: 1896 | Issues: 18 | Issues: 43

SWE-bench

[ICLR 2024] SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

Language: Python | License: MIT | Stargazers: 1523 | Issues: 22 | Issues: 115

GaLore

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Language: Python | License: Apache-2.0 | Stargazers: 1280 | Issues: 17 | Issues: 46

nanotron

Minimalistic large language model 3D-parallelism training

Language: Python | License: Apache-2.0 | Stargazers: 995 | Issues: 40 | Issues: 66

DeepSeek-MoE

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Language: Python | License: MIT | Stargazers: 930 | Issues: 15 | Issues: 35

ao

Custom data types and layouts for training and inference

Language: Python | License: BSD-3-Clause | Stargazers: 434 | Issues: 25 | Issues: 89

apps

APPS: Automated Programming Progress Standard (NeurIPS 2021)

Language: Python | License: MIT | Stargazers: 377 | Issues: 13 | Issues: 27

ChunkLlama

[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"

Language: Python | License: Apache-2.0 | Stargazers: 301 | Issues: 7 | Issues: 20

zero-bubble-pipeline-parallelism

Zero Bubble Pipeline Parallelism

Language: Python | License: NOASSERTION | Stargazers: 231 | Issues: 6 | Issues: 23

HallusionBench

[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

Language: Python | License: BSD-3-Clause | Stargazers: 205 | Issues: 4 | Issues: 11

DenseSSM

A repository for DenseSSMs

Language: Python | License: Apache-2.0 | Stargazers: 84 | Issues: 2 | Issues: 3

Inflection-Benchmarks

Public Inflection Benchmarks

llm-planning-eval

Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"

LLMSanitize

An open-source library for contamination detection in NLP datasets and Large Language Models (LLMs).

dpo-trajectory-reasoning

Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".

Language: Python | Stargazers: 19 | Issues: 2 | Issues: 0

SeaEval

NAACL 2024: SeaEval for Multilingual Foundation Models: From Cross-Lingual Alignment to Cultural Reasoning

Language: Python | License: NOASSERTION | Stargazers: 18 | Issues: 0 | Issues: 2

RLMEC

The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"

UNK-VQA

A VQA dataset that includes unanswerable questions.

License: Apache-2.0 | Stargazers: 1 | Issues: 1 | Issues: 0