koalazf99

Shanghai Jiao Tong University

Shanghai

koalazf99.github.io

Organizations

OpenLemur

Fan's starred repositories

LLM101n

LLM101n: Let's build a Storyteller

unsloth

Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Language: Python | License: Apache-2.0 | 12347 87 569

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | 11892 103 864

trafilatura

Python & command-line tool to gather text on the Web: Crawling & scraping, content extraction, metadata. TXT, Markdown, CSV & XML output.

Language: Python | License: Apache-2.0 | 3189 30 336

mistral-finetune

Language: Python | License: Apache-2.0 | 2444 31 23

gptpdf

Using GPT to parse PDF

Language: Python | 1972 9 14

magentic

Seamlessly integrate LLMs as Python functions

Language: Python | License: MIT | 1810 13 58

OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Language: Python | License: Apache-2.0 | 1720 22 176

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language: Python | License: Apache-2.0 | 1445 17 14

DeepSeek-Coder-V2

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Phi-3CookBook

This is a Phi-3 book for getting started with Phi-3. Phi-3, a family of open AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and next size up across a variety of language, reasoning, coding, and math benchmarks.

Language: Jupyter Notebook | License: MIT | 1223 12 22

aqt

Language: Python | License: Apache-2.0 | 216 6 25

magpie

Official repository for "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing"

Language: Python | License: MIT | 168 5 11

anole

Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation

Language: Python | 15500

dclm

DataComp for Language Models

Language: HTML | License: MIT | 153 26 11

vscode-lean4

Visual Studio Code extension for the Lean 4 proof assistant

Language: TypeScript | License: Apache-2.0 | 137 13 193

bigcodebench

BigCodeBench: The Next Generation of HumanEval

Language: Python | License: Apache-2.0 | 116 5 9

OpenWebMath

Language: XSLT | License: Apache-2.0 | 95 3 2

DL4TP

A Survey on Deep Learning for Theorem Proving

License: MIT | 89 50

OlympicArena

This is the official repository of the paper "OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI"

Language: JavaScript | 6900

schedules-and-scaling

Language: Python | License: MIT | 39 40

k2-train

Language: Python | License: Apache-2.0 | 2900

regmix

[arXiv 2024] RegMix: Data Mixture as Regression for Language Model Pre-training

Language: Jupyter Notebook | License: MIT | 2800

agent-attack

[Arxiv 2024] Adversarial Attacks on Multimodal Agents

Language: Python | License: MIT | 2400

MoPS

[ACL 2024] Code for "MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation"

Language: Jupyter Notebook | 1900

Spider2-V

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

Language: Jupyter Notebook | License: Apache-2.0 | 18 20

eval-arena

Language: Python | 15 20

remiss-jailbreak

Language: Python | 13 10

tpu_pod_commander

TPU pod commander is a package for managing and launching jobs on Google Cloud TPU pods.

Language: Python | License: Apache-2.0 | 800

Awesome-DataCentric-LLM

trending projects & awesome papers about data-centric llm studies.

700