Yingfei(Jeremy) Xiang's repositories
Cherry_LLM
[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models
data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷为大模型提供更高质量、更丰富、更易”消化“的数据!
devika
Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. Devika aims to be a competitive open-source alternative to Devin by Cognition AI.
EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
leetcode-hard-gym
A hard gym for programming
List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words
List of Dirty, Naughty, Obscene, and Otherwise Bad Words
LLMTest_NeedleInAHaystack
Doing simple retrieval from LLM models at various context lengths to measure accuracy
PentestGPT
A GPT-empowered penetration testing tool
simple-one-api
OpenAI 接口接入适配,支持千帆大模型平台、讯飞星火大模型、腾讯混元以及MiniMax、Deep-Seek,等兼容OpenAI接口,仅单可执行文件,配置超级简单,一键部署,开箱即用.
small-LMs-Task-Planning
Can only LLMs do Reasoning?: Potential of Small Language Models in Task Planning