Xiaonan Li's starred repositories
stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
DeepSeek-Coder
DeepSeek Coder: Let the Code Write Itself
FlagEmbedding
Retrieval and Retrieval-augmented LLMs
RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
chatgpt-prompts-for-academic-writing
This list of writing prompts covers a range of topics and tasks, including brainstorming research ideas, improving language and style, conducting literature reviews, and developing research plans.
AgentBench
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
long_llama
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
awesome_LLMs_interview_notes
LLM interview notes and answers: this repository mainly collects interview questions and reference answers for large language model (LLM) algorithm engineers.
awesome-language-agents
List of language agents based on the paper "Cognitive Architectures for Language Agents"
trainable-agents
Code and datasets for "Character-LLM: A Trainable Agent for Role-Playing"
LawCrimeMining
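Law and crime mining based on corpus construction and content analysis with NLP methods. A text-mining project for court judgments and criminal case texts, built on domain corpus construction and NLP methods.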
Everything-about-LLMs
A work in progress. Trying to write about all interesting or necessary pieces in the current development of LLMs and generative AI. Gradually adding more topics.
code-indexer-loop
Code Indexer Loop is a Python library for indexing and retrieving source code files through an integrated vector database that's continuously and efficiently updated.
kgi-slot-filling
This is the code for our KILT leaderboard submissions (KGI + Re2G models).
EnvInteractiveLMPapers
A collection of papers on methods that use language to interact with environments, including the real world, simulated worlds, and the WWW (🏄).