Chong Chen's starred repositories
Chinese-CLIP
A Chinese version of CLIP that supports Chinese cross-modal retrieval and representation generation.
Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-tuning) for easy use. We welcome open-source enthusiasts to initiate any meaningful PR on this repo and integrate as many LLM-related technologies as possible. We built a fine-tuning platform that makes it easy for researchers to get started with and use large models, and we welcome open-source enthusiasts to submit any meaningful PR!
ce_pretrain
Pre-training a mixed Chinese-English BERT model
zuowen-dataset-pt1
:paper: Essay dataset - Part 1
haystack-search-engine
A Semantic Search Engine Built on Arxiv dataset from Kaggle.
haystack
:mag: LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
contriever
Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning
Fengshenbang-LM
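Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center at the IDEA Research Institute, intended to serve as infrastructure for Chinese AIGC and cognitive intelligence.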
awesome-machine-unlearning
Awesome Machine Unlearning (A Survey of Machine Unlearning)
PromptPapers
Must-read papers on prompt-based tuning for pre-trained language models.