Jinfeng Li's starred repositories
exaggerated-safety
Röttger et al. (2023): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models"
langkit
🔍 LangKit: An open-source toolkit for monitoring Large Language Models (LLMs). 📚 Extracts signals from prompts & responses, ensuring safety & security. 🛡️ Features include text quality, relevance metrics, & sentiment analysis. 📊 A comprehensive tool for LLM observability. 👀
TextualAdversarialAttack-Tianchi
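Tianchi competition, Security AI Challenger Program Phase 3: adversarial attacks on text classification. Online ranking 12/1175, plus the "Best Creative Idea Award"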
nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
LLM-Multistep-Jailbreak
Code for Findings-EMNLP 2023 paper: Multi-step Jailbreaking Privacy Attacks on ChatGPT
jailbreak_llms
[CCS'24] A dataset consisting of 15,140 ChatGPT prompts from Reddit, Discord, websites, and open-source datasets (including 1,405 jailbreak prompts).
BERT-Multitask-learning
Multitask learning with a BERT backbone. Makes it easy to train a BERT model with state-of-the-art methods such as PCGrad, Gradient Vaccine, PALs, scheduling, class imbalance handling, and many other optimizations
transformers_tasks
⭐️ NLP algorithms with the transformers library. Supports text classification, text generation, information extraction, text matching, RLHF, SFT, etc.
pytorch-nlp-multitask
A simple project that trains 3 separate NLP tasks simultaneously using multitask learning
bert-multitask-learning
BERT for Multitask Learning
opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
LLMDataHub
A quick guide (especially) for trending instruction finetuning datasets
document-level-classification
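Very long text classification (more than 1,000 characters); document-level/discourse-level text classification; mainly addresses the long-range dependency problem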
python-doc
Translates Python documentation into Chinese for convenient reference. In short, this repository stores Python documents and does its best to translate them into Chinese
LLMsPracticalGuide
A curated list of practical guide resources for LLMs (LLMs Tree, Examples, Papers)
auto-redteam
Redteaming LLMs using other LLMs