JechLee's starred repositories
Cherry_LLM
[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models
NeMo-Aligner
Scalable toolkit for efficient model alignment
Megatron-LM
Ongoing research training transformer models at scale
Awesome-Chinese-LLM
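A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are inexpensive to train, covering base models, domain-specific fine-tuning and applications, datasets, and tutorials.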
open-webui
User-friendly WebUI for LLMs (Formerly Ollama WebUI)
LLaMA-Factory
A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
LeetCode021
🚀 LeetCode From Zero To One: curated problem lists, shared solutions, algorithm templates, and a practice roadmap, continuously updated...
transformers_tasks
⭐️ NLP Algorithms with transformers lib. Supporting Text-Classification, Text-Generation, Information-Extraction, Text-Matching, RLHF, SFT etc.
DeepLearing-Interview-Awesome-2024
A collection of interview questions and answers for AIGC, CV, and LLM interviews, which also includes new ideas, problems, resources, and projects encountered in work and research
awesome_LLMs_interview_notes
LLMs interview notes and answers: this repository mainly records interview questions and reference answers for large language model (LLM) algorithm engineers
LiveSum-TTT
Codes and Datasets for the Paper: Text-Tuple-Table: Towards Information Integration in Text-to-Table Generation via Global Tuple Extraction
LLM-data-aug-survey
The official GitHub page for the survey paper "A Survey on Data Augmentation in Large Model Era"
SALAD-BENCH
[ACL 2024] SALAD benchmark & MD-Judge
lost-in-the-middle
Code and data for "Lost in the Middle: How Language Models Use Long Contexts"
DecodingTrust
A Comprehensive Assessment of Trustworthiness in GPT Models
red-instruct
Codes and datasets of the paper Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment
LLMs-Finetuning-Safety
We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.
Safety-Evaluating
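This paper proposes a safety evaluation benchmark for LLMs based on "文心一言" (ERNIE Bot), covering 8 typical safety scenarios and 6 types of instruction attacks. It also proposes a framework and process for safety evaluation, using test prompts that are manually written or collected from open-source data, with human intervention combined with the strong evaluation ability of LLMs acting as a "co-evaluator".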
CipherChat
A framework to evaluate the generalization capability of safety alignment for LLMs