xiaokc's starred repositories
stable-diffusion
A latent text-to-image diffusion model
Awesome-Chinese-LLM
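A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are inexpensive to train. Covers base models, domain-specific fine-tuning and applications, datasets, and tutorials.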
nlp_course
YSDA course in Natural Language Processing
nlp_chinese_corpus
Large Scale Chinese Corpus for NLP
CLUEDatasetSearch
Search all Chinese NLP datasets, with commonly used English NLP datasets included
llm-attacks
Universal and Transferable Attacks on Aligned Language Models
DeepClustering
Methods and Implementations of Deep Clustering
baby-llama2-chinese
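A repository for pretraining from scratch and then SFT-ing a small-parameter Chinese LLaMa2; it runs on a single 24G GPU and yields a chat-llama2 capable of simple Chinese question answering.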
data-augmentation-review
List of useful data augmentation resources. Here you will find some uncommon techniques, libraries, links to GitHub repos, papers, and more.
OpenNMT-tf
Neural machine translation and sequence learning using TensorFlow
Awesome-LLM-Safety
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights into the safety implications, challenges, and advancements surrounding these powerful models.
Safety-Prompts
Chinese safety prompts for evaluating and improving the safety of LLMs.
DataAug4NLP
Collection of papers and resources for data augmentation for NLP.
Prompt-BERT
PromptBERT: Improving BERT Sentence Embeddings with Prompts
Contrastive-Clustering
Code for the paper "Contrastive Clustering" (AAAI 2021)
siamese-pytorch
PyTorch implementation of Siamese Networks for one-shot image learning, trained and tested on the Omniglot dataset
awesome-neural-adaptation-in-NLP
Awesome Neural Adaptation in Natural Language Processing. A curated list. https://arxiv.org/abs/2006.00632
COLDataset
The official repository of the paper: COLD: A Benchmark for Chinese Offensive Language Detection
SafeDecoding
Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
KeywordProcesser
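A simple trie implemented in Python that supports adding, looking up, and deleting keywords, for keyword matching, stop-word removal, and similar tasks on Chinese text.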
vae_for_text
TensorFlow implementation of "Generating Sentences from a Continuous Space"