Xuanqi Gao's starred repositories
chinese-poetry
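The most comprehensive database of Chinese poetry 🧶 It covers nearly 14,000 poets of the Tang and Song dynasties, with close to 55,000 Tang poems and 260,000 Song poems, plus 21,050 ci (lyric poems) by 1,564 ci poets of the Northern and Southern Song.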
deeplearning-models
A collection of various deep learning architectures, models, and tips
ChatGLM2-6B
ChatGLM2-6B: An Open Bilingual Chat LLM
chatgpt_system_prompt
A collection of GPT system prompts and various prompt injection/leaking knowledge.
chinese-chatbot-corpus
A public Chinese chat corpus
covid-chestxray-dataset
We are building an open database of COVID-19 cases with chest X-ray or CT images.
MT-Reading-List
A machine translation reading list maintained by Tsinghua Natural Language Processing Group
scattertext
Beautiful visualizations of how language differs among document types.
ABigSurvey
A collection of 1000+ survey papers on Natural Language Processing (NLP) and Machine Learning (ML).
Awesome-Code-LLM
A curated list of language modeling research for code and related datasets.
MultiTurnResponseSelection
This repo contains the data and source code for our ACL 2017 paper.
Awesome-LM-SSP
A reading list for large model safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
covid-19-open-data
Datasets of daily time-series data related to COVID-19 for over 20,000 distinct locations around the world.
gptstore-prompts
Here are the Top 100 prompts on GPTStore, which we can use to learn and improve prompt engineering.
diffusion-nlp-paper-arxiv
Automatically fetches diffusion NLP papers from arXiv. More information on the papers can be found in another repository, "Diffusion_NLP_Papers".
Open-Source-Quartz-Solar-Forecast
Open Source Solar Site Level Forecast
faststylometry
Stylometry library for Burrows' Delta method
Neural_Network_Pruning
Implementations of different neural network pruning techniques
translation_bias
Data and notebook for experiments in Hovy et al. (2020) https://www.aclweb.org/anthology/2020.acl-main.154.pdf