RuanVisser's starred repositories
jailbreak_llms
[CCS'24] A dataset of 15,140 ChatGPT prompts collected from Reddit, Discord, websites, and open-source datasets (including 1,405 jailbreak prompts).
RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embeddings.
BERT-related-papers
BERT-related papers
socialreaper
Social media scraping / data collection library for Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs
Indic-BERT-v1
Indic-BERT-v1: BERT-based multilingual model for 11 Indic languages and Indian English. For the latest Indic-BERT v2, see: https://github.com/AI4Bharat/IndicBERT
transformer-alignment
Code for the EMNLP 2020 paper "Accurate Word Alignment Induction from Neural Machine Translation"
the-algorithm
Source code for Twitter's Recommendation Algorithm
the-algorithm-ml
Source code for Twitter's Recommendation Algorithm