Yingfei (Jeremy) Xiang's repositories
Cherry_LLM
[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models
deep-language-networks
We view Large Language Models as stochastic language layers in a network, where the learnable parameters are the natural language prompts at each layer. We stack two such layers, feeding the output of one layer to the next. We call the stacked architecture a Deep Language Network (DLN).
devika
Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. Devika aims to be a competitive open-source alternative to Devin by Cognition AI.
faiss_tips
Some useful tips for faiss
leetcode-hard-gym
A hard gym for programming
List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words
List of Dirty, Naughty, Obscene, and Otherwise Bad Words
LLM-Shearing
Preprint: Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
LLMTest_NeedleInAHaystack
Simple retrieval from LLMs at various context lengths to measure accuracy
PentestGPT
A GPT-empowered penetration testing tool
sentence-transformers
Multilingual Sentence & Image Embeddings with BERT
small-LMs-Task-Planning
Can only LLMs do Reasoning?: Potential of Small Language Models in Task Planning
SupContrast
PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)