Bibek's repositories
llm-latent-language
Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers".
-Learning-Interpretability-Tool
The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible, framework-agnostic interface.
awesome-hallucination-detection
List of papers on hallucination detection in LLMs.
Awesome-LLMs-Evaluation-Papers
The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.
character_AI_open
Generate multi-round role-play conversation data based on self-instruct, covering about 1k different personalities and their conversations.
ChatGPT-Jailbreaking
ChatGPT jailbreaking and vulnerabilities when prompted in multiple languages.
ecco
Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT-2, BERT, RoBERTa, T5, and T0).
exllamav2
A fast inference library for running LLMs locally on modern consumer-class GPUs
llama-viz
The attention map viewer for LLaMA models.
LLM-Conversation-Safety
[NAACL2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
LM-experiment
LLM experiments done during SERI MATS - focusing on activation steering / interpreting activation spaces
mlx-examples
Examples in the MLX framework
multilingual-safety-for-LLMs
Data for "Multilingual Jailbreak Challenges in Large Language Models"
ongdb-graph-db
ONgDB is an independent fork of Neo4j® Enterprise Edition version 3.4.0.rc02 licensed under AGPLv3 and/or Community Edition licensed under GPLv3
promptbench
A unified evaluation framework for large language models
pytorch-llama
LLaMA 2 implemented from scratch in PyTorch
Reverse-Engineering-Tools-for-Large-Language-Models
RevLLM -- Reverse Engineering Tools for Large Language Models
sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
TransformerLens
A library for mechanistic interpretability of GPT-style language models
uncertain_ground_truth_ddx_dermatology
Dermatology ddx dataset, JAX implementations of Monte Carlo conformal prediction, plausibility regions, and statistical annotation aggregation from our recent work on uncertain ground truth (TMLR'23 and arXiv pre-print).
wordset-dictionary
The Open Source Dictionary
yet-another-applied-llm-benchmark
A benchmark to evaluate language models on questions I've previously asked them to solve.