HUJA9

HUJA9's starred repositories

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language: Python | License: Apache-2.0 | Stargazers: 4381 | Issues: 0

Bag-of-Tricks-for-AT

Empirical tricks for training robust models (ICLR 2021)

Language: Python | License: Apache-2.0 | Stargazers: 248 | Issues: 0

aligner

Achieving Efficient Alignment through Learned Correction

Language: Python | Stargazers: 99 | Issues: 0

llama3-from-scratch

Llama 3 implemented one matrix multiplication at a time

Language: Jupyter Notebook | License: MIT | Stargazers: 12714 | Issues: 0
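In the spirit of this repo's "one matrix multiplication at a time" approach, a minimal self-contained sketch of single-head causal self-attention in NumPy (all shapes and weights below are made-up toy values, not taken from the repo):

```python
import numpy as np

# Toy single-head causal self-attention spelled out as explicit matmuls.
d_model, d_head, seq_len = 8, 4, 3
rng = np.random.default_rng(0)
x = rng.normal(size=(seq_len, d_model))      # token embeddings
w_q = rng.normal(size=(d_model, d_head))     # query projection
w_k = rng.normal(size=(d_model, d_head))     # key projection
w_v = rng.normal(size=(d_model, d_head))     # value projection

q, k, v = x @ w_q, x @ w_k, x @ w_v          # one matmul each
scores = (q @ k.T) / np.sqrt(d_head)         # scaled attention logits
scores += np.triu(np.full((seq_len, seq_len), -np.inf), k=1)  # causal mask
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)                # row-wise softmax
out = weights @ v                             # weighted sum of value vectors
print(out.shape)                              # (seq_len, d_head)
```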

lime

Lime: Explaining the predictions of any machine learning classifier

Language: JavaScript | License: BSD-2-Clause | Stargazers: 11466 | Issues: 0
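A minimal usage sketch for lime's tabular explainer (the classifier, dataset, and parameter values below are illustrative choices, not from the repo):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

# Any model exposing predict_proba works; a random forest on iris is just an example.
X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    X,
    feature_names=["sepal_len", "sepal_wid", "petal_len", "petal_wid"],
    class_names=["setosa", "versicolor", "virginica"],
    mode="classification",
)
exp = explainer.explain_instance(X[0], clf.predict_proba, num_features=4)
print(exp.as_list())  # local, per-feature contributions for this one prediction
```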

safe-rlhf

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Language: Python | License: Apache-2.0 | Stargazers: 1282 | Issues: 0

OpenBackdoor

An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)

Language: Python | License: Apache-2.0 | Stargazers: 145 | Issues: 0

PromptAttack

An LLM can Fool Itself: A Prompt-Based Adversarial Attack (ICLR 2024)

Language: Python | Stargazers: 38 | Issues: 0

llm-attacks

Universal and Transferable Adversarial Attacks on Aligned Language Models

Language: Python | License: MIT | Stargazers: 3205 | Issues: 0

ToolBench

[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.

Language: Python | License: Apache-2.0 | Stargazers: 4708 | Issues: 0

llama-recipes

Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default and custom datasets for applications such as summarization and Q&A, and a number of inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Meta Llama 3 for WhatsApp and Messenger.

Language: Jupyter Notebook | Stargazers: 11398 | Issues: 0
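For the PEFT side of the recipes above, a minimal LoRA sketch using Hugging Face transformers and peft (the model name and hyperparameters are placeholder assumptions, not the repo's defaults):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder model id; gated models such as Meta Llama 3 require access approval.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

# LoRA: freeze the base weights and train only low-rank adapters on attention projections.
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only a small fraction of parameters are trainable
```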

LLMs-Finetuning-Safety

We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.

Language: Python | License: MIT | Stargazers: 212 | Issues: 0
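For context, this is roughly the OpenAI fine-tuning API flow the description refers to, sketched with the current openai Python client (file name and model id are placeholders; this shows the mechanism only, not the paper's data):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload a JSONL file of chat-formatted training examples, then start a fine-tuning job.
train_file = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(training_file=train_file.id, model="gpt-3.5-turbo")
print(job.id, job.status)  # poll the job until it finishes, then use the resulting model
```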

LLM-Conversation-Safety

[NAACL2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey

Stargazers: 63 | Issues: 0

LM-exp

LLM experiments done during SERI MATS - focusing on activation steering / interpreting activation spaces

Language: Jupyter Notebook | Stargazers: 71 | Issues: 0

swa_gaussian

Code repo for "A Simple Baseline for Bayesian Uncertainty in Deep Learning"

Language: Jupyter Notebook | License: BSD-2-Clause | Stargazers: 437 | Issues: 0

transferlearning

Transfer learning / domain adaptation / domain generalization / multi-task learning, etc.: papers, code, datasets, applications, and tutorials (迁移学习, "transfer learning").

Language: Python | License: MIT | Stargazers: 13188 | Issues: 0

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | Stargazers: 36233 | Issues: 0
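A minimal client-side sketch for a model served with FastChat, which exposes an OpenAI-compatible API (the local address and model name below are assumptions about how the server was launched):

```python
from openai import OpenAI

# Assumes a FastChat OpenAI-compatible server is already running locally.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="vicuna-7b-v1.5",  # whatever model the server is hosting
    messages=[{"role": "user", "content": "Summarize RLHF in one sentence."}],
)
print(resp.choices[0].message.content)
```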

corr2cause

Data and code for the Corr2Cause paper (ICLR 2024)

Language: Python | License: MIT | Stargazers: 79 | Issues: 0

Causality4NLP_Papers

A reading list for papers on causality for natural language processing (NLP)

Stargazers: 480 | Issues: 0

causal-text-papers

Curated research at the intersection of causal inference and natural language processing.

Stargazers: 770 | Issues: 0

causalml

Uplift modeling and causal inference with machine learning algorithms

Language: Python | License: NOASSERTION | Stargazers: 4945 | Issues: 0
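A minimal average-treatment-effect sketch with causalml's meta-learners, following the pattern of its quickstart (the synthetic-data settings are arbitrary illustrative values):

```python
from causalml.dataset import synthetic_data
from causalml.inference.meta import LRSRegressor

# Simulated data with a binary treatment and continuous outcome.
y, X, treatment, _, _, _ = synthetic_data(mode=1, n=1000, p=5, sigma=1.0)

# S-learner with a linear regression base model.
learner = LRSRegressor()
ate, lower, upper = learner.estimate_ate(X, treatment, y)
print(ate, lower, upper)  # estimated average treatment effect with confidence bounds
```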

dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.

Language: Python | License: MIT | Stargazers: 6966 | Issues: 0
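A minimal DoWhy sketch of the model / identify / estimate workflow, using its built-in simulated dataset (the estimator choice and dataset parameters are illustrative):

```python
import dowhy.datasets
from dowhy import CausalModel

# Simulated linear data with a known ground-truth effect (beta).
data = dowhy.datasets.linear_dataset(
    beta=10, num_common_causes=5, num_instruments=2,
    num_samples=5000, treatment_is_binary=True,
)

model = CausalModel(
    data=data["df"],
    treatment=data["treatment_name"],
    outcome=data["outcome_name"],
    graph=data["gml_graph"],
)
identified = model.identify_effect()  # makes the causal assumptions explicit
estimate = model.estimate_effect(
    identified, method_name="backdoor.propensity_score_matching"
)
print(estimate.value)  # should be close to the ground-truth beta
```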

Awesome-LLM-Interpretability

A curated list of LLM interpretability material: tutorials, libraries, surveys, papers, blogs, etc.

Stargazers: 99 | Issues: 0

DoLa

Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"

Language: Python | Stargazers: 391 | Issues: 0
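A schematic of the layer-contrasting idea behind DoLa, not the official implementation (which also selects the premature layer dynamically and applies a plausibility constraint); the tensors here are random stand-ins for per-layer logits:

```python
import torch
import torch.nn.functional as F

def contrast_layers(final_logits: torch.Tensor, early_logits: torch.Tensor) -> torch.Tensor:
    """Score tokens by how much their log-probability grows between an early
    ("premature") layer and the final ("mature") layer."""
    mature = F.log_softmax(final_logits, dim=-1)
    premature = F.log_softmax(early_logits, dim=-1)
    return mature - premature

vocab_size = 10
final_logits = torch.randn(vocab_size)   # stand-in for the last layer's logits
early_logits = torch.randn(vocab_size)   # stand-in for an earlier layer's logits
next_token = contrast_layers(final_logits, early_logits).argmax().item()
print(next_token)
```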

DecodingTrust

A Comprehensive Assessment of Trustworthiness in GPT Models

Language: Python | License: CC-BY-SA-4.0 | Stargazers: 240 | Issues: 0

TrustLLM

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

Language: Python | License: MIT | Stargazers: 397 | Issues: 0

awesome-instruction-dataset

A collection of open-source datasets for training instruction-following LLMs (ChatGPT, LLaMA, Alpaca)

Stargazers: 1057 | Issues: 0

natural-instructions

Expanding natural instructions

Language: Python | License: Apache-2.0 | Stargazers: 934 | Issues: 0