Men Tianyi's repositories
abstract-state-seqmodel
Code for EMNLP 2023 paper "Emergence of Abstract State Representations in Embodied Sequence Modeling"
agent-attack
[Arxiv 2024] Adversarial Attacks on Multimodal Agents
Agent-Smith
[ICML2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
alignment-handbook
Robust recipes to align language models with human and AI preferences
AutoDroid
Source code for the paper "Empowering LLM to use Smartphone for Intelligent Task Automation"
babyai
BabyAI platform. A testbed for training agents to understand and execute language commands.
CogVLM
A state-of-the-art open visual language model | Multimodal pretrained model
gpt_academic
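Provides a graphical interface for ChatGPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other codebases; PDF/LaTeX paper translation and summarization; parallel querying of multiple LLMs; and local models such as Tsinghua's chatglm2. Compatible with Fudan MOSS, llama, rwkv, newbing, claude, claude2, and others.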
gym-cooking
gym-cooking: Code for "Too many cooks: Bayesian inference for coordinating multi-agent collaboration", Winner of the CogSci 2020 Computational Modeling Prize in High Cognition, and a NeurIPS 2020 CoopAI Workshop Best Paper.
LLaMA-Factory
Unify Efficient Fine-Tuning of 100+ LLMs
R-Judge
R-Judge: Benchmarking Safety Risk Awareness for LLM Agents
SmartPlay
SmartPlay is a benchmark for Large Language Models (LLMs) that uses a variety of games to test important LLM capabilities as agents. SmartPlay is designed to be easy to use and to support future development of LLMs.
Synapse
Trajectory-as-Exemplar Prompting with Memory for Computer Control
gym
A toolkit for developing and comparing reinforcement learning algorithms.
gym-minigrid
Minimalistic gridworld package for OpenAI Gym
label-words-are-anchors
Repository for Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning
llm-reasoners
A library for advanced large language model reasoning
llm-transparency-tool
LLM Transparency Tool (LLM-TT), an open-source interactive toolkit for analyzing internal workings of Transformer-based language models. *Check out demo at* https://huggingface.co/spaces/facebook/llm-transparency-tool-demo
lm-arithmetic
Code for the paper "A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis"
othello_world
Emergent world representations: Exploring a sequence model trained on a synthetic task
pyvene
Stanford NLP Python Library for Understanding and Improving PyTorch Models via Interventions
toolbench
ToolBench, an evaluation suite for LLM tool manipulation capabilities.
ToRA
ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting with tools.
tree-of-thought-llm
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
webarena
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
WebShop
[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents