apartresearch

Artificial intelligence will change the world. Our mission is to ensure this happens safely and to the benefit of everyone.

Home Page: https://apartresearch.com

Twitter: @apartresearch

apartresearch's repositories

interpretability-starter

🧠 Starter templates for doing interpretability research

Stargazers: 53 | Issues: 0 | Issues: 0

specificityplus

👩‍💻 Code for the ACL paper "Detecting Edit Failures in LLMs: An Improved Specificity Benchmark"

Language: Python | License: NOASSERTION | Stargazers: 18 | Issues: 2 | Issues: 24

Neuron2Graph

Tools for exploring Transformer neuron behaviour, including input pruning and diversification.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 16 | Issues: 2 | Issues: 1

readingwhatwecan

📚📚📚📚📚📚📚📚📚 Reading everything

Language: CSS | Stargazers: 12 | Issues: 0 | Issues: 0

Integer_Addition

✱ Understanding the underlying learning dynamics of simple tasks in Transformer networks

Language: Jupyter Notebook | License: MIT | Stargazers: 10 | Issues: 3 | Issues: 1

aisafetyideas

💡 The web app CI/CD for aisafetyideas.com

Language: Svelte | Stargazers: 8 | Issues: 0 | Issues: 35

deepdecipher

🦠 DeepDecipher: An open source API to MLP neurons

Language: Rust | License: MIT | Stargazers: 8 | Issues: 1 | Issues: 101

evaluations-starter

How to get started in evaluations and demonstrations research for dangerous capabilities

ai-psychology-starter

Code templates to get started as an AI psychologist

Language: Jupyter Notebook | Stargazers: 4 | Issues: 1 | Issues: 0

mechanisticinterpretability

A repository for awesome resources in mechanistic interpretability

Stargazers: 3 | Issues: 0 | Issues: 0

AIS-cost-effectiveness

Cost-effectiveness models, tools, and results for various AI safety field-building programs.

Language: Python | License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0

othelloscope

Interpretability Hackathon 2.0 entry

Language: Jupyter Notebook | License: MIT | Stargazers: 2 | Issues: 0 | Issues: 3

scheduling-widget

📆 Showcases specific times in local time zones

Language: HTML | Stargazers: 2 | Issues: 1 | Issues: 0

blackbox-psych

Conducting psychology experiments on black box language models

Language: HTML | Stargazers: 1 | Issues: 1 | Issues: 0

empathetic-ai

🤖 A systematic review on how to create empathetic AI

Language: TeX | Stargazers: 1 | Issues: 2 | Issues: 0

ICML2024MI

🌍 Website for NeurIPS2023MI

Language: CSS | Stargazers: 1 | Issues: 2 | Issues: 0

n2g

Tools for exploring Transformer neuron behaviour, including input pruning and diversification.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0

safety-timelines

📈 Research into when alignment is solved

Language: R | Stargazers: 1 | Issues: 1 | Issues: 0

scale-llm-24

🌍 Website for the Scaling Laws workshop

Language: CSS | Stargazers: 1 | Issues: 0 | Issues: 0

seqcont_circuits

✱ Interpreting how similar sequence continuation tasks share internal representations ✱

Language: Jupyter Notebook | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

task-standard

🚨 METR Task Standard fork for the Code Red Hackathon

Language: TypeScript | Stargazers: 1 | Issues: 0 | Issues: 0

GPT-4-Chat-UI

GPT-4 frontend with open source Next.js template.

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

hackathon-utils

😎 Code to run hackathons efficiently

License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

Interpreting-Reward-Models

✱ Interpreting implicit reward models learnt in RLHF using sparse autoencoders.

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 7

open

🌍 Repository to update our open data

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

paper-website

🌍 Website template for academic papers

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

town_hall_avatar

Uses ChatGPT to simulate a townhall discussion between avatars

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0