SWE-bench's repositories
experiments
Open-sourced predictions, execution logs, trajectories, and results from model inference and evaluation runs on the SWE-bench task.
swe-bench.github.io
Landing page + leaderboard for the SWE-bench benchmark
humanevalfix-results
Evaluation data + results for SWE-agent inference on the HumanEvalFix task