princeton-nlp / SWE-agent

SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It solves 12.47% of bugs in the SWE-bench evaluation set and takes just 1 minute to run.

https://princeton-nlp.github.io/SWE-agent/

Catch errors/warnings in test executions

klieret opened this issue 2 months ago

Kilian Lieret commented 2 months ago

We should check that no warning, error, or critical messages appear during the execution of any of the tests.