ricklamers / gpt-code-ui

An open source implementation of OpenAI's ChatGPT Code Interpreter

Improving resilience to inaccurate code generation

alextrzyna opened this issue · comments

⚠️ Please check that this feature request hasn't been suggested before.

  • I searched previous Ideas in Discussions and didn't find any similar feature requests.
  • I searched previous Issues and didn't find any similar feature requests.

πŸ”– Feature description

First of all, really cool project! I found gpt-code-ui when looking for an alternative to Code Interpreter/Noteable that I could run locally.

I have noticed that gpt-code-ui is not quite as resilient to mistakes in its own generated code as something like ChatGPT + the Noteable plugin. For example, if gpt-code-ui makes a mistaken assumption about the name of a dataframe row in the code it generates, execution fails and it gives up, whereas ChatGPT with Noteable is more likely to proactively inspect the results and attempt a fix.

βœ”οΈ Solution

Instead of just outputting the errors associated with a failed execution, proactively inspect the error and attempt a fix/re-run.
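
A rough sketch of what such an inspect-and-retry loop could look like, assuming the OpenAI Python client (>= 1.0); the helper names, model string, prompt wording, and retry budget are all illustrative, not gpt-code-ui's actual internals:

```python
import subprocess
import sys

from openai import OpenAI

client = OpenAI()
MAX_FIX_ATTEMPTS = 2  # repair rounds to allow before giving up

def ask_model(prompt: str) -> str:
    """Ask the model for Python code; returns the raw completion text
    (in practice this may need markdown-fence stripping)."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def run_code(code: str) -> subprocess.CompletedProcess:
    """Run the generated code in a subprocess, capturing stdout/stderr."""
    return subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )

def run_with_self_repair(request: str) -> str:
    code = ask_model(request)
    result = run_code(code)
    for _ in range(MAX_FIX_ATTEMPTS):
        if result.returncode == 0:
            break
        # Instead of just surfacing the error, feed the traceback back to
        # the model and ask for a corrected version of the code.
        code = ask_model(
            f"This code failed:\n{code}\n\nError output:\n{result.stderr}\n"
            "Return only a corrected version of the code, no explanation."
        )
        result = run_code(code)
    return result.stdout if result.returncode == 0 else result.stderr
```

Capping the number of repair rounds keeps a persistently failing snippet from looping forever.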

❓ Alternatives

No response

πŸ“ Additional Context

No response

Acknowledgements

  • My issue title is concise, descriptive, and in title casing.
  • I have searched the existing issues to make sure this feature has not been requested yet.
  • I have provided enough information for the maintainers to understand and evaluate this request.

Agree there is room for improvement in retrying/feeding error messages back into the model. Inviting the community to contribute PRs – it’s out of scope for what I wanted to build personally.

Maybe GPT-5 is good enough not to hallucinate variable names? 😄

Although OpenAI has now made Code Interpreter available to all Plus users, the project is still very cool. One question: is it as powerful as the official plugin?

What kind of work would need to be done to run this on, say, a local LLM with ooba (which has an OpenAI-compatible API)?
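
If ooba's extension really does expose an OpenAI-compatible endpoint, the core change may be as small as pointing the client at it; the URL, API key, and model name below are assumptions that depend on the local setup, and prompt formatting/code extraction would likely still need tuning for the local model:

```python
from openai import OpenAI

# Assumed local endpoint exposed by the OpenAI-compatible API extension;
# adjust host/port to match your setup.
client = OpenAI(
    base_url="http://127.0.0.1:5000/v1",
    api_key="not-needed-locally",  # most local servers ignore the key
)

resp = client.chat.completions.create(
    model="local-model",  # whatever model the local server has loaded
    messages=[{"role": "user", "content": "Write Python that prints 'hello'."}],
)
print(resp.choices[0].message.content)
```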

Working on this one.
@ricklamers: I am preparing a pull request with a rough idea in https://github.com/dasmy/gpt-code-ui/tree/dev/conversation_history. Then we can discuss if and how my approach fits into the overall picture.

@dasmy I have two ideas in mind:

  1. Statically detect whether the code failed to execute (via the OS exit code, thrown exceptions, etc.); see the sketch below.
  2. Auto-detect and fix the issue, similar to the official Code Interpreter implementation. The concept involves piping the output back to ChatGPT with a simple prompt to identify the problem and attempt to resolve it. This approach may yield better results but could require more resources.

Ideally, we should ask the user to choose one of these options.
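
A rough sketch of what the detection in option 1 could look like, using subprocess-based execution for illustration (gpt-code-ui executes code via a kernel in practice, so the real signal would come from the kernel's error messages; all names here are illustrative):

```python
import re
import subprocess
import sys

def classify_failure(code: str):
    """Return None if the code ran cleanly, else the exception name
    parsed from the traceback, or 'nonzero-exit' as a fallback."""
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )
    if result.returncode == 0:
        return None
    # The final traceback line normally looks like "NameError: ...".
    for line in reversed(result.stderr.splitlines()):
        match = re.match(r"([A-Za-z_]\w*(?:Error|Exception))\b", line)
        if match:
            return match.group(1)
    return "nonzero-exit"

print(classify_failure("df['sales']"))  # -> 'NameError' (df was never defined)
```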