Support large diffs
sunscreem opened this issue
Bug description
When attempting to use aicommit to generate a commit message for some changes yesterday I got the following error:
│
└ ✖ OpenAI API Error: 400 - Bad Request
{
"error": {
"message": "This model's maximum context length is 4097 tokens. However, your messages resulted in 8332 tokens. Please reduce the length of the messages.",
"type": "invalid_request_error",
"param": "messages",
"code": "context_length_exceeded"
}
}
The total size of all my changes: Showing 13 changed files with 301 additions and 17 deletions.
This is a fairly normal-sized commit, so I wouldn't have expected to receive that error. Maybe I'm not understanding how this plugin works.
aicommits version
1.11.0
Environment
System:
OS: Linux 4.15 Ubuntu 18.04.6 LTS (Bionic Beaver)
CPU: (4) x64 DO-Regular
Memory: 4.62 GB / 7.79 GB
Container: Yes
Shell: 4.4.20 - /bin/bash
Binaries:
Node: 16.16.0 - /usr/local/bin/node
Yarn: 1.19.1 - /usr/bin/yarn
npm: 8.11.0 - /usr/local/bin/npm
Can you contribute a fix?
- I’m interested in opening a pull request for this issue.
Something similar happens when using gpt-4:
│
└ ✖ OpenAI API Error: 400 - Bad Request
{
"error": {
"message": "This model's maximum context length is 8192 tokens. However, your messages resulted in 10456 tokens. Please reduce the length of the messages.",
"type": "invalid_request_error",
"param": "messages",
"code": "context_length_exceeded"
}
}
It is a limitation of the OpenAI API itself, not of aicommits, but it could be solved by sending fragmented requests.
This isn't a bug—it's an OpenAI limitation. Changing to a feature request.
> It is a limitation of the OpenAI API itself, not of aicommits, but it could be solved by sending fragmented requests.
Would that even work? The API is stateless, so we'd have to send the conversation history, which counts toward the token limit.
I think the only solution is to summarize each file diff, and then generate a commit message from the summaries. But I think that may cost too much (both financially and in waiting time for the multiple requests).
Open to accepting feature requests for:
- Improving the error message to recommend smaller diffs
- Summarizing each file diff before generating a commit message
My suggestion would be for aicommits to check the diff size.
If it's 'too big', it prompts the user for a commit message instead.
Here is an example of how such a case could be handled.
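A minimal sketch of that fallback check (not aicommits' actual code; `MAX_TOKENS`, the prompt overhead, and the ~4-characters-per-token heuristic are all assumptions):

```typescript
// Hypothetical sketch: fall back to a manual prompt when the diff is
// likely to exceed the model's context window.
const MAX_TOKENS = 4097; // gpt-3.5-turbo limit from the error above
const PROMPT_OVERHEAD_TOKENS = 200; // rough allowance for the system prompt

// Very rough heuristic: ~4 characters per token for English text and code.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

const shouldPromptManually = (diff: string): boolean =>
  estimateTokens(diff) + PROMPT_OVERHEAD_TOKENS > MAX_TOKENS;

// Example: a 40,000-character diff is ~10,000 tokens, well over the 4k limit.
console.log(shouldPromptManually('x'.repeat(40_000))); // true
```

A real implementation would want an actual tokenizer (e.g. a BPE token count) rather than a character heuristic, but even the rough check avoids a guaranteed 400 error.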
We could temporarily use gpt-3.5-turbo-16k-0613 to work around this, since it supports a larger context window. However, for any large language model, no matter how many tokens it supports, there will always be an upper limit.
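At the request level, the only change needed for the larger window is the model name. A sketch for illustration (`buildRequest` is a hypothetical helper, not part of aicommits, and the system prompt wording is made up):

```typescript
// Hypothetical request builder showing the model swap; not aicommits code.
type ChatMessage = { role: 'system' | 'user'; content: string };

const buildRequest = (
  diff: string,
  model: string = 'gpt-3.5-turbo-16k-0613', // 16k context instead of 4k
) => ({
  model,
  messages: [
    // Assumed prompt wording, for illustration only.
    { role: 'system', content: 'Write a concise git commit message for this diff.' },
    { role: 'user', content: diff },
  ] as ChatMessage[],
});
```

This buys headroom (the 10,456-token request above would fit), but as noted, any fixed window can still be exceeded by a big enough diff.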
> Open to accepting feature requests for:
> - Improving the error message to recommend smaller diffs
> - Summarizing each file diff before generating a commit message
I think this would be a good idea.
Also, I'd suggest a map-reduce summary strategy here: summarize each file's diff, then summarize the summaries. That approach has no hard limit on diff size.
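The map-reduce strategy could be sketched like this (not aicommits code; `Summarize` stands in for a per-chunk model call, and `TOKEN_BUDGET` plus the ~4-chars-per-token estimate are assumptions):

```typescript
// Map-reduce sketch: summarize each file's diff independently (map),
// then join the summaries (reduce). If the joined summaries are still
// too large, fold them in half and summarize again.
type Summarize = (text: string) => Promise<string>;

const TOKEN_BUDGET = 3000; // assumed budget left for diff content
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

async function mapReduceSummary(
  chunks: string[],
  summarize: Summarize,
): Promise<string> {
  // Map step: one summary per chunk, requested in parallel.
  const summaries = await Promise.all(chunks.map(summarize));
  const combined = summaries.join('\n');

  // Reduce step: stop once the result fits (or can't shrink further);
  // otherwise recurse on two halves of the summaries.
  if (estimateTokens(combined) <= TOKEN_BUDGET || summaries.length === 1) {
    return combined;
  }
  const mid = Math.ceil(summaries.length / 2);
  return mapReduceSummary(
    [summaries.slice(0, mid).join('\n'), summaries.slice(mid).join('\n')],
    summarize,
  );
}
```

Each map call stays under the per-request limit, so no single request can blow the context window; the trade-off is exactly the cost and latency of multiple requests mentioned above.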