Support large diffs
sunscreem opened this issue
Bug description
When attempting to use aicommit to generate a commit message for some changes yesterday I got the following error:
│
└ ✖ OpenAI API Error: 400 - Bad Request
{
"error": {
"message": "This model's maximum context length is 4097 tokens. However, your messages resulted in 8332 tokens. Please reduce the length of the messages.",
"type": "invalid_request_error",
"param": "messages",
"code": "context_length_exceeded"
}
}
The total size of all my changes: Showing 13 changed files with 301 additions and 17 deletions.
This is a fairly normal-sized commit, so I wouldn't have expected to receive that error. Maybe I'm not understanding how this plugin works.
aicommits version
1.11.0
Environment
System:
OS: Linux 4.15 Ubuntu 18.04.6 LTS (Bionic Beaver)
CPU: (4) x64 DO-Regular
Memory: 4.62 GB / 7.79 GB
Container: Yes
Shell: 4.4.20 - /bin/bash
Binaries:
Node: 16.16.0 - /usr/local/bin/node
Yarn: 1.19.1 - /usr/bin/yarn
npm: 8.11.0 - /usr/local/bin/npm
Can you contribute a fix?
- I’m interested in opening a pull request for this issue.
Something similar happens when using gpt-4:
│
└ ✖ OpenAI API Error: 400 - Bad Request
{
"error": {
"message": "This model's maximum context length is 8192 tokens. However, your messages resulted in 10456 tokens. Please reduce the length of the messages.",
"type": "invalid_request_error",
"param": "messages",
"code": "context_length_exceeded"
}
}
It is a limitation of the OpenAI API itself, not of aicommits, but it could be solved by sending fragmented requests.
This isn't a bug—it's an OpenAI limitation. Changing to a feature request.
> It is a limitation of the OpenAI API itself, not of aicommits, but it could be solved by sending fragmented requests.
Would that even work? The API is stateless, so we'd have to send the conversation history, which counts toward the token limit.
I think the only solution is to summarize each file diff, and then generate a commit message from the summaries. But I think that may cost too much (both financially and in waiting time for the multiple requests).
Open to accepting feature requests for:
- Improving the error message to recommend smaller diffs
- Summarizing each file diff before generating a commit message
My suggestion would be for aicommits to check the diff size.
If it's 'too big', it prompts the user for a commit message instead.
Here is an example of how such a case could be handled.
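A minimal sketch of that fallback check (not aicommits' actual code; `MAX_TOKENS`, the prompt overhead, and the ~4-characters-per-token heuristic are all assumptions):

```typescript
// Hypothetical sketch: fall back to a manual prompt when the diff is
// likely to exceed the model's context window.
const MAX_TOKENS = 4097; // gpt-3.5-turbo limit from the error above
const PROMPT_OVERHEAD_TOKENS = 200; // rough allowance for the system prompt

// Very rough heuristic: ~4 characters per token for English text and code.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

const shouldPromptManually = (diff: string): boolean =>
  estimateTokens(diff) + PROMPT_OVERHEAD_TOKENS > MAX_TOKENS;

// Example: a 40,000-character diff is ~10,000 tokens, well over the 4k limit.
console.log(shouldPromptManually('x'.repeat(40_000))); // true
```

A real implementation would want an actual tokenizer (e.g. a BPE token count) rather than a character heuristic, but even the rough check avoids a guaranteed 400 error.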
We could temporarily use gpt-3.5-turbo-16k-0613 to work around this, since it supports a larger context window. However, for any large language model, no matter how many tokens it supports, there will always be an upper limit.
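At the request level, the only change needed for the larger window is the model name. A sketch for illustration (`buildRequest` is a hypothetical helper, not part of aicommits, and the system prompt wording is made up):

```typescript
// Hypothetical request builder showing the model swap; not aicommits code.
type ChatMessage = { role: 'system' | 'user'; content: string };

const buildRequest = (
  diff: string,
  model: string = 'gpt-3.5-turbo-16k-0613', // 16k context instead of 4k
) => ({
  model,
  messages: [
    // Assumed prompt wording, for illustration only.
    { role: 'system', content: 'Write a concise git commit message for this diff.' },
    { role: 'user', content: diff },
  ] as ChatMessage[],
});
```

This buys headroom (the 10,456-token request above would fit), but as noted, any fixed window can still be exceeded by a big enough diff.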
> Open to accepting feature requests for:
> - Improving the error message to recommend smaller diffs
> - Summarizing each file diff before generating a commit message
I think this would be a good idea.
Also, I'd suggest a map-reduce summary strategy here: summarize each file's diff, then summarize the summaries. That approach has no hard limit on diff size.
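The map-reduce strategy could be sketched like this (not aicommits code; `Summarize` stands in for a per-chunk model call, and `TOKEN_BUDGET` plus the ~4-chars-per-token estimate are assumptions):

```typescript
// Map-reduce sketch: summarize each file's diff independently (map),
// then join the summaries (reduce). If the joined summaries are still
// too large, fold them in half and summarize again.
type Summarize = (text: string) => Promise<string>;

const TOKEN_BUDGET = 3000; // assumed budget left for diff content
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

async function mapReduceSummary(
  chunks: string[],
  summarize: Summarize,
): Promise<string> {
  // Map step: one summary per chunk, requested in parallel.
  const summaries = await Promise.all(chunks.map(summarize));
  const combined = summaries.join('\n');

  // Reduce step: stop once the result fits (or can't shrink further);
  // otherwise recurse on two halves of the summaries.
  if (estimateTokens(combined) <= TOKEN_BUDGET || summaries.length === 1) {
    return combined;
  }
  const mid = Math.ceil(summaries.length / 2);
  return mapReduceSummary(
    [summaries.slice(0, mid).join('\n'), summaries.slice(mid).join('\n')],
    summarize,
  );
}
```

Each map call stays under the per-request limit, so no single request can blow the context window; the trade-off is exactly the cost and latency of multiple requests mentioned above.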