joshpxyne / gpt-migrate

Easily migrate your codebase from one framework or language to another.

Home Page: https://gpt-migrate.com


Breaking down large files into smaller chunks based on context window size

joshpxyne opened this issue · comments


@0xpayne This is a highly important fix. When will it be available, please?
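For anyone following along, here's a rough sketch of what "chunking by context window size" could mean in practice, using tiktoken for token counting. The token budget and the blank-line splitting rule are my own assumptions for illustration, not something the maintainers have committed to.

```python
# Naive sketch only: split a source file into chunks that each fit within a
# token budget, breaking at blank lines so statements are never cut in half.
# tiktoken and the 6000-token budget are assumptions for illustration.
import tiktoken

def chunk_source(path: str, max_tokens: int = 6000, model: str = "gpt-4") -> list[str]:
    enc = tiktoken.encoding_for_model(model)
    with open(path, "r", encoding="utf-8") as f:
        blocks = f.read().split("\n\n")  # split at blank lines

    chunks: list[str] = []
    current = ""
    for block in blocks:
        candidate = f"{current}\n\n{block}" if current else block
        if current and len(enc.encode(candidate)) > max_tokens:
            chunks.append(current)  # flush the chunk that still fits
            current = block         # a single oversized block would need finer splitting
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```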

commented

IMO, this is quite a dangerous thing. At least in some experiments using the regular GPT web interface, I found that "carelessly" splitting a larger file can lead to very poor results when some code relies on earlier functions / definitions / variables.

@Sineos I totally agree. GPT struggles with cross-cutting logic, even more so when the code is broken down into chunks. Output quality degrades with the number of dependencies between variables, functions, libraries, classes, etc.
The only way I see this working (though not perfectly) is if we can push the entire codebase as input, and that probably requires a model with a ~1-million-token context window.

This problem can be partially solved with an AST (abstract syntax tree).
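To make the AST idea concrete, here's a minimal sketch for Python source using the standard-library ast module; other languages would need a parser such as tree-sitter. This is just an illustration, not code from this repo.

```python
# Illustration only (Python 3.8+): split a Python file at top-level
# function/class boundaries so every chunk is a syntactically complete unit.
import ast

def split_top_level(source: str) -> list[tuple[str, str]]:
    tree = ast.parse(source)
    lines = source.splitlines(keepends=True)
    segments = []
    for node in tree.body:
        start = node.lineno
        # include any decorators, which sit above the def/class line
        if getattr(node, "decorator_list", None):
            start = min(d.lineno for d in node.decorator_list)
        segment = "".join(lines[start - 1:node.end_lineno])
        segments.append((type(node).__name__, segment))
    return segments
```

Each segment could then be sent to the model along with whatever earlier definitions it references, which is where the real work lies.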

I've been (slowly) working on a solution for this where I've abstracted away the separately "compilable" parts. My initial aim was to use it in a similar project I was going to write, but it seems more worthwhile to contribute it here.

source splitter

@doyled-it

So does mine, but yours looks way more advanced.

It looks like you're trying to do the same thing as this project?


It has some differences. We're trying to focus on other aspects of modernization beyond direct translation of source files, although we still have that functionality.

And we don't have the loop where we run code, get an error, and update the code based on output.
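For readers unfamiliar with that loop, a stripped-down version might look like the sketch below. The ask_llm callable is a hypothetical stand-in for whatever completion call gpt-migrate actually uses; this is not the project's implementation.

```python
# Stripped-down sketch of a run/inspect/repair loop. ask_llm is a
# hypothetical callable supplied by the caller (prompt -> new source);
# it is NOT an actual gpt-migrate function.
import subprocess
from typing import Callable

def repair_loop(entry_file: str, source: str,
                ask_llm: Callable[[str], str], max_attempts: int = 5) -> str:
    for _ in range(max_attempts):
        with open(entry_file, "w", encoding="utf-8") as f:
            f.write(source)
        result = subprocess.run(["python", entry_file],
                                capture_output=True, text=True)
        if result.returncode == 0:
            return source  # program ran cleanly, stop iterating
        # feed the stderr back to the model and ask for a corrected version
        source = ask_llm(
            f"This program fails with:\n{result.stderr}\n\n"
            f"Here is the code:\n{source}\n\nReturn a corrected version."
        )
    return source
```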

@doyled-it
I was looking at doing translation with distributed inference, e.g. through litellm. That way it could be more useful for open-source developers, since they could run local/free inference endpoints.

Looks like your project could use that too.
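For illustration, a litellm-backed translation call might look roughly like this. litellm's completion() takes an OpenAI-style request and routes it to whichever provider the model string names; the model names and prompt below are placeholders, not project defaults.

```python
# Rough sketch, not project code: route the same translation request to
# a hosted or local model via litellm's provider-prefixed model names.
from litellm import completion

def translate(source: str, target_lang: str, model: str = "gpt-4") -> str:
    response = completion(
        model=model,  # e.g. "ollama/codellama" for a local endpoint
        messages=[{
            "role": "user",
            "content": f"Translate this code to {target_lang}:\n\n{source}",
        }],
    )
    return response.choices[0].message.content
```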