graph-of-thoughts
(Note that this was published months before the https://github.com/spcl/graph-of-thoughts repo and paper. I don't think they based their work on this repo, but some kind of acknowledgement would have been polite.)
The following is based on a paper that recently hit arXiv: "Tree of Thoughts" https://arxiv.org/abs/2305.10601
The concept is depth-/breadth-first search over a tree of chains of thought using LLMs.
This 'graph of thoughts' approach is a slightly different take on the paper: here it is used to autonomously improve an ML program.
It creates 3 alternative paths, chooses the best one, and tries to improve that. It loops recursively until Ctrl-C.
It starts with a basic sklearn dataset and code, and then we ask GPT4 to improve its r2_score. The starting point was the following code, base.py in the repo.
data.pkl is the California housing dataset, stored as 'data.pkl' so as not to clue GPT4 in on what the optimal algorithm should be from its training data.
```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
import pandas as pd

# Fetch the data
data = pd.read_pickle("data.pkl")

# Split into features (X) and target (y)
X, y = data.data, data.target

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Instantiate the model
model = LinearRegression()

# Train the model
model.fit(X_train, y_train)

# Make predictions
predictions = model.predict(X_test)

# Compute and display r^2 score
print('r2_score:', r2_score(y_test, predictions))
```
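base.py assumes data.pkl unpickles to an object exposing `.data` and `.target` attributes (a sklearn Bunch). The repo's actual dump step isn't shown here, so as an assumption, a compatible file could be produced like this, with synthetic data standing in for the California housing set:

```python
# Sketch: produce a data.pkl compatible with base.py's expectations
# (an object exposing .data and .target). Synthetic data stands in for
# the California housing set; the repo's real dump step would use
# sklearn.datasets.fetch_california_housing() instead.
import pickle

import numpy as np
from sklearn.utils import Bunch

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 8))  # 8 features, like the housing set
y = X @ rng.normal(size=8) + rng.normal(scale=0.1, size=100)

bunch = Bunch(data=X, target=y)

with open("data.pkl", "wb") as f:
    pickle.dump(bunch, f)

# pd.read_pickle("data.pkl") in base.py returns this Bunch.
```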
get_best_model.py is the code to start the recursive loop generating the graph of thoughts.
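In outline, the loop works like this. This is a simplified sketch, not the repo's actual code: `propose_variants` and `score` are placeholders for the real script's GPT4 call and candidate-execution step.

```python
# Simplified sketch of the graph-of-thoughts loop: from the current best
# script, generate 3 candidate rewrites, score each, keep the winner,
# and repeat until Ctrl-C. Helper names are placeholders, not the
# repo's actual API.

def propose_variants(code: str, insights: list[str], n: int = 3) -> list[str]:
    # Placeholder for the GPT4 call: prompt with the code plus all
    # accumulated insights, and ask for n improved versions.
    return [code for _ in range(n)]

def score(code: str) -> float:
    # Placeholder: the real script runs the candidate and parses the
    # printed r2_score from its output.
    return 0.575

def step(best: str, best_score: float, insights: list[str]) -> tuple[str, float]:
    """One round: propose candidates, keep the best if it improves."""
    candidates = propose_variants(best, insights)
    top_score, top = max((score(c), c) for c in candidates)
    if top_score > best_score:
        # Record what changed as an "insight" fed to later prompts.
        insights.append(f"score improved {best_score:.3f} -> {top_score:.3f}")
        return top, top_score
    return best, best_score

def improve_until_interrupted(code: str) -> None:
    insights: list[str] = []
    best, best_score = code, score(code)
    try:
        while True:  # loops until Ctrl-C, as in the repo
            best, best_score = step(best, best_score, insights)
    except KeyboardInterrupt:
        print("best score so far:", best_score)
```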
Here are the results:
Note: these insights were generated by GPT4 (see the source files). They are extracted and fed into each prompt as they are discovered, so only the last row in the table had all but one of the insights in its prompt.
Insight | Initial File | New File | Initial Score | New Score |
---|---|---|---|---|
Changing the model from LinearRegression to Ridge with alpha=1.0 and adding StandardScaler | base.py | base_n0.py | 0.575 | 0.576 |
Changing the model from LinearRegression to Ridge with alpha=1.0, adding StandardScaler, and applying PolynomialFeatures with degree=2 | base.py | base_n1.py | 0.575 | 0.647 |
Changing the model from LinearRegression to Ridge with alpha=10.0, adding StandardScaler, and applying PolynomialFeatures with degree=3 | base.py | base_n2.py | 0.575 | -14.131 |
Changing the model from Ridge with alpha=1.0 to Lasso with alpha=0.1 | base_n1.py | base_n1_n0.py | 0.647 | 0.482 |
Changing the model from Ridge with alpha=1.0 to ElasticNet with alpha=0.1 and l1_ratio=0.5 | base_n1.py | base_n1_n1.py | 0.647 | 0.515 |
Changing the model from Ridge with alpha=1.0 to RidgeCV with automatic alpha selection | base_n1.py | base_n1_n2.py | 0.647 | 0.656 |
Changing the model from Ridge with alpha=1.0 to RidgeCV with automatic alpha selection and using a pipeline for preprocessing | base_n1.py, base_n1_n2.py | base_n1_n2_n0.py | 0.656 | 0.656 |
Changing the model from Ridge with alpha=1.0 to RidgeCV with automatic alpha selection, using a pipeline for preprocessing, and increasing the degree of PolynomialFeatures to 3 | base_n1.py, base_n1_n2_n0.py | base_n1_n2_n1.py | 0.656 | -15.415 |
Changing the degree of PolynomialFeatures from 2 to 3 and using a pipeline for preprocessing | base_n1_n2.py | base_n1_n2_n2.py | 0.656 | -15.415 |
Changing the model from RidgeCV with automatic alpha selection to LassoCV with automatic alpha selection | base_n1_n2_n0.py | base_n1_n2_n0_n0.py | 0.656 | 0.482 |
Changing the model from RidgeCV with automatic alpha selection to RandomForestRegressor with 100 estimators | base_n1_n2_n0.py | base_n1_n2_n0_n1.py | 0.656 | 0.799 |
Changing the model from RidgeCV with automatic alpha selection to GradientBoostingRegressor with n_estimators=200, learning_rate=0.1, and max_depth=2 | base_n1_n2_n0.py | base_n1_n2_n0_n2.py | 0.656 | 0.775 |
Changing the model from RidgeCV with automatic alpha selection to RandomForestRegressor with GridSearchCV for hyperparameter tuning | base_n1_n2_n0.py | base_n1_n2_n0_n1_n0.py | 0.799 | 0.802 |
Changing the model from RandomForestRegressor with 100 estimators to GradientBoostingRegressor with n_estimators=300, learning_rate=0.1, and max_depth=3 | base_n1_n2_n0_n1.py | base_n1_n2_n0_n1_n1.py | 0.799 | 0.817 |
You can find the source for these in the repo.
--
There are a lot of optimisations you can do here, limited only by your imagination (and the 8k/32k context window). Some ideas are in the paper linked above; others you'll find in the various places where this concept is discussed. Basic ideas include: dupe checks, pruning, backtracking, and Monte Carlo search.
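As an illustration of the first two ideas, a dupe check plus score-based pruning could be layered on top of the search like this. The names here are illustrative, not from the repo:

```python
# Illustrative dupe check + pruning for the search over candidate
# scripts: hashing normalised source catches exact re-generations, and
# a score floor prunes hopeless branches (like the negative-r2
# degree-3 polynomial candidates in the table above).
import hashlib

def fingerprint(code: str) -> str:
    # Normalise whitespace so trivially reformatted duplicates match.
    normalised = "\n".join(line.strip() for line in code.splitlines() if line.strip())
    return hashlib.sha256(normalised.encode()).hexdigest()

class SearchState:
    def __init__(self, prune_below: float = 0.0):
        self.seen: set[str] = set()
        self.prune_below = prune_below

    def is_duplicate(self, code: str) -> bool:
        # True if we've already expanded an equivalent candidate.
        fp = fingerprint(code)
        if fp in self.seen:
            return True
        self.seen.add(fp)
        return False

    def should_prune(self, score: float) -> bool:
        # Drop branches scoring below the floor (e.g. the -15.415
        # candidates would never be expanded further).
        return score < self.prune_below
```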
Some basic insight tracking was added, as shown above, which wasn't exactly in the Tree of Thoughts paper. This also isn't strictly graph-like, since the insights carry globally. GPT4 tokens do start to add up after a while.
Another idea is appending a set of selected techniques to the prompt to suggest to GPT4 that it might try them. Impedance mismatch is not a problem, and these techniques can mostly be reused for any arbitrary ML problem.
--
FAQ
-
Wouldn't it be cheaper and easier to just do X?
Sure, but then why not just make X your baseline? If AutoML or Optuna is your choice, you can start there. Or feed them in as a library of selected techniques.
-
Why did it take so long for GPT4 to try something other than linear models?
I noticed that as well; it's an indication of the limits of GPT4's reasoning capabilities. Better use of the context window by adding rules of thumb / heuristics would help.
--
You might encounter some folks lower down in the stack who will call this 'prompt hacking', but for their benefit: