A small flaw regarding the deployments table.
frg01 opened this issue
The default strategy for deploying a model after training is best_score, so why is the strategy field in the deployments table always the fixed value most_recent? If the field is meant to record the strategy used at deployment time, it should vary accordingly. If it is only meant to be queried during deployment, it seems like another redundant field. Maybe I don't understand this table well enough. If possible, I'd like to ask for your advice!
`select pgml.train(
    project_name => 'Diabetes Regression',
    task => 'regression',
    relation_name => 'pgml.diabetes',
    y_column_name => 'target',
    algorithm => 'linear'
);
select * from pgml.deployments;
results:
 id | project_id | model_id |  strategy   |         created_at
----+------------+----------+-------------+----------------------------
  3 |          2 |        4 | most_recent | 2024-01-03 10:52:12.474621
  5 |          2 |        7 | most_recent | 2024-01-03 10:55:03.908885
  4 |          2 |        5 | most_recent | 2024-01-03 10:54:20.831898
  2 |          2 |        2 | most_recent | 2024-01-03 10:47:02.574409
  1 |          1 |        1 | most_recent | 2024-01-03 10:29:18.626135
`
Thank you for your patient explanations over the past few days. If I find new problems, I may need to trouble you again. Thanks once more.
This is a bug in https://github.com/postgresml/postgresml/blob/master/pgml-extension/src/orm/project.rs#L99
We should make this function take a parameter with the correct strategy to record in the database.
To address question 4 in #1259, we should also expose this function in the API so that users can deploy by model.id, with a new "manual" strategy.
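A rough sketch of the idea, not the actual code in project.rs: the deploy function takes the strategy as a parameter and records whatever the caller chose instead of a hard-coded most_recent. Everything here is hypothetical and for illustration only: the `Strategy` enum and its variants, the `Deployment` struct, the `deploy` signature, and the in-memory `Vec` standing in for the pgml.deployments table.

```rust
use std::fmt;

// Hypothetical set of strategies based on the ones named in this issue;
// the real enum in pgml-extension may have different variants and names.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Strategy {
    BestScore,
    MostRecent,
    Manual,
}

impl fmt::Display for Strategy {
    // Renders the strategy as the text stored in the strategy column.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            Strategy::BestScore => "best_score",
            Strategy::MostRecent => "most_recent",
            Strategy::Manual => "manual",
        };
        f.write_str(name)
    }
}

// One row of an in-memory stand-in for the pgml.deployments table
// (assumption: only the columns relevant to this issue are modeled).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Deployment {
    pub project_id: i64,
    pub model_id: i64,
    pub strategy: String,
}

// Records a deployment with the strategy the caller passed in,
// rather than always writing "most_recent".
pub fn deploy(log: &mut Vec<Deployment>, project_id: i64, model_id: i64, strategy: Strategy) {
    log.push(Deployment {
        project_id,
        model_id,
        strategy: strategy.to_string(),
    });
}
```

With something along these lines, a deployment triggered by training could pass `Strategy::BestScore`, and a deploy-by-model.id call could pass `Strategy::Manual`, so the strategy column reflects how each row was actually created.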
fixed with #1265