Confused: how are you displaying accuracy_score for LogisticRegression / DecisionTree by just printing the logs, even though the code stores it in a file?
rituraj17 opened this issue · comments
Hi @FernandoLpz ,
First of all, awesome tutorial and blog!
I am new to Kubeflow Pipelines, so I wanted to know how the accuracy_score for LogisticRegression / DecisionTree ends up in the logs even though the code stores it in a file.
File path : decision_tree/decision_tree.py
# Get accuracy
accuracy = accuracy_score(y_test, y_pred)
# Save output into file
with open(args.accuracy, 'w') as accuracy_file:
    accuracy_file.write(str(accuracy))
File path :pipeline.py
show_results(decision_tree_task.output, logistic_regression_task.output)
I know that you are printing the output with the show_results() function.
But before this step, how are you getting the "decision_tree_task.output" value? It should be a file, right?
Shouldn't we read the file and then print the output?
Hi @rituraj17 ,
When a component has a single output value (in this case decision_tree_task only has accuracy as its output), the value is exposed as a "string", "float", etc., as the case may be. That is why I do not need to read the file and only access the "output" attribute, like this: decision_tree_task.output.
If a component has multiple outputs, they are exposed through the "outputs" attribute, a dict where each "key" is the name of an output and each "value" is its value. For example: decision_tree_task.outputs['accuracy'], decision_tree_task.outputs['precision'], etc.
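To make the single-output versus multiple-output convention concrete, here is a minimal sketch in plain Python. The `FakeTask` class is a hypothetical stand-in written only for this illustration; it is not part of the KFP SDK, and the real task objects hold placeholders that are resolved at run time rather than plain floats.

```python
class FakeTask:
    """Hypothetical stand-in for a pipeline task; NOT part of the KFP SDK."""

    def __init__(self, **outputs):
        # Every declared output is stored by name.
        self.outputs = dict(outputs)

    @property
    def output(self):
        # The shortcut only makes sense when there is exactly one output.
        if len(self.outputs) != 1:
            raise RuntimeError(
                "`output` is only defined for tasks with exactly one output; "
                "use task.outputs['name'] instead"
            )
        return next(iter(self.outputs.values()))


# One output: the shortcut attribute is enough.
single = FakeTask(accuracy=0.93)
print(single.output)

# Several outputs: each one is looked up by name.
multi = FakeTask(accuracy=0.93, precision=0.9)
print(multi.outputs["accuracy"], multi.outputs["precision"])
```

Asking `multi` for `.output` raises an error in this sketch, which mirrors why a name is needed once a component declares more than one output.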
It is important to mention that the output attribute can hold different types of data; the type is specified in the component's YAML manifest. For example, for decision_tree the accuracy is read as a float, not as a file:
outputs:
- {name: Accuracy, type: Float, description: 'Accuracy metric'}
Also note that the decision_tree.py script stores the accuracy metric in a file; however, the specification in the manifest says that it will be interpreted as a float.
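The handoff from "written to a file" to "read as a float" can be sketched in plain Python as follows. This is only a simulation under stated assumptions: `component_step`, `read_output`, and the `CONVERTERS` table are hypothetical names invented for the example, and the way Kubeflow Pipelines actually collects and passes the value differs in detail.

```python
import tempfile
from pathlib import Path

# Hypothetical mapping from manifest type names to Python converters.
CONVERTERS = {"Float": float, "Integer": int, "String": str}


def component_step(accuracy_path: str, accuracy: float) -> None:
    """Stand-in for decision_tree.py: write the metric to the given path."""
    with open(accuracy_path, "w") as accuracy_file:
        accuracy_file.write(str(accuracy))


def read_output(path: str, declared_type: str):
    """Stand-in for the orchestrator: read the file and convert its content
    according to the type declared in the manifest."""
    raw = Path(path).read_text().strip()
    return CONVERTERS[declared_type](raw)


with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "accuracy")
    component_step(path, 0.93)          # the component only ever sees a path
    value = read_output(path, "Float")  # downstream code only ever sees a value

print(type(value).__name__, value)
```

The component script and the downstream step never share a file handle: the script writes to whatever path it is given, and the consumer receives the content already interpreted as the declared type.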
Let me know if you have any other questions! 🙂