rasbt / python-machine-learning-book

The "Python Machine Learning (1st edition)" book code repository and info resource

[QUESTION] in ch3.py line 64

naty55 opened this issue

In line 64, the following statement is used to count the misclassified samples: print('Misclassified samples: %d' % (y_test != y_pred).sum())

I think that in some cases it might not give the correct number. Consider:
y_test = [2, 2, 2], meaning the true labels are 2
y_pred = [0, 0, 0], meaning the predicted labels are 0

The output will be 6, even though only 3 samples were misclassified!

The expression (y_test != y_pred) creates the boolean array [True, True, True] whose sum is 3.
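
To make this concrete, here is a minimal sketch using the labels from the question. It assumes y_test and y_pred are NumPy arrays, which is what the `.sum()` call requires. The names `mismatches` and `n_misclassified` are only illustrative and do not come from ch3.py.

```python
import numpy as np

# Labels from the question, stored as NumPy arrays (assumption: the
# original code works on arrays, since it calls .sum() on the result).
y_test = np.array([2, 2, 2])
y_pred = np.array([0, 0, 0])

# Element-wise comparison: one boolean per sample, not a difference of values.
mismatches = (y_test != y_pred)

# Summing booleans counts the True entries.
n_misclassified = int(mismatches.sum())

print(mismatches)
print('Misclassified samples: %d' % n_misclassified)
```

The size of the gap between the labels (2 vs. 0) never enters the count, because only the True/False result of each comparison is summed. Note that with plain Python lists instead of arrays, `!=` compares the two lists as a whole and returns a single bool, so the element-wise behavior depends on using arrays.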

Rasbt is a bit tricky but totally pythonic ;)

Thanks for the response, Richard. Your explanation is spot on. In addition, it's worth noting that in Python, Boolean values behave like the integers 0 (False) and 1 (True) in arithmetic operations such as sum. E.g.,

In [1]: sum([True, False, True])
Out[1]: 2
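
The same idea can be sketched in plain Python without NumPy, in case it helps to see the counting spelled out. The pairwise comparison with zip is just one possible way to write it, not the approach used in ch3.py, and the variable names are illustrative.

```python
# bool is a subclass of int, so True and False take part in arithmetic as 1 and 0.
is_int_subclass = issubclass(bool, int)
two = True + True

# Count mismatches pair by pair; each comparison contributes 0 or 1 to the sum.
y_test = [2, 2, 2]
y_pred = [0, 0, 0]
n_misclassified = sum(t != p for t, p in zip(y_test, y_pred))

print(is_int_subclass, two, n_misclassified)
```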

Ok cool i got it, thank you all :)