AssemblyAI-Community / Machine-Learning-From-Scratch

Implementation of popular ML algorithms from scratch

DecisionTree code - suggestion

sabn0 opened this issue

Hi,
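In the DecisionTree implementation, line 33 calls the _most_common_label() method.
When line 33 is reached because n_labels == 1, the y NumPy array has a single unique value, so instead of computing the most common label
you can simply take the value at any index of the array. The max_depth and min_samples_split conditions can also reach that line with mixed labels, so those cases still need the most common label.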

Maybe something like:

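# check the stopping criteria
if n_labels == 1:
    # pure node: every element of y is the same label
    return Node(value=y[0])
if depth >= self.max_depth or n_samples < self.min_samples_split:
    return Node(value=self._most_common_label(y))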

instead of:

# check the stopping criteria
if (depth>=self.max_depth or n_labels==1 or n_samples<self.min_samples_split):
    leaf_value = self._most_common_label(y)
    return Node(value=leaf_value)
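
A quick way to check when taking y[0] and taking the most common label agree. This is only a sketch: most_common_label below is a hypothetical stand-in for the repository's _most_common_label, assumed here to be a plain majority vote, and both arrays are made-up examples rather than data from the project.

```python
from collections import Counter

import numpy as np


def most_common_label(y):
    # Hypothetical stand-in for _most_common_label (assumed: simple majority vote).
    return Counter(y.tolist()).most_common(1)[0][0]


# Pure node (n_labels == 1): any element equals the majority label.
pure = np.array([1, 1, 1])
pure_same = bool(pure[0] == most_common_label(pure))

# Impure node, as can happen when the branch is reached through
# max_depth or min_samples_split: y[0] may be a minority label.
impure = np.array([0, 1, 1, 1])
impure_same = bool(impure[0] == most_common_label(impure))

print(pure_same, impure_same)
```

On the pure array both approaches give the same leaf value. On the impure array y[0] is 0 while the majority label is 1, so the shortcut looks safe only for the n_labels == 1 case.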