vidalt / BA-Trees

Born-Again Tree Ensembles: Transforms a random forest into a single, minimal-size tree with exactly the same prediction function over the entire feature space (ICML 2020).

Home Page: https://arxiv.org/pdf/2003.11132.pdf

Solving possible "Can't set attribute" error in build_classifier from persistence.py

crimson-luis opened this issue

I applied your method to the STULONG atherosclerosis dataset. However, when running illustrative_example.ipynb, I got the following error in the build_classifier function:

Traceback (most recent call last):
File "...\BA-Trees\venv\lib\site-packages\IPython\core\interactiveshell.py", line 3553, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 68, in
current_fold, n_trees, return_file=True)
File "...\BA-Trees\src\random_forests.py", line 85, in load
clf = pr.classifier_from_file(filename, X, y, pruning=True, num_trees=n_trees)
File "...\BA-Trees\src\persistence.py", line 320, in classifier_from_file
return build_classifier(trees)
File "...\BA-Trees\src\persistence.py", line 289, in build_classifier
clf.n_features_ = trees[0].n_features
AttributeError: can't set attribute

After some investigation, I found that lines 279 and 289 of persistence.py assign to the deprecated attribute "n_features_", which newer scikit-learn versions expose as a read-only property, so it can no longer be set.
The solution is to change both to "n_features_in_", as below.

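import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier


def build_classifier(trees):

    def build_decision_tree(t):
        # Wrap an already-built tree in an unfitted estimator and set the
        # fitted attributes by hand.
        dt = DecisionTreeClassifier(random_state=0)
        dt.n_features_in_ = t.n_features  # was: dt.n_features_
        dt.n_outputs_ = t.n_outputs
        dt.n_classes_ = t.n_classes[0]
        dt.classes_ = np.arange(dt.n_classes_)
        dt.tree_ = t
        return dt

    if len(trees) > 1:
        # Several trees: assemble a forest from the individual estimators.
        clf = RandomForestClassifier(random_state=0, n_estimators=len(trees))
        clf.estimators_ = [build_decision_tree(t) for t in trees]
        clf.n_features_in_ = trees[0].n_features  # was: clf.n_features_
        clf.n_outputs_ = trees[0].n_outputs
        clf.n_classes_ = trees[0].n_classes[0]
        clf.classes_ = np.arange(clf.n_classes_)
    else:
        # A single tree: return a plain decision tree classifier.
        clf = build_decision_tree(trees[0])
    return clf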
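
For anyone wondering why the assignment fails: the AttributeError in the traceback is what Python raises when code assigns to a property that has no setter. Below is a minimal sketch of that mechanism, using a hypothetical stand-in class (FakeEstimator) rather than scikit-learn itself. The helper set_n_features is also my own illustration and is not part of BA-Trees or scikit-learn. It sets the current name and tries the legacy name only where that is still writable, which might be useful if the code has to run on several scikit-learn versions.

```python
class FakeEstimator:
    """Hypothetical stand-in for an estimator; this is not scikit-learn code."""

    @property
    def n_features_(self):
        # Read-only alias of the new name: no setter is defined,
        # so any assignment to n_features_ raises AttributeError.
        return self.n_features_in_


def set_n_features(estimator, n):
    """Hypothetical helper: set the feature count under the current name,
    and under the legacy name only if that name is still writable."""
    estimator.n_features_in_ = n
    try:
        estimator.n_features_ = n
    except AttributeError:
        # The legacy name is a read-only property here; the new name suffices.
        pass
    return estimator


est = FakeEstimator()

# Direct assignment to the read-only property fails, as in the traceback.
try:
    est.n_features_ = 4
    direct_assignment_failed = False
except AttributeError:
    direct_assignment_failed = True

# The helper succeeds because it writes to n_features_in_ instead.
set_n_features(est, 4)
print(direct_assignment_failed, est.n_features_in_, est.n_features_)
```

On an object where the legacy name is an ordinary attribute, the helper simply sets both names, so the same call should work in either case.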