martibosch / detectree

Tree detection from aerial imagery in Python

Home Page: https://doi.org/10.21105/joss.02172

Inconsistent returned value from Classifier::classify_img depending on refine parameter

easz opened this issue · comments

In the refine=False case, the method Classifier::classify_img returns the predicted class (0 or 1) directly from clf.predict(X).
The class values probably come from the response built by build_response_from_arr.

y_pred = clf.predict(X).reshape(img_shape)

However, in the refine=True case, the method returns by default the values 255 or 0 for tree or non-tree pixels.

y_pred[g.get_grid_segments(node_ids)] = self.tree_val

Is this inconsistency intended?

Hello @easz,

thank you for bringing this up - I know that the code can be confusing here, but both cases (i.e., refine=False and refine=True, the latter being the default) should return the same kind of output, i.e., an array of tree and non-tree values, which by default are 255 and 0 as defined in the settings.

The classifier is trained to predict the two classes defined by the tree_val and nontree_val arguments of ClassifierTrainer.__init__, which are forwarded to PixelResponseBuilder.__init__ (see lines 112-114 and lines 210-212 of classifier.py). The default None values for these arguments mean that they are ultimately taken from the settings.

Therefore, in your first snippet (refine=False), clf.predict returns an array of the tree_val and nontree_val values provided in ClassifierTrainer.__init__, as explained above. The second snippet uses the tree_val and nontree_val argument values provided in Classifier.__init__, which, if left as None (the default), also correspond to the 255 and 0 values from the settings. I now realize that these arguments are not necessary in Classifier, since they can be retrieved from the clf.classes_ attribute (clf is an instance of sklearn.ensemble.AdaBoostClassifier). I will update the code accordingly in the next release.
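The point about clf.predict returning the training labels can be illustrated with a minimal sketch. It does not use detectree itself: the feature matrix is random toy data, and the names TREE_VAL and NONTREE_VAL are placeholders standing in for the 255 and 0 defaults mentioned above.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Placeholder label values, assumed to mirror the defaults from the settings.
TREE_VAL, NONTREE_VAL = 255, 0

# Toy pixel features (not real detectree features): 200 samples, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))

# Label a pixel as "tree" when the first feature is positive.
y = np.where(X[:, 0] > 0, TREE_VAL, NONTREE_VAL)

clf = AdaBoostClassifier(n_estimators=10, random_state=0).fit(X, y)

# predict() returns the same label values used during training,
# not generic 0/1 class indices.
y_pred = clf.predict(X)
predicted_values = set(np.unique(y_pred).tolist())

# classes_ holds the sorted unique training labels.
class_values = clf.classes_.tolist()
```

Under these assumptions, predicted_values only contains values from {0, 255}, so a classifier trained with the default label values would produce the same value range whether or not the refinement step runs.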

I hope that this addresses your question. I will close this issue as soon as I update the code as described above. Best,
Martí

Hello again,

I have since realized that, as far as I know, there is no deterministic way to tell which value of clf.classes_ (where clf is a trained instance of sklearn.ensemble.AdaBoostClassifier) corresponds to a tree pixel and which to a non-tree pixel. If anyone knows of a way to properly retrieve the tree and non-tree class values, I would gladly reopen this and update the Classifier class so that it does not need such information at initialization. In the meantime, I am closing this issue.
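A small sketch of why clf.classes_ alone is ambiguous, again on random toy data rather than detectree features. The helper fit_with_labels is hypothetical and only exists for this illustration. Two classifiers are trained on the same pixels with the tree and non-tree values swapped, and both end up with the same sorted classes_ array.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Toy pixel features (not real detectree features).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
is_tree = X[:, 0] > 0  # toy ground truth


def fit_with_labels(tree_val, nontree_val):
    """Hypothetical helper: train on the same pixels with the given label values."""
    y = np.where(is_tree, tree_val, nontree_val)
    return AdaBoostClassifier(n_estimators=10, random_state=0).fit(X, y)


clf_a = fit_with_labels(tree_val=255, nontree_val=0)
clf_b = fit_with_labels(tree_val=0, nontree_val=255)

# classes_ is the sorted array of unique labels in both cases, so it
# carries no information about which value means "tree".
same_classes = np.array_equal(clf_a.classes_, clf_b.classes_)
```

Since classes_ is identical for both label assignments, the mapping from value to meaning has to be supplied from outside the fitted classifier, which is consistent with keeping the tree_val and nontree_val arguments in Classifier.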

Feel free to reopen it if needed. Thank you again for using detectree. Best,
Martí