Gower distance tends to fit global model

Question

pkopper opened this issue 5 years ago

I ran into a practical problem when using Gower's distance as the dissimilarity measure: in many different settings, the resulting model was very global. To my mind, this is because no kernel function is applied when Gower's distance is used. One can argue that this is reasonable, since Gower's distance already scales the resulting dissimilarities. In practice, however, I have observed settings where the dissimilarities are not very discriminating, so we end up with an explainer that is not much different from a global explainer.
Has anyone else made similar observations, and does anyone have a good solution for working with mixed data?
I have found the distance measure (eq. 9) from the article below very helpful in different contexts. However, using it would require changing the package code.
http://www.cs.ust.hk/~qyang/Teaching/537/Papers/huang98extensions.pdf
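
For readers who do not have the paper at hand, here is a minimal sketch of the kind of mixed-type dissimilarity I mean. It assumes the measure is the k-prototypes form: squared Euclidean distance on the numeric features plus a weight `gamma` times the number of mismatches on the categorical features. The function name, the `numeric_idx` argument, and the example values are hypothetical and not part of the package, and the numeric features are assumed to be standardized beforehand.

```python
def mixed_dissimilarity(x, y, numeric_idx, gamma=1.0):
    """Mixed-type dissimilarity (illustrative sketch, not package code).

    x, y        : records of equal length with numeric and categorical values
    numeric_idx : set of positions holding numeric features (assumed standardized)
    gamma       : weight of the categorical part relative to the numeric part
    """
    # Squared Euclidean distance over the numeric features.
    numeric_part = sum((x[j] - y[j]) ** 2 for j in numeric_idx)
    # Simple matching: count categorical features whose values differ.
    categorical_part = sum(
        1 for j in range(len(x)) if j not in numeric_idx and x[j] != y[j]
    )
    return numeric_part + gamma * categorical_part


# Hypothetical records: two numeric features, two categorical features.
a = (0.5, 1.2, "red", "yes")
b = (0.1, 1.0, "blue", "yes")
d_ab = mixed_dissimilarity(a, b, numeric_idx={0, 1}, gamma=0.5)
```

Because this distance is unbounded, unlike Gower's, it would still need a kernel to be turned into weights, and `gamma` would have to be chosen to balance the two parts.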

Thomas Lin Pedersen · Answer 1 · Tue Jun 11 2019 16:57:40 GMT+0800 (China Standard Time)

It might make sense to allow a kernel on top of the Gower distance as well...
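
To illustrate the idea, here is a rough sketch, not the package implementation. It assumes that without a kernel the weight is simply `1 - d`, and that the kernel is an exponential one of the form `sqrt(exp(-d^2 / width^2))`. The function names, the `numeric_ranges` argument, the width of 0.1, and the example records are all made up for the illustration.

```python
import math


def gower_distance(x, y, numeric_ranges):
    """Gower distance between two records (illustrative sketch).

    numeric_ranges maps the position of each numeric feature to its range
    in the data; every other position is treated as categorical.
    """
    total = 0.0
    for j, (a, b) in enumerate(zip(x, y)):
        if j in numeric_ranges:
            r = numeric_ranges[j]
            # Range-scaled absolute difference, always in [0, 1].
            total += abs(a - b) / r if r > 0 else 0.0
        else:
            # Simple matching for categorical features.
            total += 0.0 if a == b else 1.0
    return total / len(x)


def exponential_kernel(d, width):
    """Assumed kernel form; turns a distance into a similarity weight."""
    return math.sqrt(math.exp(-(d ** 2) / (width ** 2)))


# Hypothetical data: one numeric feature with range 50, two categorical ones.
ranges = {0: 50.0}
reference = (30.0, "red", "yes")
near = (32.0, "red", "yes")
moderate = (70.0, "red", "yes")  # far away on the numeric feature only

d_near = gower_distance(reference, near, ranges)
d_moderate = gower_distance(reference, moderate, ranges)

# Without a kernel (assumed 1 - d), the distant point keeps a large weight.
linear_near, linear_moderate = 1 - d_near, 1 - d_moderate

# With a kernel, the same distant point is strongly down-weighted.
kernel_near = exponential_kernel(d_near, width=0.1)
kernel_moderate = exponential_kernel(d_moderate, width=0.1)
```

In this toy case the point that is far away on the numeric feature keeps a weight above 0.7 without a kernel, which is consistent with the fit looking global, while the kernel pushes its weight close to zero and leaves the nearby point almost untouched.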

Philipp Kopper · Answer 2 · Tue Jun 11 2019 23:29:41 GMT+0800 (China Standard Time)

If you want, I can suggest a fix via a pull request.

Thomas Lin Pedersen · Answer 3 · Wed Jun 12 2019 00:38:12 GMT+0800 (China Standard Time)

Thank you, that would be welcome.