jeroenjanssens / scikit-sos

A Python implementation of the Stochastic Outlier Selection algorithm

Large distances silently ignored

thomasp85 opened this issue

When a point is so far away that the exponential of the negative distances returns 0 for every distance, the d2a function silently fails and leaves the affinities for that point at zero. This is due to a missing NaN check on H.

I'm working on an R implementation, and what I do is set beta[i] to beta[i]/10 whenever H is NaN and then continue the calculations.
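For illustration, here is a minimal Python sketch of the failure and of the beta[i]/10 workaround described above. It is not the code from scikit-sos or from the R port: the function names `perplexity_entropy` and `entropy_with_beta_backoff`, the `shrink` and `max_shrinks` parameters, and the entropy formula are assumptions made for this example.

```python
import numpy as np

def perplexity_entropy(d, beta):
    """Return (H, a) for one row of distances d (hypothetical helper)."""
    a = np.exp(-d * beta)   # underflows to exactly 0.0 for large d * beta
    sum_a = a.sum()
    # Silence the warnings so the NaN shows up as a value we can test for.
    with np.errstate(divide="ignore", invalid="ignore"):
        h = np.log(sum_a) + beta * np.sum(d * a) / sum_a
    return h, a

def entropy_with_beta_backoff(d, beta=1.0, shrink=10.0, max_shrinks=50):
    """Divide beta by `shrink` while H is NaN (the workaround from the issue)."""
    h, a = perplexity_entropy(d, beta)
    shrinks = 0
    while np.isnan(h) and shrinks < max_shrinks:
        beta /= shrink
        h, a = perplexity_entropy(d, beta)
        shrinks += 1
    return h, a, beta

# One far-away point: every distance is large enough that exp(-d) is 0.0.
far = np.array([1e4, 2e4, 3e4])

h0, a0 = perplexity_entropy(far, 1.0)          # H is NaN, affinities all zero
h1, a1, beta1 = entropy_with_beta_backoff(far)  # beta shrinks until H is finite
```

In a full implementation, the binary search for beta would presumably continue from the shrunken value rather than stop there; this sketch only shows how checking for NaN avoids leaving the row of affinities at zero.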

Yes, I dealt with that. But in my opinion, the better option is to normalize the input.
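A rough sketch of the normalization idea, assuming column-wise standardization (zero mean, unit variance) and squared Euclidean distances. Both choices, and the helper names `standardize` and `squared_distances_from`, are assumptions for this example rather than anything prescribed in the thread.

```python
import numpy as np

def standardize(X):
    """Scale each column to zero mean and unit variance (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0   # leave constant columns unscaled
    return (X - mean) / std

def squared_distances_from(X, i):
    """Squared Euclidean distance from row i to every row of X."""
    diff = X - X[i]
    return np.sum(diff ** 2, axis=1)

# Three nearby points and one extreme outlier (row 3).
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5], [1e3, 1e3]])

# Distances from the outlier to the other points, before and after scaling.
raw = np.delete(squared_distances_from(X, 3), 3)
scaled = np.delete(squared_distances_from(standardize(X), 3), 3)

raw_aff = np.exp(-raw)        # all exactly 0.0: the silent failure
scaled_aff = np.exp(-scaled)  # small but non-zero
```

Standardizing bounds how far any single point can sit from the rest, so the exponential is much less likely to underflow. It does not replace a NaN check, since the scale of beta still matters.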

Hi @thomasp85 , it's been a while :) It's great to hear that you started working on an R implementation. Did you ever get a chance to finish it? How's that trick you propose working out?

2 years fly by :-)

Yeah I made a working R implementation but never got around to releasing it (it got caught up in dreams of a coherent system for running outlier detection). I should just release it...

As far as I remember, the trick did the trick :-)

Thanks @thomasp85 for the suggestion.

By the way, I really like that you ported SOS to R. If you plan to continue with it, then a mention of the original paper would be much appreciated.

Sure - I'll cite all relevant sources. Currently it's nowhere near a publishable state, but once I get to finish it you'll get credited properly.