Bug in Non-Linear SVM Kernel Calculation

Question

Bug in Non-Linear SVM Kernel Calculation

fbcotter opened this issue 7 years ago · comments

I think there's a bug in the calculation of the Gaussian Kernel for the multiclass SVM. Your code is as follows:

# Gaussian (RBF) kernel
gamma = tf.constant(-10.0)
dist = tf.reduce_sum(tf.square(x_data), 1)
dist = tf.reshape(dist, [-1,1])
sq_dists = tf.multiply(2., tf.matmul(x_data, tf.transpose(x_data)))
my_kernel = tf.exp(tf.multiply(gamma, tf.abs(sq_dists)))

Initially it bothered me that the variable dist was calculated but never used. I then tried to figure out what you were doing in this code, and I think it is supposed to be doing something like this (dist is K in the below example)

In which case, the code should read something more like:

# Gaussian (RBF) kernel
gamma = tf.constant(-10.0)
K = tf.reduce_sum(tf.square(x_data), 1)
K_cols = tf.expand_dims(K, 1)
K_rows = tf.expand_dims(K, 0)

D = K_cols + K_rows - 2*tf.matmul(x_data, tf.transpose(x_data))
my_kernel = tf.exp(tf.multiply(gamma, D))

(The sq_dists calculation uses broadcasting so we don't need to create the J matrix as in the math above)

Nick · Answer 1 · Thu Mar 22 2018 04:34:25 GMT+0800 (China Standard Time)

HI @fbcotter , Thanks for this! I'm just getting around to triaging issues in preparation for a v2-update of the code.

I suspect you are correct. When I get around to chapter 4, I'll investigate and most likely accept your proposal. Expect more in the next month or so. Thanks again!

Nick · Answer 2 · Mon Apr 09 2018 09:27:54 GMT+0800 (China Standard Time)

Yes, thanks. I just looked into that and found you are correct. I've implemented a very similar fix and kept the 'dist' variable terminology in both the *.py and the jupyter notebook. Thanks again!