theta[0] is calculated twice per gradient_step in LinearRegression
alqbib opened this issue
Ryan Luo commented
```python
# Vectorized version of gradient descent.
theta = theta * reg_param - alpha * (1 / num_examples) * (delta.T @ self.data).T
# We should NOT regularize the parameter theta_zero.
theta[0] = theta[0] - alpha * (1 / num_examples) * (self.data[:, 0].T @ delta).T
```
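Writing `grad = alpha * (1 / num_examples) * (self.data.T @ delta)` for the gradient term, these two lines leave the bias at `reg_param * theta[0] - 2 * grad[0]` rather than the intended `theta[0] - grad[0]`: the first line already regularizes `theta[0]` and subtracts `grad[0]` once, and the second line then subtracts `grad[0]` again from that already-updated value.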
In the first code line, `theta` already includes `theta[0]`, so I think it can be written like this instead:
```python
theta[0] -= alpha * (1 / num_examples) * (self.data[:, 0].T @ delta)
theta[1:] = theta[1:] * reg_param - alpha * (1 / num_examples) * (self.data[:, 1:].T @ delta)
```
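For reference, here is a minimal self-contained sketch of both versions on toy data. The function names `buggy_gradient_step`/`fixed_gradient_step` and the toy data are mine, and I am assuming `reg_param` is the ridge shrinkage factor `1 - alpha * lambda_param / num_examples` computed earlier in `gradient_step`, with the bias column of ones at index 0 of `self.data`:

```python
import numpy as np

def buggy_gradient_step(data, delta, theta, alpha, reg_param, num_examples):
    # Current code: the first line regularizes theta[0] and subtracts its
    # gradient term; the second line then subtracts the same gradient term
    # again from the already-updated theta[0].
    theta = theta * reg_param - alpha * (1 / num_examples) * (delta.T @ data).T
    theta[0] = theta[0] - alpha * (1 / num_examples) * (data[:, 0].T @ delta).T
    return theta

def fixed_gradient_step(data, delta, theta, alpha, reg_param, num_examples):
    # Proposed fix: update theta[0] exactly once, from its original value
    # and without regularization, then update the remaining parameters.
    theta[0] -= alpha * (1 / num_examples) * (data[:, 0].T @ delta)
    theta[1:] = theta[1:] * reg_param - alpha * (1 / num_examples) * (data[:, 1:].T @ delta)
    return theta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_examples, num_features = 5, 3
    # Bias column of ones at index 0, as in self.data.
    data = np.hstack([np.ones((num_examples, 1)),
                      rng.normal(size=(num_examples, num_features - 1))])
    theta = rng.normal(size=(num_features, 1))
    labels = rng.normal(size=(num_examples, 1))
    delta = data @ theta - labels
    alpha, lambda_param = 0.1, 1.0
    # Assumed ridge-style shrinkage factor.
    reg_param = 1 - alpha * lambda_param / num_examples

    print(buggy_gradient_step(data, delta, theta.copy(), alpha, reg_param, num_examples)[0])
    print(fixed_gradient_step(data, delta, theta.copy(), alpha, reg_param, num_examples)[0])
    # The two printed theta[0] values differ: the buggy version shrinks
    # theta[0] by reg_param and subtracts its gradient term twice.
```

Since the proposed updates touch disjoint slices of `theta`, their order doesn't matter; the point is simply that each slice is read and written exactly once per step.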
Thanks.