Data-Driven-World / d2w_ml_notes

Notes for Data Driven World

Typo in LinearRegression.ipynb and Clarifications

iangohy opened this issue

In the introduction:

Linear regression is a machine learning algorithm dealing with a continuous data and is considered a supervised machine learning algorithm. Linear regression is a useful tool for predicting a quantitative response. Though it may look just like another statistical methods, linear ~~reguression~~ regression is a good jumping point for newer approaches in machine learning.

In Hypothesis:

Notice that the resale price increases as the floor area increases. So we can make a hypothesis by creating a straight line equation that ~~predict he~~ predicts the resale price given the floor area data. The figure below shows the plot of a straight line and the existing data together.

Note that we use the hat symbol to denote the estimated value for an unknown ~~parameters~~ parameter or ~~coeffcient~~ coefficient or to denote the predicted value. The predicted value is also called a hypothesis.

Gradient Descent

First Problem

The steepest slope can be found from the gradient of the function. Let's look at ~~poing~~ point $x_0$ in the figure.
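
For readers following the quoted sentence, here is a minimal sketch of what "finding the steepest slope from the gradient" means at a point $x_0$. The function `f`, its derivative `grad_f`, the starting point, and the learning rate are all hypothetical choices for illustration; they are not taken from the notebook.

```python
def f(x):
    """Hypothetical cost function with its minimum at x = 0."""
    return x ** 2


def grad_f(x):
    """Analytical derivative of f; its sign and size give the slope at x."""
    return 2 * x


def gradient_descent_step(x, learning_rate=0.1):
    """Take one step against the gradient, i.e. in the downhill direction."""
    return x - learning_rate * grad_f(x)


x0 = 3.0                        # assumed starting point
x1 = gradient_descent_step(x0)  # one update moves x towards the minimum

print("slope at x0:", grad_f(x0))
print("f(x0) =", f(x0), "f(x1) =", f(x1))
```

After one step the value of `f` is smaller than at `x0`, which is the behaviour the Gradient Descent section describes.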

Second Problem

The image labels the axes as $\beta_1$ and $\beta_2$, but the formulas use $\beta_0$ and $\beta_1$. For example:

The partial derivative with respect to $\beta_0$ is non-zero while the derivative with respect to $\beta_1$ is zero. So we have the following:

However, the picture shows a non-zero slope in the $\beta_1$ direction and a zero slope in the $\beta_2$ direction, which makes the notes difficult to follow.

Third Problem

image

For formulas such as the ones above, is the notation correct? The superscript is easily mistaken for a power rather than an index. Should it be a subscript, or should the i in the superscript be wrapped in brackets?

@iangohy, thanks for the corrections. I have fixed the image as well. Regarding the superscript and the subscript: I have not thought of a better notation, but a superscript can be used, and I have seen other notes use one. The problem is that I already use the subscript as the index for the feature, so I need a different position for the index of the data point.
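
To make the bracket suggestion concrete, one common convention puts the data-point index in a parenthesized superscript and keeps the feature index as a subscript. This is an assumption for illustration, not necessarily what the notes adopt, and the symbols $x$, $i$, and $j$ are placeholders:

$$
x_j^{(i)} \;\; \text{(feature } j \text{ of data point } i\text{)}, \qquad \left(x_j\right)^{2} \;\; \text{(feature } j \text{ squared)}
$$

With the brackets, $x_j^{(i)}$ reads as an index, while a bare superscript such as $x_j^{2}$ keeps its usual meaning as a power.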