niloc3 / stat-notes


He just went over a couple of questions. If there's one you're confused by, you can ask me and I can try to help :)

If you feel good about the practice quizzes you should be good for the quiz. Make sure you get BOTH because they cover different parts.

Ones most people missed on practice #1:

Question 4:

Which is true?

I. Random scatter in the residuals indicates a model with high predictive power.

- FALSE: random scatter in the residuals indicates that a linear model is appropriate; it says nothing about the strength of the relationship.

II. If two variables are very strongly associated, then the correlation between them will be near +1.0 or -1.0.

- FALSE: This highlights the difference between association and correlation. A strong association could take any form (a parabola, a log curve, or whatever), but it can still have a low correlation because correlation only measures LINEAR association.

III. The higher the correlation between two variables the more likely the association is based in cause and effect.

- FALSE: correlation is not causation
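If you want to see statement II with numbers, here is a quick sketch. The data is made up (not from the quiz) and `pearson_r` is just a helper written for this example: y is completely determined by x, but the pattern is a parabola, so the correlation comes out at zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: a perfect parabola, so the association is as strong as it gets
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_parabola = pearson_r(xs, ys)
print(r_parabola)  # essentially 0: strong association, no LINEAR association
```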

Question 5:

Two variables that are actually not related to each other may nonetheless have a very high correlation because they both result from some other, possibly hidden, factor. This is an example of a. an outlier. b. leverage. c. a lurking variable. d. extrapolation. e. regression.

While most didn't miss this, there will most likely be a question like this on the test but with a different answer, so make sure to know all of the vocab.

Question 6:

If the point in the upper right corner of this scatterplot is removed from the data set, then what will happen to the slope of the line of best fit (b) and to the correlation (r) ?

b will decrease, and r will increase. The point is pulling the slope up, so when it is removed the slope comes back down (b decreases). The r value then gets stronger because there is one less point with a large residual.
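Here is a rough sketch of the same idea with numbers. The scatterplot from the quiz is not reproduced here, so the data below is made up to have the same shape: a fairly tight upward trend plus one point way up in the upper right. `fit_slope` and `pearson_r` are helpers written just for this example.

```python
import math

def fit_slope(xs, ys):
    """Slope b of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data (NOT the quiz data): tight upward trend
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.2, 1.4, 2.1, 2.4, 3.1, 3.4, 4.2, 4.4]

# Same data plus one point far up in the upper right corner
xs_with = xs + [10]
ys_with = ys + [20]

b_with, r_with = fit_slope(xs_with, ys_with), pearson_r(xs_with, ys_with)
b_without, r_without = fit_slope(xs, ys), pearson_r(xs, ys)

print("with point:   ", round(b_with, 2), round(r_with, 2))
print("without point:", round(b_without, 2), round(r_without, 2))
```

With this made-up data, taking the point out makes b go down and r go up, which matches the answer above. A different scatterplot could behave differently, so always look at where the point sits relative to the rest.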

Know the assumptions and what conditions need to be met for each of them (Question 10 type question)

You will need to write the equation. DON'T FORGET "PREDICTED" OR THE LITTLE HAT ON TOP

Anything past Q10 on the practice expect to be a write out or fill in the blank.

To find r from r^2, just take the square root and then look at the slope to find out if it's positive or negative (r has the same sign as the slope).
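A small sketch of that rule. The function name `r_from_r_squared` and the numbers are made up for illustration:

```python
import math

def r_from_r_squared(r_squared, slope):
    """Take the square root of r^2, then give it the same sign as the slope."""
    return math.copysign(math.sqrt(r_squared), slope)

# Hypothetical example: r^2 = 0.64 and the regression line has a negative slope
r_neg = r_from_r_squared(0.64, -2.3)
# Same r^2, but a positive slope
r_pos = r_from_r_squared(0.64, 1.7)

print(round(r_neg, 2), round(r_pos, 2))
```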

Ones people missed on practice #2:

Question 4:

The correlation coefficient between the hours that a person is awake during a 24-hour period and the hours that same person is asleep during a 24-hour period is most likely to be

exactly -1.0

The two are completely tied together: hours asleep = 24 - hours awake, so if you are awake for 24 you are asleep for 0. No point can fall off the line, so the points fit the model perfectly. There is no third option; you are either asleep or awake for each hour of the 24.
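You can check this with a quick sketch. The hours below are just every possible whole number of hours awake (made up for illustration), and `pearson_r` is a helper written for this example:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hours awake from 0 to 24; hours asleep is whatever is left of the 24
awake = list(range(0, 25))
asleep = [24 - h for h in awake]

r_sleep = pearson_r(awake, asleep)
print(r_sleep)  # every point is on the line asleep = 24 - awake, so r is -1
```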

Question 5:

The correlation coefficient between high school grade point average (GPA) and college GPA is 0.560. For a student with a high school GPA that is 2.5 standard deviations above the mean, we would expect that student to have a college GPA that is _____ the mean.

This is regression to the mean: the predicted value is r times as many standard deviations from the mean as the original value. So the stronger the r value, the closer the prediction will be to 2.5 SD above the mean, and the weaker it is, the closer the prediction will be to the mean. 0.56(2.5) = 1.4, therefore 1.4 SD above is correct.
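The same arithmetic as a tiny sketch. `predicted_z` is a made-up helper name; the numbers are the ones from the question:

```python
def predicted_z(r, z_x):
    """Predicted z-score of y = r times the z-score of x."""
    return r * z_x

# From the question: r = 0.560, high school GPA is 2.5 SD above the mean
z_college = predicted_z(0.560, 2.5)
print(round(z_college, 2))  # 1.4 SD above the mean

# If r were weaker, the prediction would sit closer to the mean (z = 0)
z_weaker = predicted_z(0.2, 2.5)
print(round(z_weaker, 2))  # 0.5
```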

Question 7:

When using midterm exam scores to predict a student’s final grade in a class, the student would prefer to have a _____

Positive residual. The equation for residuals is: actual value - predicted value, and you want your actual value to be larger than the predicted one. You want to do better than the estimate.
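A quick sketch of the residual formula. The grades are hypothetical, just to show the sign:

```python
def residual(actual, predicted):
    """Residual = actual value - predicted value."""
    return actual - predicted

# Hypothetical: the midterm-based model predicts an 82 for the final grade
predicted_final = 82

res_better = residual(88, predicted_final)  # student actually got an 88
res_worse = residual(75, predicted_final)   # student actually got a 75

print(res_better)  # 6, positive: did better than predicted
print(res_worse)   # -7, negative: did worse than predicted
```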
