Lecture 7
Duke University
STA 101 - Fall 2023
2023-09-25
Check-in
Today’s daily check-in access code: ___ (released in class)
\(R^2\), aka “the coefficient of determination” or “correlation squared”, is a way to see how well a given model fits the data.
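\[ R^2 = r^2 \]
where \(r\) is the correlation between the predictor and the outcome in a model with a single predictor.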
The sum of squares total is a measure of the total variability in the outcome variable:
\[ SST = (y_1 - \bar{y})^2 + (y_2 - \bar{y})^2 + \cdots + (y_n - \bar{y})^2 \]
The sum of squares residuals (error) is a measure of the variability in the residuals, i.e., variability left unexplained in the outcome variable after the model is fit:
\[ SSE = (y_1 - \hat{y}_1)^2 + (y_2 - \hat{y}_2)^2 + \cdots + (y_n - \hat{y}_n)^2 \]
\[ R^2 = \frac{SST - SSE}{SST} = 1 - \frac{SSE}{SST} \]
If the sum of squared residuals is 0, then the model explains all variability and \(R^2 = 1 - 0 = 1\).
If the sum of squared residuals is the same as all the variability in the data, then the model does not explain any variability and \(R^2 = 1 - 1 = 0\).
\(R^2\) is a measure of the proportion of variability in the outcome that the model explains. An \(R^2\) of 0 indicates a poor fit and an \(R^2\) of 1 indicates a perfect fit.
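As a quick check of the formulas above, here is a minimal sketch in base R that computes SST, SSE, and \(R^2\) by hand and compares the result with what lm() reports. The built-in mtcars data set and the variables mpg and wt are stand-ins chosen for illustration, not data from this lecture.

```r
# Stand-in data: the built-in mtcars data set (an assumption, not lecture data).
fit <- lm(mpg ~ wt, data = mtcars)

y     <- mtcars$mpg     # observed outcome
y_hat <- fitted(fit)    # predictions from the fitted line

sst <- sum((y - mean(y))^2)   # total variability in the outcome
sse <- sum((y - y_hat)^2)     # variability left in the residuals

r_squared <- 1 - sse / sst

r_squared                      # R^2 from the sums of squares
summary(fit)$r.squared         # R^2 as reported by lm()
cor(mtcars$wt, mtcars$mpg)^2   # squared correlation (single-predictor case)
```

All three printed values should agree, since this model has a single predictor.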
To find \(R^2\), simply call the function glance() on your model_fit.
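A minimal sketch of that call, assuming the model was fit with tidymodels. The mtcars data and the formula mpg ~ wt are stand-ins for illustration, not the lecture's own example.

```r
library(tidymodels)

# Stand-in data: built-in mtcars (an assumption, not the lecture's data).
model_fit <- linear_reg() |>
  fit(mpg ~ wt, data = mtcars)

glance(model_fit)             # one-row tibble of model-level summaries
glance(model_fit)$r.squared   # pull out R^2 on its own
```

The r.squared column of the glance() output holds \(R^2\).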
\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon \]
Let’s examine the high prices of Microsoft, IBM, and Apple stocks in the first quarter of 2020 to illustrate some ideas.
If we have three measurements (variables) then each observation is a point in three-dimensional space. In this example, we can choose one of our measurements to be the outcome variable (e.g. Apple stock price) and use our other two measurements (MSFT and IBM price) as predictors.
In general, the total number of measurements, i.e., variables (columns), in our linear model represents the spatial dimension of our model.
The fitted linear model no longer looks like a line, but instead looks like a plane. It shows our prediction of AAPL price (\(y\)) given both MSFT price (\(x_1\)) and IBM price (\(x_2\)).
Find the equation of the plane by adding in new predictors and then calling the tidy() function on your fitted model: tidy(my_model_fit)
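A minimal sketch of fitting a model with two predictors and reading off the plane, assuming tidymodels. The mtcars variables stand in for the stock prices; with the lecture's data the formula would presumably look like AAPL ~ MSFT + IBM, though those column names are an assumption.

```r
library(tidymodels)

# Stand-in data: built-in mtcars (an assumption, not the stock data from the lecture).
# Predictors are added to the formula with `+`.
my_model_fit <- linear_reg() |>
  fit(mpg ~ wt + hp, data = mtcars)   # x1 = wt, x2 = hp

tidy(my_model_fit)   # one row per term: (Intercept), wt, hp

# The estimate column holds the coefficients of the fitted plane
# y-hat = b0 + b1 * x1 + b2 * x2
coefs <- tidy(my_model_fit)$estimate
coefs
```

The three estimates are the intercept and the two slopes of the plane, in the order the terms appear in the tidy() output.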
Go to Posit Cloud and start the project titled ae-07-Books.