Exam 1 Review
Suggested answers
- There are 409 rows in the blizzard_salary dataset. Each row represents a Blizzard Entertainment worker who filled out the spreadsheet.
- a - Figure 1 - A shared x-axis makes it easier to compare summary statistics for the variable on the x-axis.
- c - It’s a value higher than the median for hourly but lower than the mean for salaried.
- b - There is more variability around the mean compared to the hourly distribution.
- b - Pie charts and waffle charts are for categorical data only.
- c - For every additional $1,000 of annual salary, the model predicts the raise to be higher, on average, by 0.0155%.
- d - \(R^2\) of raise_2_fit is higher than \(R^2\) of raise_1_fit since raise_2_fit has one more predictor and \(R^2\) never decreases (and in practice almost always goes up) with the addition of a predictor.
- The reference level of performance_rating is High, since it’s the first level alphabetically. Therefore, the coefficient -2.40% is the predicted difference in raise for Successful relative to High. In this context a negative coefficient makes sense since we would expect those with a High performance rating to get higher raises than those with a Successful performance rating.
- a - “Poor”, “Successful”, “High”, “Top”
- Choose Option 2 since it shows the proportions of employees with top, high, successful, and poor performance within each salary type, and is not affected by there being far fewer hourly employees. The proportions of employees with top and successful performance ratings are higher among hourly employees than among salaried employees.
- There may be some NAs in these two variables that are not visible in the plot.
- The proportions under Hourly would go in the Hourly bar, and those under Salaried would go in the Salaried bar.
- This is a mosaic plot. It shows the marginal distribution of salary type (proportion of hourly and salaried employees), which is not displayed in the previous plot.
- c - Option 3. The lines are parallel, and the salaried line has a higher intercept, since Hourly is the reference level in raise_3_fit and the slope for salary_typeSalaried is positive.
- A parsimonious model is the simplest model with the best predictive performance.
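
The point about \(R^2\) and added predictors can be checked numerically. The sketch below is not the course's code and does not use the blizzard_salary data: it generates synthetic data (all names and numbers here are assumptions for illustration), fits least squares with one predictor and then with an extra pure-noise predictor, and compares the two \(R^2\) values.

```python
import random

def ols_r2(X, y):
    """Fit least squares with an intercept and return R^2.

    X is a list of rows, each row a list of predictor values.
    """
    n = len(y)
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        piv = max(range(col, p), key=lambda k: abs(M[k][col]))
        M[col], M[piv] = M[piv], M[col]
        for k in range(p):
            if k != col:
                f = M[k][col] / M[col][col]
                M[k] = [a - f * b for a, b in zip(M[k], M[col])]
    beta = [M[i][p] / M[i][i] for i in range(p)]
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Synthetic data (assumed, not from the exam): salary in $1,000s, a raise in
# percent that depends on salary, and a predictor unrelated to the raise.
random.seed(0)
n = 60
salary_k = [random.uniform(40, 160) for _ in range(n)]
unrelated = [random.gauss(0, 1) for _ in range(n)]
raise_pct = [1.0 + 0.02 * s + random.gauss(0, 1) for s in salary_k]

r2_one = ols_r2([[s] for s in salary_k], raise_pct)
r2_two = ols_r2([[s, z] for s, z in zip(salary_k, unrelated)], raise_pct)

print(f"R^2 with salary only:          {r2_one:.4f}")
print(f"R^2 with salary + noise column: {r2_two:.4f}")
```

Because the larger model can always reproduce the smaller one by setting the extra coefficient to zero, its \(R^2\) cannot be lower, even when the added predictor is pure noise. This is why \(R^2\) alone is a poor guide when looking for a parsimonious model.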
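
The answers about the reference level of performance_rating and the ordering “Poor”, “Successful”, “High”, “Top” both come down to alphabetical versus intended order. A minimal sketch, assuming the four level names from the answers above and a made-up list of ratings (the ratings list is hypothetical, not taken from the dataset):

```python
# The four performance levels, written in their intended (meaningful) order.
intended_order = ["Poor", "Successful", "High", "Top"]

# Alphabetical ordering is what makes "High" the reference level
# when no explicit order is given.
alphabetical = sorted(intended_order)
reference_level = alphabetical[0]

# Hypothetical ratings, used only to show sorting by the intended order.
ratings = ["Top", "Poor", "High", "Successful", "High"]
rank = {level: i for i, level in enumerate(intended_order)}
ordered_ratings = sorted(ratings, key=lambda level: rank[level])

print("Alphabetical levels:", alphabetical)
print("Reference level:", reference_level)
print("Ratings in intended order:", ordered_ratings)
```

Setting the levels explicitly changes which level comes first, so a model fit afterwards would use Poor rather than High as its reference level.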