Lab 8 - Wrap it up!

By the end of this lab you will

calculate probabilities under the normal distribution
construct and interpret confidence intervals via bootstrapping

For all visualizations you create, be sure to include informative titles for the plot, axes, and legend!

Getting started

Go to Posit Cloud and start the project titled lab-7 - Inference for proportions and means.
Under the Files tab on the lower right, click on lab-7.qmd to open the lab template.
Complete the exercises in this document.

Packages

We’ll use the following packages in this lab:

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(tidymodels)

── Attaching packages ────────────────────────────────────── tidymodels 1.1.1 ──
✔ broom        1.0.5          ✔ rsample      1.2.0     
✔ dials        1.2.0          ✔ tune         1.1.2     
✔ infer        1.0.5.9000     ✔ workflows    1.1.3     
✔ modeldata    1.2.0          ✔ workflowsets 1.0.1     
✔ parsnip      1.1.1          ✔ yardstick    1.2.0     
✔ recipes      1.0.8          
── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
✖ scales::discard() masks purrr::discard()
✖ dplyr::filter()   masks stats::filter()
✖ recipes::fixed()  masks stringr::fixed()
✖ dplyr::lag()      masks stats::lag()
✖ yardstick::spec() masks readr::spec()
✖ recipes::step()   masks stats::step()
• Use tidymodels_prefer() to resolve common conflicts.

library(dsbox)


Attaching package: 'dsbox'

The following object is masked from 'package:infer':

    gss

Part 1 - Statistical experience

The goal of these exercises is to experience the world of statistics outside the classroom.

Exercise 1

For this exercise you can do one of the following:

Attend a talk related to statistics or data science. To count, the talk must be at least 30 minutes or longer.
Talk with someone who uses statistics in their daily work. This could include a professor, professional in industry, graduate student, etc.
Listen to a statistics podcast (e.g., check out Causal Inference podcast) or watch a conference talk (e.g., a talk from rstudio::conf(2022))

Next, write 4-6 sentences summarizing your experience. Include the following:

Citation or link to web page for event if applicable.
Name/brief description of the event/podcast/chat.
Something you found new, interesting, or unexpected.
How the event connects to something we’ve done in class.

Exercise 2

For this exercise you’re asked to read a “winning” class project from a student. Read one of the winning projects from the the Undergraduate Class Project Competition (USCLAP). You can find the winners at https://www.causeweb.org/usproc/projects/winners.

Next, write 4-6 sentences summarizing the project you read. Include the following:

Link to project.
Title and research question.
Description of dataset used.
Brief summary of findings.
Something you found new, interesting, or unexpected.

Part 2: Second chances

Exercise 3

A second chance for an exercise: Take one of the exercises from a previous homework assignment, ideally one you received more feedback on / lost more points on and improve it. First, write one sentence reminding us of the specific exercise and a a few sentences on why you chose this specific exercise to improve and how you plan to improve it. Some of these improvements might be “fixes” for things that were pointed out as missing or incorrect. Some of them might be things you hoped to do before the homework deadline, but didn’t get a chance to. Some notes for completing this exercise:

You will need to add your data from your the homework assignment to the data/ folder in this assignment. You do not need to also add a data dictionary.
You will need to copy over any code needed for cleaning / preparing your data for this specific exercise. You can reuse code from your previous assignment but note that we will re-evaluate your code as part of the grading for this exercise. This means we might catch something wrong with it that we didn’t catch before, so if you spot any errors make sure to fix them.
Don’t worry about being critical of your own work. Even if you lost no points on the plot, if you think it can be improved, articulate how / why. We will not go back and penalize for any mistakes you might point out that we didn’t catch at the time of grading your homework assignment. There’s no risk to being critical!

Exercise 4

A second chance for a Project 1 plot: TL;DR - Same as the previous exercise, but for a plot from Project 1.

Take one of the visualizations from your first project, ideally one you received more feedback on / lost more points on and improve it. First, write one sentence reminding us of your project and a a few sentences on why you chose this plot and how you plan to improve it. Some of these improvements might be “fixes” for things that were pointed out as missing or incorrect. Some of them might be things you hoped to do before the project deadline, but didn’t get a chance to. And some might be things you wanted to do in your project but your teammates didn’t agree so you gave up on at the time. Some notes for completing this exercise:

You will need to add your data from your project to the data/ folder in this assignment. You do not need to also add the data dictionary.
You will need to copy over any code needed for cleaning / preparing your data for this plot. You can reuse code from your project but note that we will re-evaluate your code as part of the grading for this exercise. This means we might catch something wrong with it that we didn’t catch before, so if you spot any errors make sure to fix them.
Don’t worry about being critical of your own work. Even if you lost no points on the plot, if you think it can be improved, articulate how / why. We will not go back and penalize for any mistakes you might point out that we didn’t catch at the time of grading your project. There’s no risk to being critical!

Part 3: IMS exercises

Exercise 5

IMS - Chapter 19 exercises, #14: Interpreting p-values for population mean.

Exercise 6

IMS - Chapter 19 exercises, #20: Possible bootstrap samples.

Exercise 7

IMS - Chapter 20 exercises, #4: Lizards running, randomization test.

Exercise 8

IMS - Chapter 20 exercises, #6: Lizards running, bootstrap interval.

Exercise 9

IMS - Chapter 20 exercises, #10: A/B testing.

Exercise 10

IMS - Chapter 21 exercises, #6: High School and Beyond, randomization test.