Table of Contents

Content Page
Overview and Philosophy 8
Scope and Sequence 14
UNIT 1 Campaign Topics
Daily Overview 19
Essential Concepts 20
Section 1: Data are all Around 22
Lesson 1: Data Trails Defining data, consumer privacy 24
Lesson 2: Stick Figures Organizing & collecting data 26
Lesson 3: Data Structures Organizing data, rows & columns, variables 28
Lesson 4: The Data Cycle Data cycle, statistical investigative questions 30
Lesson 5: So Many Questions Statistical investigative questions, variability 35
Lesson 6: What Do I Eat? Food Habits Data cycle, collecting data 39
Lesson 7: Setting the Stage Food Habits – data Participatory Sensing 42
Section 2: Visualizing Data 47
Lesson 8: Tangible Plots Food Habits – data Dotplots, minimum/maximum, frequency 48
Lesson 9: What Is Typical? Food Habits – data Typical value, center 52
Lesson 10: Making Histograms Food Habits – data Histograms, bin widths 55
Lesson 11: What Shape Are You In? Food Habits – data Shape, center, spread 58
Lesson 12: Exploring Food Habits Food Habits – data Single & multi-variable plots 60
Lesson 13: RStudio Basics Food Habits – data Intro to RStudio 62
Lab 1A: Data, Code & RStudio Food Habits – data RStudio basics 65
Lab 1B: Get the Picture? Food Habits – data Variable types, bar graphs, histograms 68
Lab 1C: Export, Upload, Import Importing data 71
Lesson 14: Variables, Variables, Variables Multi-variable plots 75
Lab 1D: Zooming Through Data Subsetting 79
Lab 1E: What’s the Relationship? Multi-variable plots 83
Practicum: The Data Cycle & My Food Habits Food Habits Data cycle, variability 86
Section 3: Would You Look at the Time 88
Lesson 15: Americans’ Time on Task Time Use – data Evaluating claims 89
Lab 1F: A Diamond In the Rough Time Use – data Cleaning names, categories, and strings 94
Lesson 16: Categorical Associations Time Use – data Joint relative frequencies in 2-way tables 99
Lesson 17: Interpreting Two-Way Tables Time Use – data Marginal & conditional relative frequencies 101
Lab 1G: What’s the FREQ? Time Use – data 2-way tables, tally 106
Practicum: Teen Depression Time Use Statistical investigative questions, interpreting plots 109
Lab 1H: Our Time Data cycle, synthesis 111
End of Unit Project: Analyzing Data to Evaluate Claims Data cycle 112
UNIT 2 Campaign Topics
Daily Overview 114
Essential Concepts 115
Section 1: What is Your True Color? 117
Lesson 1: What Is Your True Color? Personality Color – data Subsets, relative frequency 119
Lesson 2: What Does Mean Mean? Personality Color Measures of center – mean 122
Lesson 3: Median In the Middle Personality Color Measures of center – median 126
Lesson 4: How Far Is It from Typical? Personality Color Measures of spread – MAD 129
Lab 2A: All About Distributions Personality Color Measures of center & spread 132
Lesson 5: Human Boxplots Boxplots, IQR 134
Lesson 6: Face Off Comparing distributions 137
Lesson 7: Plot Match Comparing distributions 140
Lab 2B: Oh, the Summaries… Personality Color Numerical summaries, custom functions 143
Practicum: The Summaries Food Habits or Time Use Data cycle, comparing distributions 146
Section 2: How Likely is it? 148
Lesson 8: How Likely is It? Probability, simulations 149
Lesson 9: Bias Detective Simulations to detect unfairness 152
Lesson 10: Marbles, Marbles Probability, with replacement 156
Lab 2C: Which Song Plays Next? Probability of simple events, do loops, set.seed() 158
Lesson 11: This AND/OR That Compound probabilities 161
Lab 2D: Queue It Up! Probability with/without replacement, sample() 165
Practicum: Win, Win, Win Probability estimation through repeated simulations 168
Section 3: Are You Stressing or Chilling? 169
Lesson 12: Don’t Take My Stress Away Stress/Chill – data Introduction to campaign 171
Lesson 13: The Horror Movie Shuffle Stress/Chill – data Chance differences – categorical 175
Lab 2E: The Horror Movie Shuffle Stress/Chill – data Inference for categorical variables 179
Lesson 14: The Titanic Shuffle Stress/Chill – data Chance differences – numerical 182
Lab 2F: The Titanic Shuffle Stress/Chill – data Inference for numerical variables 186
Lesson 15: Tangible Data Merging Stress/Chill – data Merging datasets 188
Lab 2G: Getting It Together Stress/Chill & Personality Color Stacking vs. joining datasets 190
Practicum: What Stresses Us? Stress/Chill & Personality Color Analyzing merged data 192
Section 4: What’s Normal? 193
Lesson 16: What Is Normal? Introduction to normal curve 194
Lesson 17: A Normal Measure of Spread Measures of spread – SD 197
Lesson 18: What’s Your Z-Score? z-scores, shuffling 200
Lab 2H: Eyeballing Normal Normal curves overlaid on distributions & simulated data 204
Lab 2I: R’s Normal Distribution Alphabet Normal probability, rnorm(), pnorm(), qnorm() 206
End of Unit Project & Presentation: Asking and Answering Statistical Investigative Questions of Our Data Stress/Chill, Personality Color, Food Habits, or Time Use Synthesis of Unit 2 208
UNIT 3 Campaign Topics
Daily Overview 210
Essential Concepts 211
Section 1: Testing, Testing…1, 2, 3… 213
Lesson 1: Anecdotes vs. Data Reading articles critically, data 215
Lesson 2: What is an Experiment? Experiments, causation 218
Lesson 3: Let’s Try an Experiment! Random assignments, confounding factors 221
Lesson 4: Predictions, Predictions Visualizations, predictions 223
Lesson 5: Time Perception Experiment Elements of an experiment 225
Lab 3A: The Results Are In! Analyzing experiment data 226
Practicum: Music to my Ears Design an experiment 227
Section 2: Would You Look at That? 228
Lesson 6: Observational Studies Observational study 230
Lesson 7: Observational Studies vs. Experiments Observational study, experiment 232
Lesson 8: Monsters that Hide in Observational Studies Observational study, confounding factors 234
Lab 3B: Confound it all! Confounding factors 238
Section 3: Are You Asking Me? 240
Lesson 9: Survey Says… Survey 241
Lesson 10: We’re So Random Data collection, random samples 244
Lesson 11: The Gettysburg Address Sampling bias 248
Lab 3C: Random Sampling Random sampling 253
Lesson 12: Bias in Survey Sampling Bias in survey sampling 255
Lesson 13: The Confidence Game Confidence intervals 258
Lesson 14: How Confident Are You? Confidence intervals, margin of error 261
Lab 3D: Are You Sure about That? Bootstrapping 263
Practicum: Let’s Build a Survey! Survey design with non-leading questions 266
Section 4: What’s the Trigger? 267
Lesson 15: Ready, Sense, Go! Sensors, data collection 268
Lesson 16: Does it have a Trigger? Survey questions, sensor questions 271
Lesson 17: Creating Our Own PS Campaign Participatory Sensing campaign creation 273
Lesson 18: Evaluating Our Own PS Campaign Statistical investigative questions, evaluate campaign 276
Lesson 19: Implementing Our Own PS Campaign Class Campaign—data Mock-implement & create campaign 278
Section 5: Webpages 280
Lesson 20: Online Data-ing Class Campaign—data Data on the internet 281
Lab 3E: Scraping Web Data Class Campaign—data Scraping data from the Internet 284
Lab 3F: Maps Class Campaign—data Making maps with data from the Internet 286
Lesson 21: Learning to Love XML Class Campaign—data Data storage, XML 288
Lesson 22: Changing Format Class Campaign—data Converting XML files 293
Practicum: What Does Our Campaign Data Say? Class Campaign Statistical investigative questions, our data 296
End of Unit Project: TB or Not TB Class Campaign Simulation using experiment data 297
UNIT 4 Campaign Topics
Daily Overview 300
Essential Concepts 301
Section 1: Campaigns and Community 303
Lesson 1: Trash Modeling to answer real world problems, official datasets 304
Lesson 2: Drought Exploratory data analysis, campaign creation 307
Lesson 3: Community Connection Team Campaign—data Community topic research, campaign creation 309
Lesson 4: Evaluate and Implement the Campaign Team Campaign—data Evaluate & mock-implement campaign 312
Lesson 5: Refine and Create the Campaign Team Campaign—data Revise and edit campaign, data collection 314
Section 2: Predictions and Models 315
Lesson 6: Statistical Predictions Using One Variable Team Campaign—data One variable predictions using a rule 317
Lesson 7: Statistical Predictions Applying the Rule Team Campaign—data Predictions applying MSE, MAE 319
Lesson 8: Statistical Predictions Using Two Variables Team Campaign—data Two-variable statistical predictions 323
Lesson 9: The Spaghetti Line Team Campaign—data Estimate line of best fit, linear regression 326
LAB 4A: If the Line Fits… Team Campaign—data Estimate line of best fit 328
Lesson 10: What’s the Best Line? Team Campaign—data Predictions based on linear models 330
LAB 4B: What’s the Score? Team Campaign—data Comparing predictions to real data 333
LAB 4C: Cross-Validation Team Campaign—data Use training and test data for predictions 335
Lesson 11: What’s the Trend? Team Campaign—data Trend, associations, linear model 338
Lesson 12: How Strong Is It? Team Campaign—data Correlation coefficient, strength of trend 342
LAB 4D: Interpreting Correlations Team Campaign—data Correlation coefficient, best model 345
Lesson 13: Improving Your Model Team Campaign—data Non-linear regression 348
LAB 4E: Some Models Have Curves Team Campaign—data Non-linear regression 350
Practicum: Predictions Team Campaign—data Linear regression 352
Section 3: Piecing it Together 353
Lesson 14: More Variables to Make Better Predictions Team Campaign—data Multiple linear regression 354
Lesson 15: Combination of Variables Team Campaign—data Multiple linear regression 357
LAB 4F: This Model Is Big Enough for All of Us Team Campaign—data Multiple linear regression 360
Section 4: Decisions, Decisions! 361
Lesson 16: Football or Futbol? Team Campaign—data Multiple predictors, classifying into groups 362
Lesson 17: Grow Your Own Decision Tree Team Campaign—data Decision trees based on training/test data 368
LAB 4G: Growing Trees Team Campaign—data Decision trees to classify observations 372
Section 5: Ties That Bind 375
Lesson 18: Where Do I Belong? Team Campaign—data Clustering, k-means 376
LAB 4H: Finding Clusters Team Campaign—data Clustering, k-means 382
Lesson 19: Our Class Network Team Campaign—data Clustering, networks 384
End of Unit Modeling Activity Project Team Campaign Synthesis of Unit 4 387