Sit with someone new today!
Every week in MSCS
This week in class
Today we’ll practice discussing the “insights” we gain from our visualizations. Then we’ll create some visuals by hand!
To go beyond 2 variables, we need to add aesthetics for each new variable!
Though far from a perfect assessment of academic preparedness, SAT scores have historically been used as one measurement of a state’s education system.
State | expend | ratio | salary | frac | verbal | math | sat | fracCat |
---|---|---|---|---|---|---|---|---|
Alabama | 4.405 | 17.2 | 31.144 | 8 | 491 | 538 | 1029 | (0,15] |
Alaska | 8.963 | 17.6 | 47.951 | 47 | 445 | 489 | 934 | (45,100] |
Arizona | 4.778 | 19.3 | 32.175 | 27 | 448 | 496 | 944 | (15,45] |
Arkansas | 4.459 | 17.1 | 28.934 | 6 | 482 | 523 | 1005 | (0,15] |
California | 4.992 | 24.0 | 41.078 | 45 | 417 | 485 | 902 | (15,45] |
Colorado | 5.443 | 18.4 | 34.571 | 29 | 462 | 518 | 980 | (15,45] |
Variability in average SAT scores from state to state:
To what degree do per-pupil spending (expend) and teacher salary explain this variability?
ggplot(education, aes(y = sat, x = salary)) +
  geom_point() +
  geom_smooth(se = FALSE, method = "lm") +
  theme_classic() +
  theme(text = element_text(size = 20))

ggplot(education, aes(y = sat, x = expend)) +
  geom_point() +
  geom_smooth(se = FALSE, method = "lm") +
  theme_classic() +
  theme(text = element_text(size = 20))
Is there anything that surprises you in the above plots? What are the relationship trends? Discuss as a group.
Let’s make a single scatterplot visualization that demonstrates the relationship between sat, salary, and expend.
Thoughts:
1. We could use the color or size aesthetics to incorporate the expenditure data.
2. Include some model smooths with geom_smooth() to help highlight the trends.
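One way to combine the two thoughts above is sketched below. It is only a starting point, not the intended answer: it assumes ggplot2 is installed, and it rebuilds a small data frame (named education_head here, a made-up name) from the six rows shown in the table rather than using the full education data. The color aesthetic is placed inside geom_point() so that only the points are colored by expend, while geom_smooth() draws one overall linear trend.

```r
library(ggplot2)

# Only the six states shown in the table above; the full education data has more rows.
education_head <- data.frame(
  State  = c("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado"),
  expend = c(4.405, 8.963, 4.778, 4.459, 4.992, 5.443),
  salary = c(31.144, 47.951, 32.175, 28.934, 41.078, 34.571),
  sat    = c(1029, 934, 944, 1005, 902, 980)
)

p_three <- ggplot(education_head, aes(x = salary, y = sat)) +
  geom_point(aes(color = expend), size = 3) +  # color carries the third variable
  geom_smooth(method = "lm", se = FALSE) +     # one overall linear trend line
  theme_classic()

p_three
```

Swapping aes(color = expend) for aes(size = expend) gives the size-based version of the same idea.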
Another option!
Categorize your 3rd Quantitative Variable!
The fracCat variable in the education data categorizes the fraction of the state’s students who take the SAT into low (15% or less), medium (more than 15% and up to 45%), and high (more than 45%).
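The source does not say how fracCat was built, but base R's cut() is one common way to categorize a quantitative variable like this. The sketch below applies it to the frac values of the six states shown in the table, with breaks chosen (as an assumption) to reproduce the interval labels that appear there.

```r
# frac values for Alabama, Alaska, Arizona, Arkansas, California, Colorado
frac <- c(8, 47, 27, 6, 45, 29)

# cut() uses right-closed intervals by default: (0,15], (15,45], (45,100]
fracCat <- cut(frac, breaks = c(0, 15, 45, 100))

# Count how many of these six states fall into each category
table(fracCat)
```

Because the intervals are closed on the right, California (frac = 45) lands in the medium category, which matches the table.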
1. Visualize the fracCat variable to better understand how many states fall into each category.
2. Visualize the relationship between fracCat and sat. What story does your graphic tell?
3. Visualize the relationship between fracCat, sat, and expend. Incorporate fracCat as the color of each point, and use a single call to geom_smooth to add three trendlines (one for each fracCat category). What story does your graphic tell?
Discuss!
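As a hedged sketch of how a single geom_smooth() call can produce one trendline per category: when color is mapped to a categorical variable in the top-level aes(), ggplot2 groups the data by that variable, and the smooth layer is fit within each group. The example assumes ggplot2 is installed and uses only the six rows shown in the table (in a data frame given the made-up name education_head). With so few rows the high category contains a single state, so no line should be drawn for it; the full education data should give three.

```r
library(ggplot2)

# Only the six states shown in the table above; the full education data has more rows.
education_head <- data.frame(
  expend  = c(4.405, 8.963, 4.778, 4.459, 4.992, 5.443),
  sat     = c(1029, 934, 944, 1005, 902, 980),
  fracCat = factor(
    c("(0,15]", "(45,100]", "(15,45]", "(0,15]", "(15,45]", "(15,45]"),
    levels = c("(0,15]", "(15,45]", "(45,100]")
  )
)

# color is set in the top-level aes(), so both layers inherit it and the
# single smooth layer is fit separately within each fracCat category
p_cat <- ggplot(education_head, aes(x = expend, y = sat, color = fracCat)) +
  geom_point(size = 3) +
  geom_smooth(method = "lm", se = FALSE) +
  theme_classic()

p_cat
```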
After class, look through the heat maps and star plots, and reflect on the insights you gain from the different plots.
Let’s go to the Google Doc for the instructions.
Your task: Create a visualization based on the data provided, using any materials available.
Name | Area (acres) | Max depth (feet) | Watershed area (acres) | Chain of lakes | Longitude | Latitude | City |
---|---|---|---|---|---|---|---|
Bde Maka Ska | 401 | 87 | 2992 | Yes | -93.311883 | 44.941966 | Minneapolis |
Lake Harriet | 335 | 85 | 1139 | Yes | -93.304514 | 44.921725 | Minneapolis |
Lake Nokomis | 204 | 33 | 869 | No | -93.241582 | 44.908678 | Minneapolis |
Cedar Lake | 170 | 51 | 1956 | Yes | -93.321751 | 44.959361 | Minneapolis |
Lake of the Isles | 109 | 31 | 735 | Yes | -93.306507 | 44.955482 | Minneapolis |
Lake Hiawatha | 54 | 33 | 1734 | No | -93.236044 | 44.920906 | Minneapolis |
Lake Como | 71 | 15 | 1783 | No | -93.140153 | 44.979637 | St Paul |
Lake Phalen | 198 | 91 | 14720 | No | -93.053102 | 44.986744 | St Paul |