Bivariate Visualizations

Brianna Heggeseth

Announcements

  • Almost all of you have feedback on Assignment 1
  • Assignment 2 was due last night
  • Tidy Tuesday 2 (TT2) was posted yesterday (Bigfoot sightings!)


  • I request that masks be worn in my class for one more week (until my 2-year-old is fully vaccinated).
    • If you have concerns about masks becoming optional in the classroom after next Thursday, email me directly so that I have the information I need to make a decision that best supports this community.

Learning Goals

  • Identify appropriate types of bivariate visualizations, depending on the types of the variables involved (categorical, quantitative)
  • Create basic bivariate visualizations based on real data

Bivariate Visualizations

In this activity we will analyze data from the 2016 presidential election.

We’ll explore county-level election outcomes and demographics.

Template File

Download a template .Rmd of this activity. Put the file in a Day_04 folder within your COMP_STAT_112 folder.

  • This .Rmd only contains exercises that we’ll work on in class and you’ll finish for Assignment 3.

Getting to know the dataset

Loading in the Data

library(tidyverse)  # provides read_csv() and the ggplot2 functions used below
elect <- read_csv("https://www.macalester.edu/~ajohns24/data/electionDemographics16.csv")

Check out the first few rows of elect. What are the units of observation?

head(elect)
# A tibble: 6 × 34
  county total…¹ dem_2…² gop_2…³ oth_2…⁴ total…⁵ dem_2…⁶ gop_2…⁷ oth_2…⁸ total…⁹
  <chr>    <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
1 Walke…   28652    7420   20722     510   28497    6551   21633     313   29243
2 Bullo…    5415    4011    1391      13    5318    4058    1250      10    4701
3 Calho…   49242   16334   32348     560   46240   15500   30272     468   47376
4 Barbo…   11630    5697    5866      67   11459    5873    5539      47   10390
5 Fayet…    7957    1994    5883      80    7912    1803    6034      75    8196
6 Baldw…   81413   19386   61271     756   84988   18329   65772     887   94090
# … with 24 more variables: dem_2016 <dbl>, gop_2016 <dbl>, oth_2016 <dbl>,
#   perdem_2016 <dbl>, perrep_2016 <dbl>, winrep_2016 <lgl>, perdem_2012 <dbl>,
#   perrep_2012 <dbl>, winrep_2012 <lgl>, perdem_2008 <dbl>, perrep_2008 <dbl>,
#   winrep_2008 <lgl>, region <dbl>, total_population <dbl>,
#   percent_white <dbl>, percent_black <dbl>, percent_asian <dbl>,
#   percent_hispanic <dbl>, per_capita_income <dbl>, median_rent <dbl>,
#   median_age <dbl>, polyname <chr>, abb <chr>, StateColor <chr>, and …
# ℹ Use `colnames()` to see all variable names

How much data do we have?

dim(elect)
[1] 3112   34

What are the names of the variables?

names(elect)
 [1] "county"            "total_2008"        "dem_2008"         
 [4] "gop_2008"          "oth_2008"          "total_2012"       
 [7] "dem_2012"          "gop_2012"          "oth_2012"         
[10] "total_2016"        "dem_2016"          "gop_2016"         
[13] "oth_2016"          "perdem_2016"       "perrep_2016"      
[16] "winrep_2016"       "perdem_2012"       "perrep_2012"      
[19] "winrep_2012"       "perdem_2008"       "perrep_2008"      
[22] "winrep_2008"       "region"            "total_population" 
[25] "percent_white"     "percent_black"     "percent_asian"    
[28] "percent_hispanic"  "per_capita_income" "median_rent"      
[31] "median_age"        "polyname"          "abb"              
[34] "StateColor"       

Review: Univariate Viz

Categorical Variable: Bar Plot

ggplot(data = elect, aes(x = winrep_2016)) + 
    geom_bar()

Quantitative Variable: Histogram or Density plot

ggplot(data = elect, aes(x = perrep_2016)) + 
    geom_histogram()

ggplot(data = elect, aes(x = perrep_2016)) + 
    geom_density()

Preview: Bivariate Viz

Quantitative + Quantitative Variable: Scatterplot
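
As a minimal sketch of this plot type, the code below compares the Republican vote share in two elections. The choice of perrep_2012 and perrep_2016 (both listed in names(elect)) and the alpha value are illustrative assumptions, not something the activity prescribes.

```r
library(readr)    # read_csv()
library(ggplot2)  # ggplot(), geom_point()

# Same county-level file that is loaded earlier in these slides
elect <- read_csv("https://www.macalester.edu/~ajohns24/data/electionDemographics16.csv")

# One point per county; alpha (an illustrative choice) reduces overplotting
p_scatter <- ggplot(data = elect, aes(x = perrep_2012, y = perrep_2016)) +
    geom_point(alpha = 0.3)
p_scatter
```

Both variables are quantitative, so each one gets its own axis.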

Quantitative + Categorical Variable: Density Plots, Boxplots, etc.
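
A rough sketch of two options for this combination, using perrep_2016 as the quantitative variable. Treating StateColor as the categorical variable is an assumption made for illustration (it appears in names(elect) as a character column), and the alpha value is arbitrary.

```r
library(readr)    # read_csv()
library(ggplot2)  # ggplot(), geom_density(), geom_boxplot()

# Same county-level file that is loaded earlier in these slides
elect <- read_csv("https://www.macalester.edu/~ajohns24/data/electionDemographics16.csv")

# Option 1: overlaid density curves, one per category (alpha keeps overlaps visible)
p_density <- ggplot(data = elect, aes(x = perrep_2016, fill = StateColor)) +
    geom_density(alpha = 0.5)
p_density

# Option 2: side-by-side boxplots, one per category
p_box <- ggplot(data = elect, aes(x = StateColor, y = perrep_2016)) +
    geom_boxplot()
p_box
```

The density version maps the category to fill, while the boxplot version puts it on the x-axis.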

Categorical + Categorical Variable: Side-by-side or proportional bar plots, etc.
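
A minimal sketch of both bar plot variants. Pairing StateColor with winrep_2016 is an illustrative assumption; both columns appear in names(elect).

```r
library(readr)    # read_csv()
library(ggplot2)  # ggplot(), geom_bar()

# Same county-level file that is loaded earlier in these slides
elect <- read_csv("https://www.macalester.edu/~ajohns24/data/electionDemographics16.csv")

# Side-by-side bars: county counts for each winrep_2016 value within each StateColor
p_side <- ggplot(data = elect, aes(x = StateColor, fill = winrep_2016)) +
    geom_bar(position = "dodge")
p_side

# Proportional bars: every bar is rescaled to height 1, so the y-axis
# (still labeled "count" by default) shows proportions within each StateColor
p_prop <- ggplot(data = elect, aes(x = StateColor, fill = winrep_2016)) +
    geom_bar(position = "fill")
p_prop
```

The side-by-side version keeps group sizes visible; the proportional version makes the within-group split easier to compare.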

In Class

Work on the activity, checking in with your tablemates as you go.

Notice patterns! Feel free to make visualizations more effective as you go along.

After Class

Complete Exercises 8-17 (four of them only require running existing code) for Assignment 3 (due next Tuesday).

For Friday’s class, meet in the Library (Idea Lab for morning, Lib 206 for FYC)!