class: center, middle, inverse, title-slide .title[ # POL 304: Using Data to Understand Politics and Society ] .subtitle[ ## Drawing Maps ] .author[ ### Olga Chyzh [www.olgachyzh.com] ] --- ## Dependence in Observational Data - Individuals are nested in social networks + Individual decisions are influenced by their friends. - Provinces are surrounded by other provinces + Provinces mimic one another's policies - Country-level outcomes are often a result of negotiations with other countries: + Economic or environmental policies --- ## Three Mechanisms for Spatial Dependence - Homophily---similarity in outcomes is endogenous, units are similar because they self-select into the same outcome (e.g., partisan geo-sorting) - Common exposure---similarity in outcomes is driven by an exogenous factor that affects nearby units (the effect of earthquakes on housing prices) - Diffusion---nearby units affect each other through learning, imitation, etc (e.g., policy diffusion) --- <img src="./images/communal_violence.png" width="800px" style="display: block; margin: auto;" /> Source: van Weezel S. "On climate and conflict: Precipitation decline and communal conflict in Ethiopia and Kenya." *Journal of Peace Research*. 2019;56(4):514--528. --- <img src="./images/elections.png" width="600px" style="display: block; margin: auto;" /> Source: Chyzh, Olga V. and R. Urbatsch. 2021. "Bean Counters: The Effect of Soy Tariffs on Change in Republican Vote Share Between the 2016 and 2018 Elections."*Journal of Politics* 83 (1): 415--419. --- ## Necessary `R` Packages ```r library(ggplot2) library(tidyverse) library(magrittr) library(maps) library(mapdata) library(mapproj) library(devtools) install_github("ochyzh/classdata") library(classdata) install_github("mccormackandrew/mapcan", build_vignettes = TRUE) library(mapcan) ``` --- ## Map Data ```r canada_map<-mapcan(boundaries = "province", type="standard",territories=TRUE) head(canada_map) ``` ``` ## long lat order hole piece pr_sgc_code group pr_english ## 1 2278204 1261783 1 FALSE 1 10 10.1 Newfoundland and Labrador ## 2 2294272 1251745 2 FALSE 1 10 10.1 Newfoundland and Labrador ## 3 2300435 1250534 3 FALSE 1 10 10.1 Newfoundland and Labrador ## 4 2306656 1243299 4 FALSE 1 10 10.1 Newfoundland and Labrador ## 5 2315548 1241630 5 FALSE 1 10 10.1 Newfoundland and Labrador ## 6 2313319 1237013 6 FALSE 1 10 10.1 Newfoundland and Labrador ## pr_french pr_alpha ## 1 Terre-Neuve-et-Labrador NL ## 2 Terre-Neuve-et-Labrador NL ## 3 Terre-Neuve-et-Labrador NL ## 4 Terre-Neuve-et-Labrador NL ## 5 Terre-Neuve-et-Labrador NL ## 6 Terre-Neuve-et-Labrador NL ``` --- ## Map of Canada ```r #Set theme options: theme_set(theme_grey() + theme(axis.text=element_blank(), axis.ticks=element_blank(), axis.title.x=element_blank(), axis.title.y=element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.border = element_blank(), panel.background = element_blank(), legend.position="none")) ggplot(canada_map, aes(long, lat, group = group)) + geom_polygon(color="black", fill="white") ``` --- ## Map of Canada <img src="11_maps_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" /> --- ## Map of the US ```r us_counties<-map_data("maps::county") ggplot(us_counties, aes(long, lat, group = group)) + geom_polygon(color="black", fill="white") us_states<-map_data("state") ggplot(us_counties, aes(long, lat, group = group)) + geom_polygon(color="grey", fill="white") + geom_path(data=us_states, aes(long, lat, group = group),color="black", size=1)+ coord_map("albers",lat0=39, lat1=45) ``` --- ## Map of the US <img src="11_maps_files/figure-html/unnamed-chunk-8-1.png" width="600px" style="display: block; margin: auto;" /> --- ## Map of the US <img src="11_maps_files/figure-html/unnamed-chunk-9-1.png" width="600px" style="display: block; margin: auto;" /> --- ## Your Turn 1. Check out the help file for the `map_data` function to find out how to make a map of the world. 2. Can you figure out how to remove Antarctica from the map? <img src="11_maps_files/figure-html/unnamed-chunk-10-1.png" width="400px" style="display: block; margin: auto;" /> --- ## Choropleth Maps - Choropleth maps are thematic maps: areas are shaded by the values of a variable - Must join map data with content data. - Example: a map of US counties that shows the number of Covid-19 cases. --- ## Join Data on Covid and Map ``` ## # A tibble: 6 × 16 ## FIPS state state_name county month clint…¹ trump…² tot_v…³ trump…⁴ medin…⁵ ## <dbl> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 1001 AL Alabama autauga … 4 5908 18110 24661 0.495 55317 ## 2 1001 AL Alabama autauga … 5 5908 18110 24661 0.495 55317 ## 3 1001 AL Alabama autauga … 6 5908 18110 24661 0.495 55317 ## 4 1001 AL Alabama autauga … 7 5908 18110 24661 0.495 55317 ## 5 1001 AL Alabama autauga … 8 5908 18110 24661 0.495 55317 ## 6 1001 AL Alabama autauga … 9 5908 18110 24661 0.495 55317 ## # … with 6 more variables: urb2010 <dbl>, population <dbl>, confirmed <dbl>, ## # new_confirmed <dbl>, deaths <dbl>, new_deaths <dbl>, and abbreviated ## # variable names ¹clinton2016, ²trump2016, ³tot_votes2016, ⁴trumpmarg, ## # ⁵medinc1317 ``` ``` ## long lat group order region subregion ## 1 -86.50517 32.34920 1 1 alabama autauga ## 2 -86.53382 32.35493 1 2 alabama autauga ## 3 -86.54527 32.36639 1 3 alabama autauga ## 4 -86.55673 32.37785 1 4 alabama autauga ## 5 -86.57966 32.38357 1 5 alabama autauga ## 6 -86.59111 32.37785 1 6 alabama autauga ``` --- ## Prepare the Datasets for Joining - Will be joining by *state name* and *county name* - Must make sure that these are formatted the same in both datasets, e.g. capitalization, the word "county" - Will introduce a new variable `state_name` that is all lower case - Can see mismatching cases using `anti_join` - Also, since a map can only show one temporal period, need to aggregate the `covid` data to the county level (get rid of the temporal dimension) --- ## Prepare the Datasets for Joining ```r library(stringr) data("covid") #from the classdata package covid_data <-covid %>% filter(month==10) %>% select(-new_confirmed, -new_deaths) %>% mutate(state_name=tolower(state_name), county=str_replace(county, "county",""), county=trimws(county)) nomatch1 <- covid_data %>% anti_join(us_counties, by=c("state_name"="region", "county"="subregion")) unique(cbind(nomatch1$state_name,nomatch1$county)) nomatch2 <- us_counties %>% anti_join(covid_data, by=c("region"="state_name", "subregion"="county")) unique(cbind(nomatch2$region,nomatch2$subregion)) ``` --- ## Join the Data and Make a Map ```r library(viridis) us_counties %>% left_join(covid_data, by=c("region"="state_name", "subregion"="county")) %>% ggplot(aes(long, lat, group = group, fill=log10(confirmed/population))) + geom_polygon(color="black") + scale_fill_viridis(na.value="white")+ coord_map("albers",lat0=39, lat1=45) ``` <img src="11_maps_files/figure-html/unnamed-chunk-14-1.png" width="50%" style="display: block; margin: auto;" /> --- ## Your Turn Correct as many "gray" counties as you can. Feel free to consult the [`stringr` cheatsheet](https://evoldyn.gitlab.io/evomics-2018/ref-sheets/R_strings.pdf) <img src="11_maps_files/figure-html/unnamed-chunk-15-1.png" style="display: block; margin: auto;" /> --- ## Canada Election Results ```r library(tidyverse) library(magrittr) data("federal_election_results") federal_election_results %>% as.data.frame() %>% dplyr::filter(election_year=="2015")->electdata canada_ridings<-mapcan(boundaries = "ridings", type="standard",territories=TRUE) head(canada_ridings) canada_ridings %>% left_join(electdata, by="riding_code") %>% ggplot(aes(long, lat, group = group, fill=factor(party)))+ geom_polygon(color="black") +scale_fill_discrete("Party", type="qual") + theme(legend.position="bottom") ``` --- ## Canada Election Results <img src="11_maps_files/figure-html/unnamed-chunk-17-1.png" style="display: block; margin: auto;" />