POL 304: Using Data to Understand Politics and Society

Intro to Causality
Olga Chyzh [www.olgachyzh.com]

Reading

Imai, Kosuke. Quantitative Social Science, Chapter 2 Sections 2.1, 2.3, and 2.4

Today's Class

How to estimate causal effects with social science data?
Two examples:
- breakfast cereal example
- health savings experiment

Very Healthy Cereal

Empirical Observation

This cereal sticks to a magnet.

Why Did it Stick?

Propose a theoretical model (that includes a causal mechanism) to explain the observation.
Suppose our causal mechanism is that this cereal sticks to a magnet because of its iron content.
A testable hypothesis: cereals that are high in iron stick to a magnet.

Test

Does It Stick?

This cereal does not stick to a magnet.

Causal Effect

Unit of analysis: cereal type
Treatment variable (causal variable of interest) $T$ : iron content (high/low)
Treatment group (treated units): Total cereal
Control group (untreated units): Honey Smacks cereal
Outcome variable (response variable) $Y$ : whether it sticks to a magnet (yes/no).

Causality

Counterfactual: Would Total cereal stick to a magnet if it did not have iron?
Two potential outcomes: $Y (1)$ and $Y (0)$
Causal effect: $Y (1) - Y (0)$
Fundamental problem of causal inference: only one of the two potential outcomes is observable.
Cannot calculate individual causal effect $Y (1) - Y (0)$ !
Importance of control group.
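A minimal sketch of the fundamental problem, using an invented table of four units (the names and values below are hypothetical, not from the cereal example). In real data only one of $Y (1)$ and $Y (0)$ is recorded per unit, so the individual effect can be computed here only because the table is made up.

```python
# Hypothetical potential outcomes for four units (all values invented).
# y1: outcome if treated, y0: outcome if untreated, t: treatment actually received.
units = [
    {"name": "A", "y1": 1, "y0": 0, "t": 1},
    {"name": "B", "y1": 1, "y0": 1, "t": 0},
    {"name": "C", "y1": 0, "y0": 0, "t": 1},
    {"name": "D", "y1": 1, "y0": 0, "t": 0},
]


def observed_outcome(unit):
    """Return the only potential outcome we get to see: Y(1) if treated, else Y(0)."""
    return unit["y1"] if unit["t"] == 1 else unit["y0"]


def individual_effect(unit):
    """True individual effect Y(1) - Y(0); computable only in this made-up table."""
    return unit["y1"] - unit["y0"]


for u in units:
    unobserved = "Y(0)" if u["t"] == 1 else "Y(1)"
    print(u["name"], "observed outcome:", observed_outcome(u), "| never observed:", unobserved)
```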

Can We Approximate Counterfactuals?

Association is not causation! (There could be confounders.)
A confounding variable is a pre-treatment variable that is associated with both the treatment and the outcome and may bias our estimate of the treatment effect.
Matching: Find a unit that is the same as those in the treatment group, except for the treatment.
Is Honey Smacks a good match for Total cereal?
Find another healthy cereal that has the same ingredients (other than iron), texture, etc.

Strategy 1: Matching (A Quasi-Experiment)

NJ increased the minimum wage. Does an increase in the minimum wage lead to higher unemployment?
- Find a state similar to NJ that did not increase its minimum wage.
- Match to rule out potential confounders: e.g., compare only Burger Kings in urban areas
Are Black people less likely to get job offers?
- Find a white person who has the exact same credentials as a Black person.
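One way to picture matching is to compare treated and control units only within groups that share the same value of a confounder. The sketch below assumes a single categorical confounder (urban vs. rural) and uses invented numbers. The function name `matched_difference` is hypothetical, and exact matching on one variable is only the simplest version of the idea.

```python
from statistics import mean

# Hypothetical units: (treated, stratum, outcome). All values are invented.
units = [
    (1, "urban", 20.0), (1, "urban", 22.0), (1, "rural", 15.0),
    (0, "urban", 19.0), (0, "urban", 21.0), (0, "rural", 16.0), (0, "rural", 14.0),
]


def matched_difference(units):
    """Average treated-minus-control difference within strata that contain
    both groups, weighted by the number of treated units in each stratum."""
    strata = sorted({stratum for _, stratum, _ in units})
    total, n_treated = 0.0, 0
    for s in strata:
        treated = [y for t, c, y in units if t == 1 and c == s]
        control = [y for t, c, y in units if t == 0 and c == s]
        if treated and control:  # strata without a match are dropped
            total += len(treated) * (mean(treated) - mean(control))
            n_treated += len(treated)
    return total / n_treated


print(round(matched_difference(units), 3))
```

Matching of this kind only removes bias from the variables that are matched on; unobserved confounders are untouched.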

Problem

Cannot match on everything
Unobserved confounders: variables associated with treatment and outcome
- Selection bias is confounding bias due to participant self-selection into the treatment/control groups.
- Can you think of examples?

Your Turn

Suppose we are interested in the causal effect of alcohol on depression
Propose a research design, such that:
- units in the treatment group engage in high alcohol consumption, while units in the control group do not
- the treatment and the control group do not differ in other ways that may be correlated with alcohol consumption and depression
- units did not self-select into treatment/control groups

Randomized Controlled Trials (RCTs)

Random assignment of participants (observations) by the researcher into the treatment/control groups.
- How would you do this in practice?
Key idea: Randomization of the treatment makes the treatment and control groups “identical” on average
The two groups are similar in terms of all (both observed and unobserved) characteristics
Can attribute the average differences in outcome to the difference in the treatment

Sample Average Treatment Effect: $\text{SATE} = \frac{1}{n} \sum_{i = 1}^{n} \{ Y_{i} (1) - Y_{i} (0) \}$

SATE is not directly observable, but it can be estimated by the difference in group means, $\bar{Y} (1) - \bar{Y} (0)$
Randomized experiments are the gold standard
Double-blind experiments: Placebo effects and Hawthorne effects
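A small simulation can illustrate why the difference in group means approximates SATE under random assignment. Everything here is an assumption made for illustration: the potential outcomes are drawn from a normal distribution, the true effect is fixed at 50 for every unit, and `simulate_experiment` is a hypothetical helper, not part of the reading.

```python
import random
from statistics import mean


def simulate_experiment(n=1000, true_effect=50.0, seed=0):
    """Return (SATE, difference-in-means estimate) for one simulated experiment."""
    rng = random.Random(seed)
    # Hypothetical potential outcomes: Y(0) varies by unit, Y(1) = Y(0) + true_effect.
    y0 = [rng.gauss(250.0, 40.0) for _ in range(n)]
    y1 = [y + true_effect for y in y0]
    # Random assignment: shuffle the unit indices and treat the first half.
    order = list(range(n))
    rng.shuffle(order)
    treated = set(order[: n // 2])
    # SATE needs both potential outcomes for every unit (possible only in a simulation).
    sate = mean(y1[i] - y0[i] for i in range(n))
    # The estimate uses only the outcome each unit would actually reveal.
    diff_in_means = (mean(y1[i] for i in treated)
                     - mean(y0[i] for i in range(n) if i not in treated))
    return sate, diff_in_means


sate, estimate = simulate_experiment()
print(f"SATE: {sate:.2f}, difference in means: {estimate:.2f}")
```

The estimate differs from SATE only by chance variation in who ends up in which group, and that variation shrinks as `n` grows.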

The Hawthorne effect is a change in behavior that occurs because participants know they are being studied.

Using Technology to Increase Savings

Question: How to encourage people to save for emergency healthcare?¹
Small investments in preventive health products (e.g., bed nets, water filters) can save many lives in developing countries
Hypothesis: simple saving technologies can increase investments
A randomized field experiment in Kenya
- control: encouraged to save, no device given
- T1: metal safe box with the key given to participants
- T2: same box but the key given to officers, participants must ask officers to open the box at a shop
Outcome: amount of savings for health products 6 and 12 months later

¹ Dupas, Pascaline and Jonathan Robinson. 2013. “Why Don’t the Poor Save More? Evidence from Health Savings Experiments.” American Economic Review, Vol. 103, No. 4, pp. 1138-1171.

Study Design

Randomized treatment
- At assignment: 111 control, 195 lockbox, 117 safe box
Outcome measured in follow-up surveys after 6 and 12 months
- At follow-up: 102 control, 184 lockbox, 107 safe box
Why have a control group rather than compare the treatment groups to the savings rates in the population?
Does the drop-out rate differ across the treatment conditions? What does this result suggest about the internal and external validity of this study?
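As a starting point for the drop-out question, the rates can be computed from the two sets of counts on this slide. The sketch assumes the second set of counts is the number of participants observed at follow-up; `dropout_rates` is a hypothetical helper name.

```python
# Sample sizes as reported on the slide.
assigned = {"control": 111, "lockbox": 195, "safe box": 117}
followed_up = {"control": 102, "lockbox": 184, "safe box": 107}


def dropout_rates(assigned, followed_up):
    """Share of each group that was assigned but not observed at follow-up."""
    return {g: (assigned[g] - followed_up[g]) / assigned[g] for g in assigned}


rates = dropout_rates(assigned, followed_up)
for group, rate in rates.items():
    print(f"{group}: {rate:.1%}")
```

Under that assumption the rates come out to roughly 8.1% (control), 5.6% (lockbox), and 8.5% (safe box).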

Internal and External Validity

Internal validity: the extent to which causal assumptions are satisfied in the study.
External validity: the extent to which the conclusions can be generalized beyond a particular study.

Calculate SATE

Control: $\bar{Y} (0) = 257.83$

Lockbox: ${\bar{Y}}^{T_{1}} (1) = 307.83$

Safe box: ${\bar{Y}}^{T_{2}} (1) = 408.22$

$\text{SATE}_{T_{1}} = 307.83 - 257.83 = 50.00$

$\text{SATE}_{T_{2}} = 408.22 - 257.83 = 150.39$

Examine the Balance

If randomization "took", the experimental groups should be roughly equal on the pre-treatment variables (e.g., gender, marital status, age).

Variable	control	lockbox	safebox
Gender	0.73	0.73	0.79
Age	41.87	39.58	38.54
Marital Status	0.75	0.76	0.73
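A quick way to read the balance numbers is to look at how far apart the group means are for each pre-treatment variable. The sketch below uses only the means reported on this slide; `spread_across_groups` is a hypothetical helper, and whether a given gap is large enough to matter is a judgment call the code does not settle.

```python
# Group means of pre-treatment variables, as reported on the slide.
balance = {
    "gender": {"control": 0.73, "lockbox": 0.73, "safebox": 0.79},
    "age": {"control": 41.87, "lockbox": 39.58, "safebox": 38.54},
    "marital status": {"control": 0.75, "lockbox": 0.76, "safebox": 0.73},
}


def spread_across_groups(means):
    """Largest minus smallest group mean for one variable, rounded to 2 decimals."""
    return round(max(means.values()) - min(means.values()), 2)


for variable, means in balance.items():
    print(variable, spread_across_groups(means))
```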

Subset Analysis

If we think that the groups are unbalanced despite randomization, we can compare means within subsets.

Subset	control	lockbox	safebox
Married Women Only	239.66	332.43	557.14
Unmarried Women Only	218.54	220.47	264.04
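The subset comparison follows the same difference-in-means logic as the full-sample SATE estimate, applied within each subset. The sketch below takes the subset means from this slide as given; `effects_vs_control` is a hypothetical helper name.

```python
# Subset means as reported on the slide.
subset_means = {
    "married women": {"control": 239.66, "lockbox": 332.43, "safebox": 557.14},
    "unmarried women": {"control": 218.54, "lockbox": 220.47, "safebox": 264.04},
}


def effects_vs_control(means):
    """Difference between each treatment-group mean and the control mean."""
    return {g: round(m - means["control"], 2) for g, m in means.items() if g != "control"}


for subset, means in subset_means.items():
    print(subset, effects_vs_control(means))
```

The differences are 92.77 and 317.48 for married women, against 1.93 and 45.50 for unmarried women.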

What You Learned or Already Knew

How to generate a research question (an inductive approach)
How to propose a theoretical model (causal mechanism) and derive a testable hypothesis
Concepts: unit of analysis, treatment/control variable, treatment/control group, outcome variable, counterfactual, potential outcomes, causal effect, confounders, matching, selection bias, random assignment, SATE, randomized experiment, placebo effect, Hawthorne effect, attrition/drop-out rate, internal and external validity, balance, subset analysis.

Review Questions

What was the goal of the cereal example? What did it demonstrate?
How does randomization account for confounders? What is the exact mechanism?
What is the difference between a natural experiment (quasi-experiment) and a randomized controlled trial?
What is a selection effect, and how do randomized controlled trials rule it out?

Before Next Class

Install R and RStudio
Complete assigned readings
Next Class: causality, experimental design, a two-sample t-test
