class: center, middle, inverse, title-slide # PPOL 502-07: Reg. Methods for Policy Analysis ## (Optional) Midterm Review ### Alexander Podkul, PhD ### Spring 2022 --- name: contents-slide ## Midterm Review __Regression basics__ - [Fitted values and residuals](#fitted-values-slide) - [Goodness of fit metrics](#gof-slide) - [Standard errors](#se-slide) - [Interpretation](#int1-slide) __Inferential statistics__ - [Degrees of freedom](#df-slide) - [Testing alternative hypotheses](#alt-slide) - [F-tests](#f-slide) __Regression not-basics__ - [Omitted Variable Bias](#ovb-slide) - [Interactions](#interact-slide) - [Quadratics](#quadratic-slide) --- name: fitted-values-slide ## Regression basics ### [Fitted values and residuals](#contents-slide) Let's say we want to _regress_ our variable `\(y\)` on some other variable `\(x\)` - Note the language: when we regress `\(y\)` on `\(x\)`, `\(y\)` is our dependent variable and `\(x\)` is our independent variable (in Stata-speak: `reg y x`) because we are projecting `\(y\)` onto `\(x\)`. -- If we aim to estimate the model: `$$y_i = \beta_0 + \beta_1x_i + u_i$$` we use the data we _do_ have (`\(y\)` and `\(x\)`) in order to estimate `\(\beta_0\)` and `\(\beta_1\)` (and by extension, `\(u_i\)`) -- - Fitted value -- often `\(\hat{y_i}\)` -- refers to the prediction for `\(y_i\)` when `\(x = x_i\)`, where `\(\hat{y_i} = \hat{\beta_0} + \hat{\beta_1}x_i\)` - Residual -- often `\(\hat{u_i}\)` or `\(y_i - \hat{y_i}\)` -- refers to the difference between the actual value in our data and our fitted value. 
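The definitions above can be sketched numerically. This is a minimal illustration, not code from the course; the data values are made up:

```python
# Hypothetical data for illustration (not from the slides)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS estimates for the model y_i = b0 + b1*x_i + u_i
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
     sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * xi for xi in x]               # fitted values
u_hat = [yi - yh for yi, yh in zip(y, y_hat)]    # residuals: actual minus fitted

# A property of OLS: the residuals sum to zero (up to floating point)
print(abs(round(sum(u_hat), 10)))  # 0.0
```

Note the sign convention: the residual is the actual value minus the fitted value, so points above the regression line have positive residuals.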
--- ### [Fitted values and residuals](#contents-slide) <img src="Midterm_Review_Slides_files/figure-html/unnamed-chunk-1-1.png" style="display: block; margin: auto;" /> --- ### [Fitted values and residuals](#contents-slide) <img src="Midterm_Review_Slides_files/figure-html/unnamed-chunk-2-1.png" style="display: block; margin: auto;" /> --- ### [Fitted values and residuals](#contents-slide) <img src="Midterm_Review_Slides_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" /> --- ### [Fitted values and residuals](#contents-slide) <img src="Midterm_Review_Slides_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" /> --- name: gof-slide ### [Goodness of fit](#contents-slide) __Goodness of fit__ metrics generally refer to an overview of the full regression. That is, we're not looking at how `\(x_j\)` might affect `\(y\)` but rather at how all of the independent variables together affect `\(y\)`. -- We've spoken about a few metrics: - `\(R^2\)` - the "coefficient of determination" measures the fraction of the sample variation in `\(y\)` explained by `\(X\)` (a proportion between 0 and 1, where a value closer to 1 shows better fit) - `\(Adj. R^2\)` - a version of `\(R^2\)` that is "corrected" to penalize additional regressors in multiple regression models - _Standard error of the regression_ - also referred to as the root mean squared error, estimates the standard deviation of the unobservables affecting `\(y\)` (measured in the units of `\(y\)`, where a value closer to 0 shows a better fit) --- ### Goodness of fit <img src="Midterm_Review_Slides_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" /> --- #### R-squared `$$R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}$$` where: - `\(SST\)` - total sum of squares - `\(\sum(y_i - \bar{y})^2\)` - `\(SSE\)` - explained sum of squares - `\(\sum(\hat{y_i} - \bar{y})^2\)` - `\(SSR\)` - residual sum of squares - `\(\sum\hat{u_i}^2\)` -- ... Units? --- #### Adj. 
R-squared `$$\bar{R^2} = 1 - \frac{\frac{SSR}{n - k - 1}}{\frac{SST}{n - 1}}$$` `$$\bar{R^2} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$$` -- where: - `\(R^2\)` - the un-adjusted value of `\(R^2\)` - `\(n\)` - our sample size - `\(k\)` - the number of independent variables in our regression equation --- #### SER In the bivariate case: `$$\hat{\sigma} = \sqrt{\frac{1}{n-2}\sum\hat{u_i}^2}$$` -- In the multiple variables case: `$$\hat{\sigma} = \sqrt{\frac{1}{n-k-1}\sum\hat{u_i}^2}$$` -- where: - `\(\hat{\sigma}\)` - the standard error of the regression - `\(\hat{u_i}\)` - the estimated residual - `\(n\)` - the sample size - `\(k\)` - the number of slope parameters we estimate in the model --- name: se-slide ### [Standard Errors](#contents-slide) So we've calculated the standard error as: `$$se(\hat{\beta_j}) = \frac{\hat{\sigma}}{\sqrt{n} \cdot sd(x_j) \cdot \sqrt{1 - R^2_j}}$$` -- where: - `\(\hat{\sigma}\)` - the standard error of the regression (SER) - `\(n\)` - the sample size - `\(sd(x_j)\)` - the standard deviation of `\(x_j\)` - `\(R^2_j\)` - the `\(R^2\)` from regressing `\(x_j\)` on all other independent variables -- So what is this telling us? "An estimate of the standard deviation in the sampling distribution of `\(\hat{\beta_j}\)`" (recall the end of last week's class) --- name: int1-slide ### [Interpretation](#contents-slide) When interpreting regression coefficients, we want to remember the basic formula that: <span style="color:red">for every one-unit change in `\(x_j\)`, </span><span style="color:blue">the dependent variable will increase by `\(\beta_j\)` </span><span style="color:green">holding `\(x_{-j}\)` constant</span> -- <span style="color:red">for every one-unit change in `\(x_j\)`, </span> - this part of the expression will always be the same, but it's important to note that what we mean by "one unit" may change, especially if we have transformed our variable `\(x_j\)` at all - e.g. 
in a level-log regression, where `\(x_j\)` is transformed into `\(log(x_j)\)`, a full one-unit change in `\(log(x_j)\)` corresponds roughly to a 100% increase in `\(x\)`, so we typically interpret a one percent increase in `\(x\)` as a `\(\beta_j/100\)` change in `\(y\)` - e.g. in a standardized regression, where `\(x_j\)` is transformed into `\(\frac{x_{i,j} - \bar{x_j}}{\sigma_{x_{j}}}\)`, "one unit" is "one standard deviation" -- <span style="color:blue">the dependent variable will increase by `\(\beta_j\)` </span> - note that `\(\beta_j\)` is _in the units of the dependent variable_ - need to identify if the regression is standardized or logged at all -- <span style="color:green">holding `\(x_{-j}\)` constant</span> - while not necessary to include in the bivariate case (since we're not holding anything constant), in multiple linear regression this is necessary for us to properly contextualize the expected effect - we can also say things like _ceteris paribus_ (with other conditions remaining the same) --- name: df-slide ## Inferential statistics ### [Degrees of freedom](#contents-slide) In our use case, __degrees of freedom__ are used for calculating goodness of fit metrics as well as for identifying the proper probability distribution in hypothesis testing. 
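Both uses of degrees of freedom can be sketched with plain arithmetic. The sample size and parameter counts below are hypothetical, chosen only to make the bookkeeping concrete:

```python
n = 100        # hypothetical sample size
k_ur = 5       # slope parameters in an unrestricted model (hypothetical)
k_r = 3        # slope parameters in a restricted model (hypothetical)

# t-tests on a single beta_j use df = n - k - 1
df_t = n - k_ur - 1

# F-tests use F(q, n - k - 1), where q is the number of restrictions:
# the restricted model has more degrees of freedom, so the difference is positive
q = (n - k_r - 1) - (n - k_ur - 1)   # = k_ur - k_r
df_denom = n - k_ur - 1

print(df_t, q, df_denom)  # 94 2 94
```

The same `n - k - 1` quantity also appears in the adjusted `\(R^2\)` and SER formulas, which is why dropping a variable changes those metrics even when `\(R^2\)` barely moves.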
When identifying a t-distribution (think: hypothesis tests related to `\(\beta_j\)`) - `\(df = n - k - 1\)` where `\(n\)` is our number of observations and `\(k\)` is the number of parameters we estimate (see Wooldridge 3.57) -- When identifying an F-distribution (think: multiple linear restrictions), we specify `\(F(q, n-k-1)\)` - `\(q\)` - the difference in degrees of freedom between the restricted and unrestricted models - `\(n - k - 1\)` - observations minus parameters minus 1 (same as above) `$$q = (n - k_{r} - 1) - (n - k_{ur} - 1) = k_{ur} - k_{r}$$` --- name: alt-slide ### [Testing alternative hypotheses](#contents-slide) Q: Why might we want to test a hypothesis such that `$$H_0: \beta_j = \alpha_j$$` -- and calculate a test statistic where: `$$t = \frac{\hat{\beta_j} - \alpha_j}{se(\hat{\beta_j})}$$` A: We may have a substantively interesting value that we might expect `\(\beta_j\)` to be. For example, in the shoe example (left and right shoe sizes), if we were to regress LSS on RSS, we would likely expect `\(\beta_j\)` to be 1, not 0. --- name: f-slide ### [F-Tests](#contents-slide) Why do we have two F-test equations? `$$\frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n - k - 1)}$$` `$$\frac{(R^2_{ur} - R^2_{r})/q}{(1 - R^2_{ur})/(n - k - 1)}$$` -- This is because we can make the following substitutions: - `\(SSR_r = SST(1 - R^2_r)\)` - `\(SSR_{ur} = SST(1 - R^2_{ur})\)` --- name: ovb-slide ## Regression not-basics ### [Omitted Variable Bias](#contents-slide) __Omitted variable bias__ describes how our coefficients will be systematically _biased_, that is `\(E(\hat{\beta_j}) \neq \beta_j\)`, due to an improper model specification. OVB occurs when: 1. some variable `\(z\)` not included in the model correlates with our dependent variable `\(y\)` 2. 
the same variable `\(z\)` correlates with at least one independent variable `\(x\)` in the model -- and can be expressed as `$$\tilde{\beta_1} = \hat{\beta_1} + \delta_1\hat{\beta_2}$$` -- where: - __Full regression__ - `\(y = \hat{\beta_0} + \hat{\beta_1}x_1 + \hat{\beta_2}x_2 + \hat{u}\)` - __Partial regression__ - `\(y = \tilde{\beta_0} + \tilde{\beta_1}x_1 + \tilde{u}\)` - __Auxiliary regression__ - `\(x_2 = \delta_0 + \delta_1x_1 + e\)` --- ### Omitted Variable Bias <table class="texreg" style="margin: 10px auto;border-collapse: collapse;border-spacing: 0px;caption-side: bottom;color: #000000;border-top: 2px solid #000000;"> <caption>Statistical models</caption> <thead> <tr> <th style="padding-left: 5px;padding-right: 5px;"> </th> <th style="padding-left: 5px;padding-right: 5px;">Admit Chance</th> <th style="padding-left: 5px;padding-right: 5px;">Admit Chance</th> <th style="padding-left: 5px;padding-right: 5px;">GRE Score</th> </tr> </thead> <tbody> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Intercept</td> <td style="padding-left: 5px;padding-right: 5px;">-2.156<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.638<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">309.492<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.139)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.009)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.695)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Research</td> <td style="padding-left: 5px;padding-right: 5px;">0.038<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.158<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">13.362<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.010)</td> <td 
style="padding-left: 5px;padding-right: 5px;">(0.012)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.940)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">GRE Score</td> <td style="padding-left: 5px;padding-right: 5px;">0.009<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.000)</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> </tr> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">R<sup>2</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.656</td> <td style="padding-left: 5px;padding-right: 5px;">0.306</td> <td style="padding-left: 5px;padding-right: 5px;">0.337</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Adj. R<sup>2</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.654</td> <td style="padding-left: 5px;padding-right: 5px;">0.304</td> <td style="padding-left: 5px;padding-right: 5px;">0.335</td> </tr> <tr style="border-bottom: 2px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Num. 
obs.</td> <td style="padding-left: 5px;padding-right: 5px;">400</td> <td style="padding-left: 5px;padding-right: 5px;">400</td> <td style="padding-left: 5px;padding-right: 5px;">400</td> </tr> </tbody> <tfoot> <tr> <td style="font-size: 0.8em;" colspan="4"><sup>***</sup>p < 0.001; <sup>**</sup>p < 0.01; <sup>*</sup>p < 0.05</td> </tr> </tfoot> </table> -- `$$\tilde{\beta_1} = \hat{\beta_1} + \delta_1\hat{\beta_2}$$` -- `$$0.158 = 0.038 + 13.362(0.009)$$` --- name: interact-slide ### [Interactions](#contents-slide) __Interaction models__ allow us to test _conditional hypotheses_ where the effect of `\(x_1\)` on `\(y\)` might change depending on the value of `\(x_2\)` (note: we're not holding `\(x_2\)` constant, we're saying the _value_ of `\(x_2\)` matters) For example, we can estimate the model: `$$y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_1x_2$$` -- In a toy example where `\(x_1\)` is continuous and `\(x_2\)` is dichotomous, we can begin to explore how the relationship between `\(x_1\)` and `\(y\)` changes depending on the value of `\(x_2\)`: `$$\hat{y} = 1 + 2.3x_1 + 5x_2 - 10x_1x_2$$` -- when `\(x_2 = 0\)`: - `\(\hat{y} = 1 + 2.3x_1 + 5(0) - 10x_1(0)\)` - `\(\hat{y} = 1 + 2.3x_1\)` -- when `\(x_2 = 1\)`: - `\(\hat{y} = 1 + 2.3x_1 + 5(1) - 10x_1(1)\)` - `\(\hat{y} = 6 - 7.7x_1\)` --- ### Interactions Let's say we have a real example. How might we _interpret_ it? 
<table class="texreg" style="margin: 10px auto;border-collapse: collapse;border-spacing: 0px;caption-side: bottom;color: #000000;border-top: 2px solid #000000;"> <caption>Statistical models</caption> <thead> <tr> <th style="padding-left: 5px;padding-right: 5px;"> </th> <th style="padding-left: 5px;padding-right: 5px;">Admit Chance</th> <th style="padding-left: 5px;padding-right: 5px;">Admit Chance</th> </tr> </thead> <tbody> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Intercept</td> <td style="padding-left: 5px;padding-right: 5px;">-2.156<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">-1.686<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.139)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.219)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">GRE Score</td> <td style="padding-left: 5px;padding-right: 5px;">0.903<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.751<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.045)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.071)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Research</td> <td style="padding-left: 5px;padding-right: 5px;">0.038<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">-0.755<sup>**</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.010)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.287)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">GRE Score * Research</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">0.252<sup>**</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 
5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.091)</td> </tr> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">R<sup>2</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.656</td> <td style="padding-left: 5px;padding-right: 5px;">0.662</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Adj. R<sup>2</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.654</td> <td style="padding-left: 5px;padding-right: 5px;">0.660</td> </tr> <tr style="border-bottom: 2px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Num. obs.</td> <td style="padding-left: 5px;padding-right: 5px;">400</td> <td style="padding-left: 5px;padding-right: 5px;">400</td> </tr> </tbody> <tfoot> <tr> <td style="font-size: 0.8em;" colspan="3"><sup>***</sup>p < 0.001; <sup>**</sup>p < 0.01; <sup>*</sup>p < 0.05</td> </tr> </tfoot> </table> --- ### Interactions <img src="Midterm_Review_Slides_files/figure-html/unnamed-chunk-8-1.png" style="display: block; margin: auto;" /> --- ### Interactions <img src="Midterm_Review_Slides_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" /> --- ### Interactions <table class="texreg" style="margin: 10px auto;border-collapse: collapse;border-spacing: 0px;caption-side: bottom;color: #000000;border-top: 2px solid #000000;"> <caption>Statistical models</caption> <thead> <tr> <th style="padding-left: 5px;padding-right: 5px;"> </th> <th style="padding-left: 5px;padding-right: 5px;">Admit Chance</th> <th style="padding-left: 5px;padding-right: 5px;">Admit Chance</th> </tr> </thead> <tbody> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Intercept</td> <td style="padding-left: 5px;padding-right: 5px;">-2.156<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">-1.686<sup>***</sup></td> </tr> <tr> <td 
style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.139)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.219)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">GRE Score</td> <td style="padding-left: 5px;padding-right: 5px;">0.903<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.751<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.045)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.071)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Research</td> <td style="padding-left: 5px;padding-right: 5px;">0.038<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">-0.755<sup>**</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.010)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.287)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">GRE Score * Research</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">0.252<sup>**</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.091)</td> </tr> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">R<sup>2</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.656</td> <td style="padding-left: 5px;padding-right: 5px;">0.662</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Adj. R<sup>2</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.654</td> <td style="padding-left: 5px;padding-right: 5px;">0.660</td> </tr> <tr style="border-bottom: 2px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Num. 
obs.</td> <td style="padding-left: 5px;padding-right: 5px;">400</td> <td style="padding-left: 5px;padding-right: 5px;">400</td> </tr> </tbody> <tfoot> <tr> <td style="font-size: 0.8em;" colspan="3"><sup>***</sup>p < 0.001; <sup>**</sup>p < 0.01; <sup>*</sup>p < 0.05</td> </tr> </tfoot> </table> -- __What's the marginal effect of `\(GRE\)` on `\(Chance of Admit\)`?__ `$$\frac{\partial}{\partial GRE}\left(\beta_0 + \beta_1 GRE + \beta_2 Research + \beta_3 GRE \times Research\right)$$` `$$\beta_1 + \beta_3 Research = 0.751 + 0.252 Research$$` --- ### Interactions <table class="texreg" style="margin: 10px auto;border-collapse: collapse;border-spacing: 0px;caption-side: bottom;color: #000000;border-top: 2px solid #000000;"> <caption>Statistical models</caption> <thead> <tr> <th style="padding-left: 5px;padding-right: 5px;"> </th> <th style="padding-left: 5px;padding-right: 5px;">Admit Chance</th> <th style="padding-left: 5px;padding-right: 5px;">Admit Chance</th> </tr> </thead> <tbody> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Intercept</td> <td style="padding-left: 5px;padding-right: 5px;">-2.156<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">-1.686<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.139)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.219)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">GRE Score</td> <td style="padding-left: 5px;padding-right: 5px;">0.903<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.751<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.045)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.071)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Research</td> <td style="padding-left: 5px;padding-right: 5px;">0.038<sup>***</sup></td> 
<td style="padding-left: 5px;padding-right: 5px;">-0.755<sup>**</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.010)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.287)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">GRE Score * Research</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">0.252<sup>**</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.091)</td> </tr> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">R<sup>2</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.656</td> <td style="padding-left: 5px;padding-right: 5px;">0.662</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Adj. R<sup>2</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.654</td> <td style="padding-left: 5px;padding-right: 5px;">0.660</td> </tr> <tr style="border-bottom: 2px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Num. obs.</td> <td style="padding-left: 5px;padding-right: 5px;">400</td> <td style="padding-left: 5px;padding-right: 5px;">400</td> </tr> </tbody> <tfoot> <tr> <td style="font-size: 0.8em;" colspan="3"><sup>***</sup>p < 0.001; <sup>**</sup>p < 0.01; <sup>*</sup>p < 0.05</td> </tr> </tfoot> </table> __How do we interpret these results?__ - Because the interaction term is statistically significant (i.e. statistically distinguishable from 0), we can conclude that there is a moderating effect in our data. - However, we often compute the average marginal effect (that is, the average effect across the range of some variable within our data). 
- This is why we may want to mean-center our variables: the coefficient on the un-interacted term then reports the marginal effect at the sample mean, which in a linear interaction model equals the average marginal effect --- name: quadratic-slide ### [Quadratics](#contents-slide) __Quadratic__ regressions are a special case of interaction models where the effect of `\(x\)` on `\(y\)` changes with the level of `\(x\)` `$$y = \beta_0 + \beta_1x_1 + \beta_2x_1x_1$$` `$$y = \beta_0 + \beta_1x_1 + \beta_2x_1^2$$` -- which allows us to explore relationships such as: <img src="Midterm_Review_Slides_files/figure-html/unnamed-chunk-12-1.png" style="display: block; margin: auto;" /> --- ### Quadratics __What's the marginal effect of `\(x\)` on `\(y\)`?__ `$$\frac{\partial}{\partial x}\left(\beta_0 + \beta_1x + \beta_2x^2\right)$$` `$$\beta_1 + 2\beta_2x$$` -- __How do we interpret these results?__ - Look at the statistical significance of the quadratic term to assess whether there is a quadratic relationship - Similar to interaction effects, we can also explore the average marginal effect
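The quadratic marginal effect `\(\beta_1 + 2\beta_2x\)` can be sketched in a few lines of code. The coefficient values here are hypothetical, chosen only to show the effect changing sign across the range of `\(x\)`:

```python
# Marginal effect of x in a quadratic model y = b0 + b1*x + b2*x^2
# (hypothetical coefficients for illustration)
b1, b2 = 4.0, -0.5

def marginal_effect(x):
    # dy/dx = b1 + 2*b2*x
    return b1 + 2 * b2 * x

# The effect of x on y depends on the level of x:
print(marginal_effect(0))   # 4.0
print(marginal_effect(4))   # 0.0  (the turning point, x = -b1 / (2*b2))
print(marginal_effect(8))   # -4.0
```

With a negative `\(\beta_2\)`, the effect is positive at low `\(x\)`, zero at the turning point `\(x = -\beta_1/2\beta_2\)`, and negative beyond it, which is exactly the inverted-U pattern a quadratic term lets us capture.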