class: center, middle, inverse, title-slide

# PPOL 502-07: Reg. Methods for Policy Analysis
## Week 10: Binary Dependent Variables (2 of 2)
### Alexander Podkul, PhD
### Spring 2022

---

## Today's Class Outline

* Course Schedule
* Reviewing Last Week
* Expanding to Probit
* More on Interpretation
* Maximum Likelihood Estimation
* Hypothesis Testing
* Goodness of Fit Metrics
* An Extended Example
* Where We're Going Next
* __Break__
* A Word on Data Project Complications
* Working in Stata

---

## Course Schedule

__Tonight (3/30)__

* Problem set #5 due

__Next Week (4/6)__

* Data Project: check-in #2 due (on Canvas)

---

## Other Course Notes

__Course survey__

* Thank you to those who took the time to fill it out!
* I'll especially take into consideration the comments about the midterm (looking ahead to the final exam) and keep you updated on any expected changes

--

__Applied Research For Policy Analysis__

* Covering applications of statistical concepts
* I'll post suggested material on Canvas (starting tonight after class)
* Will cover a few works as well as other final exam review material

---

## Reviewing Last Week: LPM

Last week, we covered a number of topics for exploring _binary dependent variables_, which can be useful for detecting the presence or absence of a particular attribute.

--

First, we spoke about the __linear probability model__, which is a simple adaptation of the standard multiple linear regression model where `\(y\)` happens to be a binary measure. Although interpretation of this model is simple (since `\(\Delta P(y=1|x) = \beta_j\Delta x_j\)`), there are two shortcomings:

--

* potentially nonsensical predictions/fitted values, where `\(\hat{y_i} <0\)` or `\(\hat{y_i} >1\)`
* by definition (since `\(y_i\)` only takes on values of 0 or 1), the model will have significant heteroskedasticity

---

## Reviewing Last Week: LVM

We then pivoted to the __latent variable model__, which models an unobserved (latent) variable representing a continuous metric for determining when `\(Y = 1\)` and `\(Y=0\)`.

--

After some manipulation, this left us with the workhorse model:

`$$P(y = 1|x) = G(\beta_0 + \beta_1x_1 + \beta_2x_2 + ... + \beta_kx_k)$$`

--

where `\(G(z)\)` is our _link function_

---

## Reviewing Last Week: Logit

We also discussed the __logit model__, which uses the following _link function_:

`$$G(z) = \frac{e^z}{1 + e^z} = \Lambda(z)$$`

--

this logit function, which can be expressed via a number of different equations, has the useful feature of being bounded between 0 and 1, as shown below:

<img src="Week10_files/figure-html/unnamed-chunk-1-1.png" style="display: block; margin: auto;" />

... where errors are distributed according to a logistic distribution

---

## Reviewing Last Week: Logit (Cont.)

Given this non-linear relationship (in terms of odds and probability), we also discussed the following notes:

1. Not estimated via OLS (more on this tonight)
2. Shifting from t-tests to z-tests (more on this tonight)
3. Goodness of fit metrics (more on this tonight)
4. The effect of `\(x_j\)` is going to depend on the value of `\(x_j\)`
5. The effect of `\(x_j\)` is going to depend on the value of the other independent variables
--

which is (unfortunately) going to affect:

- how we conduct hypothesis tests

--

- how we consider how our model is estimated

--

- how we consider how _well_ our model is estimated

--

- how we _interpret_ our model (possible tools: PEA, AME, and observed-value, discrete differences)

---

## Introducing Probit

__Probit__ refers to another category of binary dependent variable models with a different link function for `\(G(z)\)`. In these models, `\(G(z)\)` is:

`$$G(z) = \Phi(z) = \int_{-\infty}^{z} \phi(v) dv$$`

where `\(\phi\)` is the standard normal density:

--

`$$\phi(z) = (2\pi)^{-1/2}e^{-z^2/2}$$`

and the error is distributed according to a standard normal distribution.

---

## Introducing Probit

<img src="Week10_files/figure-html/unnamed-chunk-2-1.png" style="display: block; margin: auto;" />

---

## Introducing Probit

<img src="Week10_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" />

---

## Introducing Probit

<img src="Week10_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" />

---

## Introducing Probit

<img src="Week10_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" />

---

## Introducing Probit

<img src="Week10_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" />

---

## Introducing Probit

<img src="Week10_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" />

---

## Introducing Probit

Probit models are set up and interpreted in a similar fashion to the logit model. For example, consider the following set-up:

`$$P(Y_i = 1) = \Phi(\beta_0 + \beta_1 x_1 + \beta_2 x_2)$$`

where `\(\Phi(z)\)`, representing the standard normal cumulative distribution function, is standing in for the link function `\(G(z)\)`.
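In R, for instance, `\(\Phi(z)\)` is simply `pnorm(z)`, so a probit predicted probability takes one function call once the linear index is computed. A minimal sketch (the coefficient and covariate values just mirror the worked example that follows; they are not estimates from real data):

```r
# Hypothetical probit coefficients, matching the worked example below
b0 <- 0.05; b1 <- 1; b2 <- -2

# Predicted probability at x1 = 4, x2 = 1.5
z <- b0 + b1 * 4 + b2 * 1.5   # linear index = 1.05
pnorm(z)                      # standard normal CDF; roughly 0.85
```

(`plogis()` plays the same role for the logit link.)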
-- For example, imagine the following estimated equation `$$P(Y_i = 1) = \Phi(0.05 + 1x_1 + -2 x_2)$$` -- If we want to assess the predicted probability that `\(Y_i = 1\)` when `\(x_1 = 4\)` and `\(x_2 = 1.5\)`, we can solve: `$$P(Y_i = 1) = \Phi(0.05 + 1(4) + -2(1.5))$$` -- `$$P(Y_i = 1) = \Phi(1.05)$$` -- `$$P(Y_i = 1) = .85$$` --- ## Introducing Probit: Example <table class="texreg" style="margin: 10px auto;border-collapse: collapse;border-spacing: 0px;caption-side: bottom;color: #000000;border-top: 2px solid #000000;"> <caption>Statistical models</caption> <thead> <tr> <th style="padding-left: 5px;padding-right: 5px;"> </th> <th style="padding-left: 5px;padding-right: 5px;">LPM</th> <th style="padding-left: 5px;padding-right: 5px;">Logit</th> <th style="padding-left: 5px;padding-right: 5px;">Probit</th> </tr> </thead> <tbody> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Intercept</td> <td style="padding-left: 5px;padding-right: 5px;">0.54</td> <td style="padding-left: 5px;padding-right: 5px;">0.21</td> <td style="padding-left: 5px;padding-right: 5px;">0.40</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.34)</td> <td style="padding-left: 5px;padding-right: 5px;">(1.87)</td> <td style="padding-left: 5px;padding-right: 5px;">(1.10)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Union Density</td> <td style="padding-left: 5px;padding-right: 5px;">0.04<sup>**</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.27<sup>**</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.15<sup>**</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.01)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.10)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.05)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">South</td> <td style="padding-left: 5px;padding-right: 5px;">-0.09</td> <td style="padding-left: 5px;padding-right: 5px;">-0.27</td> <td style="padding-left: 5px;padding-right: 5px;">-0.17</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.16)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.82)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.49)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Unemployment</td> <td style="padding-left: 5px;padding-right: 5px;">-0.08</td> <td style="padding-left: 5px;padding-right: 5px;">-0.53</td> <td style="padding-left: 5px;padding-right: 5px;">-0.35</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.06)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.38)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.22)</td> </tr> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">R<sup>2</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.28</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Adj. 
R<sup>2</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.23</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Num. obs.</td> <td style="padding-left: 5px;padding-right: 5px;">50</td> <td style="padding-left: 5px;padding-right: 5px;">50</td> <td style="padding-left: 5px;padding-right: 5px;">50</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">AIC</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">59.90</td> <td style="padding-left: 5px;padding-right: 5px;">60.14</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">BIC</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">67.55</td> <td style="padding-left: 5px;padding-right: 5px;">67.79</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Log Likelihood</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">-25.95</td> <td style="padding-left: 5px;padding-right: 5px;">-26.07</td> </tr> <tr style="border-bottom: 2px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Deviance</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">51.90</td> <td style="padding-left: 5px;padding-right: 5px;">52.14</td> </tr> </tbody> <tfoot> <tr> <td style="font-size: 0.8em;" colspan="4"><sup>***</sup>p < 0.001; <sup>**</sup>p < 0.01; <sup>*</sup>p < 0.05</td> </tr> </tfoot> </table> --- ## Introducing Probit: Example <table class=" lightable-paper" style='font-family: "Arial Narrow", arial, helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:right;"> Union Density </th> <th style="text-align:left;"> South </th> <th style="text-align:right;"> Unemployment </th> <th style="text-align:right;"> Logit Pred. </th> <th style="text-align:right;"> Probit Pred. </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 10.45 </td> <td style="text-align:left;"> Nonsouth </td> <td style="text-align:right;"> 5.25 </td> <td style="text-align:right;"> 0.5593598 </td> <td style="text-align:right;"> 0.5564192 </td> </tr> <tr> <td style="text-align:right;"> 10.45 </td> <td style="text-align:left;"> South </td> <td style="text-align:right;"> 5.25 </td> <td style="text-align:right;"> 0.4918881 </td> <td style="text-align:right;"> 0.4897414 </td> </tr> <tr> <td style="text-align:right;"> 25.00 </td> <td style="text-align:left;"> South </td> <td style="text-align:right;"> 4.60 </td> <td style="text-align:right;"> 0.9852026 </td> <td style="text-align:right;"> 0.9923343 </td> </tr> </tbody> </table> --- ## Introducing Probit: Example <img src="Week10_files/figure-html/unnamed-chunk-10-1.png" style="display: block; margin: auto;" /> --- ## Introducing Probit: Example <img src="Week10_files/figure-html/unnamed-chunk-11-1.png" style="display: block; margin: auto;" /> --- ## Reviewing Interpretation The bad news?: Probit coefficients suffer the same problems as logit models in that they are tricky to interpret. -- The good news? We can use the same tools that we discussed last week. 
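Each of these tools starts from predicted probabilities, which `glm()` fits return directly in R. A self-contained sketch on simulated data (the variable and object names are made up; this is not the state-level data from the tables above):

```r
set.seed(1)
dat <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
dat$y <- rbinom(200, 1, plogis(-0.5 + dat$x1 + 0.5 * dat$x2))  # simulated binary outcome

logit_mod  <- glm(y ~ x1 + x2, family = binomial(link = "logit"),  data = dat)
probit_mod <- glm(y ~ x1 + x2, family = binomial(link = "probit"), data = dat)

# Predicted probabilities on the probability (response) scale
head(predict(logit_mod,  type = "response"))
head(predict(probit_mod, type = "response"))
```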
--

To review:

* partial effect at the average
* average marginal effect
* observed-value, discrete differences

---

## Reviewing Interpretation: Probit Example

We can explore observed-value, discrete differences in the probit set-up.

--

Let's follow the same model as last week for assessing the effect of `\(South\)` by calculating the difference between our predicted probabilities for the observed values when `\(south = 0\)` and when `\(south = 1\)`, such that...

--

`$$P_{0i} = \Phi(0.40 + 0.15union_i + -0.17(0) + -0.35Unemploy_i)$$`

--

`$$P_{1i} = \Phi(0.40 + 0.15union_i + -0.17(1) + -0.35Unemploy_i)$$`

--

and by identifying the average difference between:

`$$P_{1i} - P_{0i}$$`

--

`$$-0.05$$`

---

## Maximum Likelihood Estimation

Last week, we mentioned that logit (and probit) models are no longer estimated via OLS (ordinary least squares). Instead, these models are fit using __maximum likelihood estimation__ (or, MLE). This technique identifies the coefficient estimates through an iterative process: candidate coefficient values are tried, and the estimates chosen are the ones that maximize the likelihood of the relationships we observe in the data.

--

In other words, we're trying to ascertain the probability of observing the data we observe.

--

In a very silly example, let's say we randomly speak to 3 Georgetown graduate students. If we identify that 2 of those students are from McCourt, we might ask "what is the likelihood of observing that combination?", or:

`$$L = p_{McCourt} * p_{McCourt} * (1-p_{McCourt}) = p_{McCourt}^2 - p_{McCourt}^3$$`

---

## Maximum Likelihood Estimation

To "solve" this problem, we can simply _guess_ and _maximize_ the likelihood by picking the value that produces the largest `\(L\)`.

--

<table class=" lightable-paper" style='font-family: "Arial Narrow", arial, helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:right;"> p </th> <th style="text-align:right;"> L </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 0.3 </td> <td style="text-align:right;"> 0.063 </td> </tr> <tr> <td style="text-align:right;"> 0.6 </td> <td style="text-align:right;"> 0.144 </td> </tr> <tr> <td style="text-align:right;"> 0.9 </td> <td style="text-align:right;"> 0.081 </td> </tr> </tbody> </table>

--

In this case it's quite simple, but in a multiple-parameter problem it becomes much more complicated. The same logic extends to the estimation of models that use MLE, for example:

`$$L = \Phi(\beta_0 + \beta_1x_1) * \Phi(\beta_0 + \beta_1x_2) * (1-\Phi(\beta_0 + \beta_1x_3))$$`

... and that's as far as we'll go on MLE.

---

## Hypothesis Testing

Similar to the hypothesis testing framework we discussed earlier in the semester, we can test various types of hypotheses in binary dependent variable models.

--

Remember, the general framework is:

1. Setting up a testable hypothesis
2. Identifying the significance level
3. Calculating some test-statistic (and/or corresponding p-value and/or confidence interval)
4. Making the conclusion

--

In the binary dependent variable framework, we'll talk through the minor changes that we'll make to two types of inferences: single parameter and multiple hypotheses.

---

### Single Parameter

Just like in OLS, we can estimate standard errors in our binary dependent variable model (which we're not going to cover). To test our null hypothesis -- e.g.
`\(H_0: \beta_j = 0\)` -- we shift from using the t-statistic (OLS) to using the Wald statistic (Stata will display a z-score, since the Wald statistic is asymptotically distributed as a standard normal distribution), which is calculated as:

`$$W = \frac{\hat{\beta_j} - \beta_0}{\hat{se}(\hat{\beta_j})} \sim N(0,1)$$`

... which we can compare to a critical value or use to calculate a p-value in the usual way.

---

### Multiple Hypotheses

To test multiple hypotheses (think: similar to F-tests in OLS), we can use the __likelihood ratio statistic__ (see Wooldridge 17.12 for more).

--

With the likelihood ratio statistic, we can test either:

* whether two coefficients are equal to each other, `\(H_0: \beta_1 = \beta_2\)`
* whether more than one coefficient is equal to zero, `\(H_0: \beta_1 = \beta_2 = 0\)`

--

To calculate the likelihood ratio statistic:

`$$LR = 2[log(L_{unrestricted}) - log(L_{restricted})]$$`

where:

* `\(log(L_{unrestricted})\)` - the log likelihood reported from the original (unrestricted) model
* `\(log(L_{restricted})\)` - the log likelihood reported from the adapted (restricted by the hypotheses) model

---

### Multiple Hypotheses: Likelihood Ratio Statistic

* Estimate the full model (unrestricted model).

`$$P(Y = 1) = G(\beta_0 + \beta_1x_1 + \beta_2x_2)$$`

--

* Estimate the restricted model based on the null hypotheses.
  * If we're testing whether more than one coefficient is equal to zero, `\(H_0: \beta_1 = \beta_2 = 0\)`: `$$P(Y = 1) = G(\beta_0)$$`
  * If we're testing whether coefficients are equal, `\(H_0: \beta_1 = \beta_2\)`: `$$P(Y = 1) = G(\beta_0 + \beta_1(x_1 + x_2))$$`

--

* Plug the estimated likelihoods associated with each equation into the likelihood ratio formula and find the associated test statistic, which comes from a `\(\chi^2\)` distribution where `\(df\)` equals the number of restrictions being tested

---

### Multiple Hypotheses: Likelihood Ratio Statistic

<img src="Week10_files/figure-html/unnamed-chunk-13-1.png" style="display: block; margin: auto;" />

---

### Multiple Hypotheses: Likelihood Ratio Statistic (Ex. 1)

<img src="stata_out.png" width="1768" style="display: block; margin: auto;" />

---

### Multiple Hypotheses: Likelihood Ratio Statistic (Ex.
2) Let's say we want to estimate an equation: `$$P(Clinton Won = 1)= \Lambda(\beta_0 + \beta_1Age + \beta_2log(PCI) + \beta_3CrimeIdx)$$` and we want to test: -- `$$H_0: \beta_2 = \beta_3 = 0$$` <table class="texreg" style="margin: 10px auto;border-collapse: collapse;border-spacing: 0px;caption-side: bottom;color: #000000;border-top: 2px solid #000000;"> <caption>Statistical models</caption> <thead> <tr> <th style="padding-left: 5px;padding-right: 5px;"> </th> <th style="padding-left: 5px;padding-right: 5px;">Clinton Win</th> <th style="padding-left: 5px;padding-right: 5px;">Clinton Win</th> </tr> </thead> <tbody> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Intercept</td> <td style="padding-left: 5px;padding-right: 5px;">33.97<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.06</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(3.31)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.54)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Median Age</td> <td style="padding-left: 5px;padding-right: 5px;">-0.01</td> <td style="padding-left: 5px;padding-right: 5px;">-0.05<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.02)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.02)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">log(Per Capita Income)</td> <td style="padding-left: 5px;padding-right: 5px;">-3.72<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;"> </td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.36)</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Crime Index</td> <td style="padding-left: 5px;padding-right: 5px;">0.00<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;"> </td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.00)</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> </tr> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">AIC</td> <td style="padding-left: 5px;padding-right: 5px;">2127.86</td> <td style="padding-left: 5px;padding-right: 5px;">2248.41</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">BIC</td> <td style="padding-left: 5px;padding-right: 5px;">2151.47</td> <td style="padding-left: 5px;padding-right: 5px;">2260.21</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Log Likelihood</td> <td style="padding-left: 5px;padding-right: 5px;">-1059.93</td> <td style="padding-left: 5px;padding-right: 5px;">-1122.20</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Deviance</td> <td style="padding-left: 5px;padding-right: 5px;">2119.86</td> <td style="padding-left: 5px;padding-right: 5px;">2244.41</td> </tr> <tr style="border-bottom: 2px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Num. 
obs.</td> <td style="padding-left: 5px;padding-right: 5px;">2704</td> <td style="padding-left: 5px;padding-right: 5px;">2704</td> </tr> </tbody> <tfoot> <tr> <td style="font-size: 0.8em;" colspan="3"><sup>***</sup>p < 0.001; <sup>**</sup>p < 0.01; <sup>*</sup>p < 0.05</td> </tr> </tfoot> </table>

---

### Multiple Hypotheses: Likelihood Ratio Statistic (Ex. 2)

`$$LR = 2[log(L_{unrestricted}) - log(L_{restricted})]$$`

--

`$$LR = 2[-1059.93 - (-1122.20)]$$`

--

`$$LR = 124.5$$`

--

`$$124.5 > 7.4$$`

where 7.4 represents the critical value from a `\(\chi^2(2)\)` distribution

---

## Another Goodness of Fit Metric

When dealing with OLS, we discussed a number of goodness of fit metrics. However, `\(R^2\)` cannot be computed in the same way as in OLS regression. Some researchers will report a "pseudo- `\(R^2\)`", which is interpreted in a similar way (it is presented on a 0 to 1 scale). There are a number of ways to calculate this value, each with various trade-offs.

--

Stata reports the McFadden pseudo- `\(R^2\)`, which is calculated as:

`$$\bar{R^2} = 1 - \frac{LL_{mod}}{LL_{0}}$$`

where:

* `\(LL_{mod}\)` is the log likelihood for the fitted model
* `\(LL_{0}\)` is the log likelihood for a model without covariates (only an intercept term)

---

## Another Goodness of Fit Metric: Example

<img src="full.png" width="865" style="display: block; margin: auto;" />

---

## Another Goodness of Fit Metric: Example

<img src="intercept.png" width="869" style="display: block; margin: auto;" />

--

`$$\bar{R^2} = 1 - \frac{LL_{mod}}{LL_{0}}$$`

--

`$$\bar{R^2} = 1 - \frac{-11.56}{-33.87}$$`

--

`$$\bar{R^2} = .6587$$`

---

## Extended Example

Let's walk through a longer example exploring what we've covered. Imagine we are trying to understand the predictors associated with ideological conservatism among Americans.
We can consider the following (overly simplified, improperly specified) models: -- __Linear Probability Model__ `$$P(Cons = 1|Age, White, Educ, Attend) = \beta_0 + \beta_1 Age + \beta_2 White + \beta_3 Educ + \beta_4 Attend$$` -- __Logit Model__ `$$P(Cons = 1|Age, White, Educ, Attend) = \Lambda(\beta_0 + \beta_1 Age + \beta_2 White + \beta_3 Educ + \beta_4 Attend)$$` -- __Probit Model__ `$$P(Cons = 1|Age, White, Educ, Attend) = \Phi(\beta_0 + \beta_1 Age + \beta_2 White + \beta_3 Educ + \beta_4 Attend)$$` --- ### Data and Measures We can explore this question using data from the General Social Survey (GSS) from 2012 with the following measurements: * `\(Cons\)` - a binary variable indicating that a respondent identifies as Conservative or Extremely Conservative * `\(Age\)` - age in years (continuous) * `\(White\)` - a binary variable indicating the respondent identifies as white * `\(Educ\)` - education in years (continuous) * `\(Attend\)` - a binary variable indicating the respondent identifies as attending religious services "Nearly Every Week" or more often --- ### Estimation <table class="texreg" style="margin: 10px auto;border-collapse: collapse;border-spacing: 0px;caption-side: bottom;color: #000000;border-top: 2px solid #000000;"> <caption>Statistical models</caption> <thead> <tr> <th style="padding-left: 5px;padding-right: 5px;"> </th> <th style="padding-left: 5px;padding-right: 5px;">LPM</th> <th style="padding-left: 5px;padding-right: 5px;">Logit</th> <th style="padding-left: 5px;padding-right: 5px;">Probit</th> </tr> </thead> <tbody> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Intercept</td> <td style="padding-left: 5px;padding-right: 5px;">-0.00</td> <td style="padding-left: 5px;padding-right: 5px;">-2.95<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">-1.66<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.06)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.42)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.23)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Age (yrs.)</td> <td style="padding-left: 5px;padding-right: 5px;">0.00<sup>**</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.01<sup>**</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.01<sup>**</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.00)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.00)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.00)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">White</td> <td style="padding-left: 5px;padding-right: 5px;">0.11<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.88<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.47<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.02)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.20)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.11)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Educ (yrs.)</td> <td style="padding-left: 5px;padding-right: 5px;">-0.00</td> <td style="padding-left: 5px;padding-right: 5px;">-0.01</td> <td style="padding-left: 5px;padding-right: 5px;">-0.01</td> </tr> <tr> <td style="padding-left: 
5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.00)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.02)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.01)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Attend</td> <td style="padding-left: 5px;padding-right: 5px;">0.15<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.95<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.54<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.02)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.13)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.07)</td> </tr> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">R<sup>2</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.05</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Adj. R<sup>2</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.05</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;"> </td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Num. obs.</td> <td style="padding-left: 5px;padding-right: 5px;">1772</td> <td style="padding-left: 5px;padding-right: 5px;">1772</td> <td style="padding-left: 5px;padding-right: 5px;">1772</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">AIC</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">1622.80</td> <td style="padding-left: 5px;padding-right: 5px;">1622.72</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">BIC</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">1650.20</td> <td style="padding-left: 5px;padding-right: 5px;">1650.12</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Log Likelihood</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">-806.40</td> <td style="padding-left: 5px;padding-right: 5px;">-806.36</td> </tr> <tr style="border-bottom: 2px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Deviance</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">1612.80</td> <td style="padding-left: 5px;padding-right: 5px;">1612.72</td> </tr> </tbody> <tfoot> <tr> <td style="font-size: 0.8em;" colspan="4"><sup>***</sup>p < 0.001; <sup>**</sup>p < 0.01; <sup>*</sup>p < 0.05</td> </tr> </tfoot> </table> --- ### Goodness of Fit Although Stata will report the pseudo `\(R^2\)` value, let's calculate it for the Logit and Probit Models. 
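(As an aside: in R this is a one-line helper once both models are fit. A hedged sketch, with placeholder argument names that are not from the slides:)

```r
# McFadden pseudo-R^2 from two fitted glm objects:
#   full_mod - the fitted logit or probit model
#   null_mod - the same model with only an intercept, e.g. glm(y ~ 1, family = binomial(), data = ...)
pseudo_r2 <- function(full_mod, null_mod) {
  1 - as.numeric(logLik(full_mod)) / as.numeric(logLik(null_mod))
}
```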
`$$\bar{R^2} = 1 - \frac{LL_{mod}}{LL_{0}}$$`

--

__Logit__

* Identify the log likelihood from the full model, `\(LL_{mod}\)`
* Estimate an intercept-only model, and find the log likelihood, `\(LL_0\)`

`$$\bar{R^2} = 1 - \frac{-806.4}{-937.6}$$`

`$$\bar{R^2} = 0.14$$`

--

__Probit__

* Identify the log likelihood from the full model, `\(LL_{mod}\)`
* Estimate an intercept-only model, and find the log likelihood, `\(LL_0\)`

`$$\bar{R^2} = 0.14$$`

---

### Hypothesis Testing - Single Parameter

Let's now turn to looking at the statistical significance of our coefficients (let's use the probit model for now) and test, e.g., the following hypothesis:

`$$H_0: \beta_{Attend} = 0$$`

--

Let's test our hypothesis at the `\(\alpha = 0.05\)` significance level (i.e., 95% confidence).

--

Next, let's calculate our test statistic.

a) Find the test statistic (the Wald statistic)

`$$W = \frac{\hat{\beta_j} - \beta_0}{\hat{se}(\hat{\beta_j})} \sim N(0,1)$$`

--

`$$W = \frac{0.54 - 0}{0.07} = 7.71$$`

--

and compare to the critical value (the 0.975 quantile of the standard normal distribution)

`$$7.71 > 1.96$$`

---

### Hypothesis Testing - Single Parameter

b) Find the p-value

The p-value associated with 7.71 is: `\(<0.001\)`

--

c) Find the confidence interval:

`$$CI: \hat{\beta_j} \pm c*se(\hat{\beta_j})$$`

--

`$$CI: 0.54 \pm 1.96*0.07$$`

--

`$$CI: [0.40, 0.68]$$`

--

Using each of these (redundant) metrics, we __reject the null hypothesis__ and find that Attend is statistically significant (i.e. distinguishable from 0).

---

### Hypothesis Testing - Multiple Parameters

Now let's turn to exploring a multiple parameter hypothesis and test the following null hypothesis (again, using our probit model):

`$$H_0: \beta_{Attend} = \beta_{Age}= 0$$`

--

Let's again test our hypothesis at the `\(\alpha = 0.05\)` significance level.

--

We now need to estimate our test statistic, which in this case is the likelihood ratio statistic.
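Mechanically, the test amounts to fitting both models and differencing their log likelihoods. A generic R sketch on simulated data (not the GSS models, which are spelled out next):

```r
# Generic likelihood ratio test sketch (simulated data, generic variable names)
set.seed(2)
d <- data.frame(x1 = rnorm(300), x2 = rnorm(300), x3 = rnorm(300))
d$y <- rbinom(300, 1, pnorm(-0.5 + 0.8 * d$x1))

unrestricted <- glm(y ~ x1 + x2 + x3, family = binomial(link = "probit"), data = d)
restricted   <- glm(y ~ x1,           family = binomial(link = "probit"), data = d)  # H0: two coefficients are 0

lr <- as.numeric(2 * (logLik(unrestricted) - logLik(restricted)))
pchisq(lr, df = 2, lower.tail = FALSE)   # p-value; df = number of restrictions
```

(With real data, `anova(restricted, unrestricted, test = "Chisq")` gives an equivalent test.)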
-- __The unrestricted model__ `$$P(Cons = 1|Age, White, Educ, Attend) = \Phi(\beta_0 + \beta_1 Age + \beta_2 White + \beta_3 Educ + \beta_4 Attend)$$` __The restricted model__ `$$P(Cons = 1|White, Educ) = \Phi(\beta_0 + \beta_2 White + \beta_3 Educ)$$` --- ### Hypothesis Testing - Multiple Parameters <table class="texreg" style="margin: 10px auto;border-collapse: collapse;border-spacing: 0px;caption-side: bottom;color: #000000;border-top: 2px solid #000000;"> <caption>Statistical models</caption> <thead> <tr> <th style="padding-left: 5px;padding-right: 5px;"> </th> <th style="padding-left: 5px;padding-right: 5px;">Unrestr.</th> <th style="padding-left: 5px;padding-right: 5px;">Restr.</th> </tr> </thead> <tbody> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Intercept</td> <td style="padding-left: 5px;padding-right: 5px;">-1.66<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">-1.14<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.23)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.20)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Age (yrs.)</td> <td style="padding-left: 5px;padding-right: 5px;">0.01<sup>**</sup></td> <td style="padding-left: 5px;padding-right: 5px;"> </td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.00)</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">White</td> <td style="padding-left: 5px;padding-right: 5px;">0.47<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;">0.42<sup>***</sup></td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.11)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.10)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Educ (yrs.)</td> <td style="padding-left: 5px;padding-right: 5px;">-0.01</td> <td style="padding-left: 5px;padding-right: 5px;">-0.01</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.01)</td> <td style="padding-left: 5px;padding-right: 5px;">(0.01)</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Attend</td> <td style="padding-left: 5px;padding-right: 5px;">0.54<sup>***</sup></td> <td style="padding-left: 5px;padding-right: 5px;"> </td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;"> </td> <td style="padding-left: 5px;padding-right: 5px;">(0.07)</td> <td style="padding-left: 5px;padding-right: 5px;"> </td> </tr> <tr style="border-top: 1px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">AIC</td> <td style="padding-left: 5px;padding-right: 5px;">1622.72</td> <td style="padding-left: 5px;padding-right: 5px;">1699.53</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">BIC</td> <td style="padding-left: 5px;padding-right: 5px;">1650.12</td> <td style="padding-left: 5px;padding-right: 5px;">1715.98</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Log Likelihood</td> <td style="padding-left: 5px;padding-right: 5px;">-806.36</td> <td style="padding-left: 5px;padding-right: 5px;">-846.76</td> </tr> <tr> <td style="padding-left: 5px;padding-right: 5px;">Deviance</td> <td style="padding-left: 5px;padding-right: 5px;">1612.72</td> <td 
style="padding-left: 5px;padding-right: 5px;">1693.53</td> </tr> <tr style="border-bottom: 2px solid #000000;"> <td style="padding-left: 5px;padding-right: 5px;">Num. obs.</td> <td style="padding-left: 5px;padding-right: 5px;">1772</td> <td style="padding-left: 5px;padding-right: 5px;">1776</td> </tr> </tbody> <tfoot> <tr> <td style="font-size: 0.8em;" colspan="3"><sup>***</sup>p < 0.001; <sup>**</sup>p < 0.01; <sup>*</sup>p < 0.05</td> </tr> </tfoot> </table> --- ### Hypothesis Testing - Multiple Parameters `$$LR = 2[log(L_{unrestricted}) - log(L_{restricted})]$$` -- `$$LR = 2[-806 - -846.76]$$` -- `$$LR = 81.5$$` ... which we can explore in relation to a critical value from a `\(\chi^2\)` distribution with 2 degrees of freedom --- ### Hypothesis Testing - Multiple Parameters <img src="Week10_files/figure-html/unnamed-chunk-20-1.png" style="display: block; margin: auto;" /> --- ### Hypothesis Testing - Multiple Parameters <img src="Week10_files/figure-html/unnamed-chunk-21-1.png" style="display: block; margin: auto;" /> --- ### Predictions We can also examine various predictions for our estimated model! Let's say we want to predict the following respondent: 35 years old, non-religious White respondent with 16 years of schooling. __Logit Model__ `$$P(Cons = 1|Age, White, Educ, Attend) = \Lambda(-2.95+ 0.01 Age + 0.88 White + -0.01 Educ + 0.95 Attend)$$` -- `$$P(Cons = 1|Age = 35, White = 1, Educ = 16, Attend = 0) = \Lambda(-2.95+ 0.01 (35) + 0.88 (1) + -0.01 (16) + 0.95 (0))$$` -- `$$P(Cons = 1|Age = 35, White = 1, Educ = 16, Attend = 0) = \Lambda(-1.87)$$` -- `$$P(Cons = 1|Age = 35, White = 1, Educ = 16, Attend = 0) = 0.13$$` -- __Probit Model__ `$$P(Cons = 1|Age, White, Educ, Attend) = \Phi(-1.66+ 0.01 Age + 0.47 White + -0.01 Educ + 0.54 Attend)$$` `$$P(Cons = 1|Age = 35, White = 1, Educ = 16, Attend = 0) = \Phi(-1.66+ 0.01 (35) + 0.47 (1) + -0.01 (16) + 0.54 (0))$$` `$$P(Cons = 1|Age = 35, White = 1, Educ = 16, Attend = 0) = \Phi(-1.11)$$` `$$P(Cons = 1|Age = 35, White = 1, Educ = 16, Attend = 0) = 0.13$$` --- ### Interpreting Coefficients Finally, let's explore interpreting our coefficients (let's just use the probit model here, and aim to explore to the effect of being White on being Conservative) -- __Observed Value, Discrete Differences__ For _each_ observation in our dataset, let's calculate the predicted probability from the probit model, except assume that each observation is non-white. `$$P_{0i} = \Phi(-1.66+ 0.01 Age + 0.47 (0) + -0.01 Educ + 0.54 Attend)$$` -- Then, for _each_ observation in our dataset, let's calculate the predicted probability from the probit model, except assume that each observation is white. `$$P_{0i} = \Phi(-1.66+ 0.01 Age + 0.47 (1) + -0.01 Educ + 0.54 Attend)$$` -- This will produce hypothetical predictions for each observation in our data, and we can then explore the difference differences between these values. --- ### Interpreting Coefficients
---

### Interpreting Coefficients

<img src="Week10_files/figure-html/unnamed-chunk-23-1.png" style="display: block; margin: auto;" />

---

## Where We're Going Next

Switching gears and thinking about sampling and weighting

__Reading__

* Hamilton, pp. 107–119
* Additional reading on the course website