Summary of the course

class: center, middle, inverse, title-slide

.title[
# Summary of the course
]
.subtitle[
## Tutorial 7
]
.date[
### Stanislav Avdeev
]

---

# Goal for today's tutorial

- Discuss the full course
  - Lecture 1: Binary choice models, censoring, truncation, and selection models (3-9)
  - Lecture 2: IV (10-13)
  - Lecture 3: Panel data models (14-20)
  - Lecture 4: Potential outcomes model (21-24)
  - Lecture 5: LATE and power analysis (25-27)
  - Lecture 6: DiD (28-30)
  - Lecture 7: RDD and RKD (31-34)
  
---

# Lecture 1: Binary choice models

- `$Y_i$` can only take the values `$0$` or `$1$`
`$$Y_i = \begin{cases} \mbox{} 1 \ & \mbox{} \text{with probability} ~ p_i \\ \mbox{} 0 & \mbox{}\text{with probability} ~ (1 - p_i) \end{cases}$$`
- For a binary model, the success probability is modelled with a cumulative distribution function (cdf) `$F(\cdot)$`
`$$p_i = P (Y_i = 1 | X_i) = F(X_i' \beta)$$`
- For a binary model, the conditional probability mass function of `$Y_i$` (the Bernoulli density) is
`$$f(Y_i|X_i) = p_i^{Y_i} (1 - p_i)^{1 - Y_i}$$`
- To find `$\beta$`, maximize the log-likelihood function
`\begin{align*}
  L (\beta) &= \sum\nolimits^N_{i = 1} \left[Y_i \ln p_i + (1 - Y_i) \ln (1 - p_i) \right] \\
  &= \sum\nolimits^N_{i = 1} \left[Y_i \ln F(X_i' \beta) + (1 - Y_i) \ln (1 - F(X_i' \beta)) \right]
\end{align*}`
- Notice we did not specify a particular form of the cdf `$F(X_i' \beta)$`
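
A minimal sketch of this maximization, assuming a logistic cdf for `$F(\cdot)$` and one regressor plus an intercept. The simulated data, the seed, the true coefficients, and the use of Newton's method are illustrative assumptions, not part of the lecture.

```python
import math
import random


def logistic_cdf(z):
    """Logistic cdf, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_likelihood(b0, b1, xs, ys):
    """L(beta) = sum_i [Y_i ln p_i + (1 - Y_i) ln(1 - p_i)], p_i = F(b0 + b1 * x_i)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = logistic_cdf(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def fit_logit(xs, ys, steps=25):
    """Maximize L(beta) with Newton steps (an assumed, standard choice)."""
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = logistic_cdf(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # score with respect to the intercept
            g1 += (y - p) * x      # score with respect to the slope
            h00 += w               # entries of the negative Hessian
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Simulated data (assumption): true beta = (0.5, 1.0)
random.seed(0)
TRUE_B0, TRUE_B1 = 0.5, 1.0
xs = [random.gauss(0.0, 1.0) for _ in range(2000)]
ys = [1 if random.random() < logistic_cdf(TRUE_B0 + TRUE_B1 * x) else 0 for x in xs]

b0_hat, b1_hat = fit_logit(xs, ys)
print(b0_hat, b1_hat)
```

Swapping `logistic_cdf` for another cdf changes the model but not the form of `log_likelihood`, which is the point of the last bullet above.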

---

# Lecture 1: Binary choice models

- Linear probability model
  - `$p_i = F(X_i' \beta) = X_i' \beta$`
  - Marginal effect: `$\frac{\partial p_i}{\partial X_{ik}} = \beta_k$`
  - There is heteroskedasticity, so use robust s.e.
  - Estimated probabilities can be outside of the bounds
- Logit
  - `$p_i = F(X_i' \beta) = \frac{\exp(X_i' \beta)}{1 + \exp (X_i' \beta)}$` - the cdf of the logistic distribution
  - Marginal effect: `$\frac{\partial p_i}{\partial X_{ik}} = \frac{\exp(X_i' \beta)}{(1 + \exp(X_i' \beta))^2} \beta_k$`
  - MLE is not consistent if `$F(\cdot)$` is incorrectly specified
- Probit
  - `$p_i = F(X_i' \beta) = \Phi (X_i' \beta)$` - the cdf of the standard normal distribution
  - Marginal effect: `$\frac{\partial p_i}{\partial X_{ik}} = \phi(X_i' \beta) \beta_k$`
  - MLE is not consistent if `$F(\cdot)$` is incorrectly specified
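
A small sketch of the three marginal-effect formulas evaluated at a given value of the index `$X_i' \beta$`. The function names and the evaluation points are illustrative assumptions. The same `beta_k` is plugged into all three formulas only to show the shape of each expression; in practice the estimated coefficients differ in scale across the models.

```python
import math


def normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def logistic_pdf(z):
    """Logistic density exp(z) / (1 + exp(z))^2; it is symmetric, so |z| avoids overflow."""
    e = math.exp(-abs(z))
    return e / (1.0 + e) ** 2


def marginal_effects(index, beta_k):
    """Marginal effect of X_ik on p_i at a given value of the index X_i' beta."""
    return {
        "lpm": beta_k,                           # constant in X_i
        "logit": logistic_pdf(index) * beta_k,   # depends on X_i through the index
        "probit": normal_pdf(index) * beta_k,    # depends on X_i through the index
    }


effects_at_zero = marginal_effects(index=0.0, beta_k=1.0)
effects_in_tail = marginal_effects(index=3.0, beta_k=1.0)
print(effects_at_zero)
print(effects_in_tail)
```

The logit and probit effects shrink as the index moves into the tails, while the linear probability model keeps the effect constant.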

---

# Lecture 1: Latent structure

- Binary choice models are often written in terms of a latent structure with some latent (unobserved) variable `$$Y_i^*= X_i' \beta + U_i$$`
- The observed outcome variable is
`$$Y_i = \begin{cases} \mbox{} 1 \ & \mbox{} \text{if } ~ Y_i^* > 0 \\ \mbox{} 0 & \mbox{} \text{if } ~ Y_i^* \leq 0 \end{cases}$$`
with

`\begin{align}
  P(Y_i = 1 | X_i) &= P(Y_i^* > 0 | X_i) \\
  &= P(X_i' \beta + U_i > 0 | X_i) \\
  &= P(-U_i < X_i' \beta | X_i) \\
  &= F(X_i' \beta)
\end{align}`
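where `$F(\cdot)$` is the cdf of `$U_i$`, assumed symmetric around zero, so that `$P(-U_i < X_i' \beta | X_i) = P(U_i < X_i' \beta | X_i)$`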

---

# Lecture 1: Censoring and truncation

- The latent (unobserved) variable is
`$$Y_i^*= X_i' \beta + U_i$$`
- The observed outcome variable is
`$$Y_i = \begin{cases} \mbox{} Y_i^* \ & \mbox{} \text{if } ~ Y_i^* > c_i \\ \mbox{} c_i & \mbox{} \text{if } ~ Y_i^* \leq c_i \end{cases}$$`
- Censored observations are in the sample
  - for them `$Y_i = c_i$` if `$Y_i^* \leq c_i$`
- Truncated observations are not in the sample
  - for them `$Y_i$` is missing if `$Y_i^* \leq c_i$`
- Ignoring censoring and truncation leads to a biased and inconsistent estimator
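
A simulation sketch of the last bullet, assuming standard normal regressor and error, a common threshold `$c_i = 0$`, and a true slope of `$1$`. All numbers and names are illustrative assumptions.

```python
import random


def ols_slope(xs, ys):
    """Slope of the OLS regression of y on a constant and x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


random.seed(1)
N = 20000
BETA0, BETA1, C = 0.0, 1.0, 0.0

xs = [random.gauss(0.0, 1.0) for _ in range(N)]
y_star = [BETA0 + BETA1 * x + random.gauss(0.0, 1.0) for x in xs]

# Censoring: observations with Y* <= c stay in the sample with Y = c
y_censored = [max(y, C) for y in y_star]

# Truncation: observations with Y* <= c drop out of the sample
kept = [(x, y) for x, y in zip(xs, y_star) if y > C]

slope_latent = ols_slope(xs, y_star)
slope_censored = ols_slope(xs, y_censored)
slope_truncated = ols_slope([x for x, _ in kept], [y for _, y in kept])
print(slope_latent, slope_censored, slope_truncated)
```

In this design only the regression on the (normally unobserved) latent variable recovers the true slope; both naive OLS slopes are pulled toward zero.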

---

# Lecture 1: Censoring and truncation

- To find `$\theta$`, maximize the log-likelihood function. Let `$f^*(Y_i|X_i)$` be the density function of `$Y_i^*$`; then the cdf of `$Y_i^*$` is
`$$F^* (c_i|X_i) = P(Y_i^* < c_i |X_i) = \int^{c_i}_{-\infty} f^*(Y_i|X_i)dY_i$$`
- Censoring
  - density function: `$f(Y_i | X_i) = f^* (Y_i |X_i)^{d_i} F^*(c_i|X_i)^{1 - d_i}$` with `$d_i = 1$` for uncensored observations
  - log-likelihood function
`$$L(\theta) = \sum\nolimits_{i = 1}^N \left[d_i \ln f^* (Y_i|X_i, \theta) + (1 - d_i) \ln F^* (c_i | X_i, \theta)\right]$$`
- Truncation
  - density function: `$f(Y_i | X_i) = \frac{f^* (Y_i|X_i)}{P(Y_i^* > c_i)}  = \frac{f^* (Y_i|X_i)}{1 - F^* (c_i | X_i)}$`
  - log-likelihood function
`$$L(\theta) = \sum\nolimits_{i = 1}^N \left[ \ln f^* (Y_i|X_i, \theta) -  \ln (1 - F^* (c_i | X_i, \theta)) \right]$$`
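
A sketch of both log-likelihoods under one specific assumption that the slides leave open: `$Y_i^* | X_i \sim \mathcal{N}(\beta_0 + \beta_1 X_i, \sigma^2)$`, so `$\theta = (\beta_0, \beta_1, \sigma)$`. The function names and the toy data are hypothetical.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def censored_loglik(theta, xs, ys, cs):
    """sum_i [d_i ln f*(Y_i | X_i) + (1 - d_i) ln F*(c_i | X_i)] under normality."""
    b0, b1, sigma = theta
    total = 0.0
    for x, y, c in zip(xs, ys, cs):
        mu = b0 + b1 * x
        if y > c:   # d_i = 1: uncensored observation
            total += math.log(normal_pdf((y - mu) / sigma) / sigma)
        else:       # d_i = 0: censored observation, Y_i = c_i
            total += math.log(normal_cdf((c - mu) / sigma))
    return total


def truncated_loglik(theta, xs, ys, cs):
    """sum_i [ln f*(Y_i | X_i) - ln(1 - F*(c_i | X_i))] under normality."""
    b0, b1, sigma = theta
    total = 0.0
    for x, y, c in zip(xs, ys, cs):
        mu = b0 + b1 * x
        log_density = math.log(normal_pdf((y - mu) / sigma) / sigma)
        log_survival = math.log(1.0 - normal_cdf((c - mu) / sigma))
        total += log_density - log_survival
    return total


theta = (0.0, 1.0, 1.0)
# Toy censored sample: the first observation sits at the threshold c = 0
ll_censored = censored_loglik(theta, xs=[0.0, 1.0], ys=[0.0, 2.0], cs=[0.0, 0.0])
# Toy truncated sample: only observations with Y > c are available
ll_truncated = truncated_loglik(theta, xs=[1.0], ys=[2.0], cs=[0.0])
print(ll_censored, ll_truncated)
```

Either function can be handed to a numerical optimizer to obtain `$\hat{\theta}$`; the optimizer is left out here to keep the sketch short.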

---

# Lecture 1: Sample selection model

- The outcome variable is observed only for a selected sample
- The sample selection model has two stages
  - Selection equation
`$$I_i^* = Z_i ' \gamma + V_i$$`
  - The indicator function, based on `$I_i^*$`, takes two values
`$$I_i = \begin{cases} \mbox{} 1 \ & \mbox{} \text{ if } I_i^* > 0 \\ \mbox{} 0 & \mbox{} \text{ if } I_i^* \leq 0 \end{cases}$$`
  - Regression equation
`$$Y_i^* = X_i ' \beta + U_i$$`
  - However, we observe only `$Y_i$`
`$$Y_i = \begin{cases} \mbox{} Y_i^* \ & \mbox{} \text{ if } I_i = 1 \\ \mbox{} \text{missing} & \mbox{} \text{ if } I_i = 0 \end{cases}$$`

---

# Lecture 1: Sample selection model

- To estimate the sample selection model, we assume that the disturbance terms are bivariate normal
`\begin{align*}
  \left[\begin{array}{l}
    U_{i} \\
    V_{i}
  \end{array}\right] \sim \mathcal{N}\left(0,\left[\begin{array}{cc}
    \sigma^{2} & \rho \sigma \\
    \rho \sigma & 1
  \end{array}\right]\right)
\end{align*}`
- Let us find the expected value of `$Y_i$` conditional on `$I_i = 1$`, i.e. of the observed `$Y_i$`
`\begin{align*}
  E[Y_i | I_i = 1,Z_i,X_i] &= E[X_i' \beta + U_i|I_i = 1,Z_i,X_i] \\
  &= X_i' \beta+ E[U_i|I_i =1,Z_i,X_i] \\
  &= X_i' \beta + E[U_i | Z_i' \gamma + V_i > 0, Z_i, X_i] \\
  &= X_i' \beta+  E[U_i| - V_i < Z_i' \gamma, Z_i, X_i] \\
  &= X'_i \beta + \rho \sigma \frac{\phi(Z_i ' \gamma)}{\Phi(Z_i' \gamma)}
\end{align*}`
- If `$\rho = 0$`, i.e. if `$U_i$` and `$V_i$` are independent, or if `$X_i$` and `$Z_i$` are uncorrelated, the OLS estimator is consistent
- If `$\rho \neq 0$`, the OLS estimator is inconsistent; `$\frac{\phi(Z_i ' \gamma)}{\Phi(Z_i' \gamma)}$` is the inverse Mills ratio, and the term `$\rho \sigma \frac{\phi(Z_i ' \gamma)}{\Phi(Z_i' \gamma)}$` captures the selection bias
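
A simulation sketch that checks the last line of the derivation, `$E[U_i | I_i = 1] = \rho \sigma \frac{\phi(Z_i ' \gamma)}{\Phi(Z_i' \gamma)}$`, at one fixed value of the index `$Z_i' \gamma$`. The values of `$\rho$`, `$\sigma$`, the index, the seed, and the number of draws are illustrative assumptions.

```python
import math
import random


def inverse_mills(z):
    """Inverse Mills ratio phi(z) / Phi(z)."""
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return pdf / cdf


random.seed(4)
RHO, SIGMA, INDEX = 0.5, 2.0, 0.3   # INDEX plays the role of Z_i' gamma

selected_u = []
for _ in range(200000):
    v = random.gauss(0.0, 1.0)
    # Build U so that var(U) = SIGMA^2 and cov(U, V) = RHO * SIGMA
    u = RHO * SIGMA * v + SIGMA * math.sqrt(1.0 - RHO ** 2) * random.gauss(0.0, 1.0)
    if INDEX + v > 0:   # selection rule: I_i = 1
        selected_u.append(u)

simulated = sum(selected_u) / len(selected_u)
theoretical = RHO * SIGMA * inverse_mills(INDEX)
print(simulated, theoretical)
```

The mean of `$U_i$` in the selected sample is not zero, which is why a regression on the selected sample alone omits a relevant term.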

---

# Lecture 2: IV

- If `$E(U_i | X_i) \neq 0$`, there is an endogeneity problem
- In this case OLS provides a biased and inconsistent `$\hat{\beta}$`
- Sources of endogeneity
  - Omitted variables
  - Reverse causality
  - Measurement error
- A solution is to use an instrument that should be
  - Relevant: `$\text{cov} (Z_i, X_i) \neq 0$`
  - Valid (exogenous): `$\text{cov} (Z_i, U_i) = 0$`
- Use the two-stage least squares (2SLS) IV estimator
  - First stage
`\begin{align*}
  X_i &= \gamma_0 + \gamma_1 Z_i + V_i \\
  &\implies \hat{X_i} = \hat{\gamma_0} + \hat{\gamma_1}Z_i
\end{align*}`
  - Second stage
`\begin{align*}
  Y_i &= \beta_0 + \beta_1 \hat{X_i} + U^*_i \\
 &\implies \hat{\beta}_{1, 2SLS}
\end{align*}`
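
A simulation sketch of the two stages with a single instrument. The data-generating process, the seed, and all names are illustrative assumptions: `$U_i$` is built to be correlated with `$V_i$`, which makes `$X_i$` endogenous, while `$Z_i$` is independent of `$U_i$`.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def ols(x, y):
    """Intercept and slope of the OLS regression of y on a constant and x."""
    slope = cov(x, y) / cov(x, x)
    return mean(y) - slope * mean(x), slope


random.seed(2)
N = 20000
BETA0, BETA1 = 1.0, 2.0

z = [random.gauss(0.0, 1.0) for _ in range(N)]
v = [random.gauss(0.0, 1.0) for _ in range(N)]
u = [0.8 * vi + random.gauss(0.0, 0.6) for vi in v]   # cov(U, V) != 0
x = [0.5 + 1.0 * zi + vi for zi, vi in zip(z, v)]
y = [BETA0 + BETA1 * xi + ui for xi, ui in zip(x, u)]

# First stage: regress X on Z and keep the fitted values
gamma0_hat, gamma1_hat = ols(z, x)
x_hat = [gamma0_hat + gamma1_hat * zi for zi in z]

# Second stage: regress Y on the fitted values
_, beta1_2sls = ols(x_hat, y)

# Naive OLS of Y on X for comparison
_, beta1_ols = ols(x, y)
print(beta1_ols, beta1_2sls)
```

Running the second stage by hand gives the correct point estimate, but the standard errors reported by that regression are not the correct 2SLS standard errors, so in applied work a dedicated IV routine is preferable.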

---

# Lecture 2: IV

- `$\hat{\beta}_{1, 2SLS}$` has the following form
`$$\hat{\beta}_{1,2 \mathrm{SLS}}=\frac{\sum_{i=1}^{n}\left(Z_{i}-\bar{Z}_{n}\right)\left(Y_{i}-\bar{Y}_{n}\right)}{\sum_{j=1}^{n}\left(Z_{j}-\bar{Z}_{n}\right)\left(X_{j}-\bar{X}_{n}\right)}$$`

- `$\hat{\beta}_{1, 2SLS}$` is consistent
`$$\text{plim}_{n \rightarrow \infty} \hat{\beta}_{1, 2SLS} = \frac{\text{cov}(Z_i, Y_i)}{\text{cov}(Z_i, X_i)} = \beta_1 + \frac{\text{cov}(Z_i, U_i)}{\text{cov}(Z_i, X_i)} = \beta_1$$`
- `$\hat{\beta}_{1, 2SLS}$` is biased
`$$E[\hat{\beta}_{1, 2SLS}] = \beta_1 + \sum_{i=1}^{n} \mathrm{E}\left[\frac{\frac{1}{n}\left(Z_{i}-\bar{Z}_{n}\right) U_{i}}{\frac{1}{n} \sum_{j=1}^{n}\left(Z_{j}-\bar{Z}_{n}\right)\left(X_{j}-\bar{X}_{n}\right)}\right] \neq \beta_1$$`
- Do you want to derive more consistency and unbiasedness results for estimators? Take the core course Advanced Econometrics I
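
The middle equality in the consistency result can be spelled out in one step, assuming the structural equation `$Y_i = \beta_0 + \beta_1 X_i + U_i$`:

`\begin{align*}
  \text{cov}(Z_i, Y_i) &= \text{cov}(Z_i, \beta_0 + \beta_1 X_i + U_i) \\
  &= \beta_1 \text{cov}(Z_i, X_i) + \text{cov}(Z_i, U_i) \\
  \implies \frac{\text{cov}(Z_i, Y_i)}{\text{cov}(Z_i, X_i)} &= \beta_1 + \frac{\text{cov}(Z_i, U_i)}{\text{cov}(Z_i, X_i)}
\end{align*}`

Validity sets the last numerator to zero and relevance keeps the denominator away from zero, which leaves `$\beta_1$`.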

---

# Lecture 2: IV

- To test exogeneity of `$X_i$`, use the Hausman test
  - `$H_0$`: `$X_i$` is exogenous, i.e. OLS and 2SLS are both consistent
  - Test statistic: `$H = \frac{(\hat{\beta}_{1, 2SLS} - \hat{\beta}_{1, OLS})^2}{\text{var}(\hat{\beta}_{1, 2SLS} - \hat{\beta}_{1, OLS})} \sim \chi^2 (1)$`
  - Reject if `$H > \chi^2_\alpha (1)$`
- To test validity, use the Sargan test (over-identification required)
  - `$H_0$`: all instruments are valid
  - Find the second-stage residuals `$\hat{U}_i$` and regress them on the instruments
`$$\hat{U}_i = \delta_0 + \delta_1 Z_{1, i} + ... + \delta_M Z_{M,i} + e_i$$`
  - Test statistic: `$H = nR^2 \sim \chi^2 (M - 1)$` under `$H_0$`
  - Reject if `$H > \chi^2_\alpha (M - 1)$`
- IV is consistent if the instrument is relevant (rule of thumb: first-stage F-statistic `$> 10$`), but the finite-sample bias can be large
`$$\text { Bias IV } \sim \frac{\{\# \text { instruments }\} \times \rho(U_i, V_i) \times\left(1 - R_{\text {partial }}^{2}\right)}{\{\# \text { observations }\} \times R_{\text {partial }}^{2}}$$`
where `$R_{\text {partial }}^{2}$` is the contribution of the instruments to `$R^2$` in the first-stage
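
A sketch of the Sargan procedure with `$M = 2$` instruments and one endogenous regressor, so the statistic is compared with a `$\chi^2 (1)$` critical value (`$3.841$` at the `$5\%$` level). The simulated data, the seed, the helper functions, and the way the exclusion restriction is violated are illustrative assumptions. The residuals are computed with the actual `$X_i$` and the 2SLS coefficients, which is also an assumption about how the residuals on the slide are defined.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def ols(columns, y):
    """OLS of y on a constant and the given columns; returns (coefficients, fitted values)."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    return coef, fitted


def sargan(y, x, instruments):
    """H = n * R^2 from regressing the 2SLS residuals on the instruments."""
    n = len(y)
    _, x_hat = ols(instruments, x)                        # first stage
    (b0, b1), _ = ols([x_hat], y)                         # second stage
    resid = [yi - b0 - b1 * xi for yi, xi in zip(y, x)]   # residuals use the actual X
    _, fitted = ols(instruments, resid)
    mean_r = sum(resid) / n
    ss_total = sum((r - mean_r) ** 2 for r in resid)
    ss_explained = sum((f - mean_r) ** 2 for f in fitted)
    return n * ss_explained / ss_total


random.seed(3)
N = 2000
z1 = [random.gauss(0.0, 1.0) for _ in range(N)]
z2 = [random.gauss(0.0, 1.0) for _ in range(N)]
v = [random.gauss(0.0, 1.0) for _ in range(N)]
u = [0.8 * vi + random.gauss(0.0, 0.6) for vi in v]
x = [a + b + vi for a, b, vi in zip(z1, z2, v)]

y_valid = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]
# Invalid case: z2 also enters the outcome directly, violating exogeneity
y_invalid = [yi + 1.0 * b for yi, b in zip(y_valid, z2)]

h_valid = sargan(y_valid, x, [z1, z2])
h_invalid = sargan(y_invalid, x, [z1, z2])
print(h_valid, h_invalid)
```

A large `h_invalid` signals that at least one instrument is invalid, but the test cannot say which one.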

---

# Lecture 2: IV

- An instrument is weak if `$\text{cov} (Z_i, X_i)$` is small. Recall
`$$\text{plim}_{n \rightarrow \infty} \hat{\beta}_{1, 2SLS} = \frac{\text{cov}(Z_i, Y_i)}{\text{cov}(Z_i, X_i)}$$`
- When `$\text{cov} (Z_i, X_i)$` is close to `$0$`, i.e. the instrument is almost irrelevant, the sample covariance between `$Z_i$` and `$X_i$` mostly reflects sampling variation, which is not helpful for estimating `$\beta_1$`
- Weak instruments can be detected in the first stage using a t-test or an F-test
  - Rule of thumb: an instrument is weak if the bias of IV is larger than `$10\%$` of the bias of OLS
`$$\frac{\text { Bias IV }}{\text { Bias OLS }} \approx \frac{\{\# \text { instruments }\}}{\{\# \text { observations }\} \times R_{\text {partial }}^{2}}$$`
- Do you want to learn more about weak instruments? Take the field course Advanced Microeconometrics
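
The rule of thumb is simple arithmetic; a sketch with hypothetical function names and illustrative sample sizes and partial `$R^2$` values:

```python
def relative_bias(n_instruments, n_observations, r2_partial):
    """Approximate Bias IV / Bias OLS from the rule-of-thumb formula."""
    return n_instruments / (n_observations * r2_partial)


def is_weak(n_instruments, n_observations, r2_partial, threshold=0.10):
    """Flag the instruments as weak when the relative bias exceeds the threshold."""
    return relative_bias(n_instruments, n_observations, r2_partial) > threshold


# One instrument, 1000 observations, two illustrative first-stage partial R^2 values
weak_case = relative_bias(1, 1000, 0.005)    # 1 / (1000 * 0.005)
strong_case = relative_bias(1, 1000, 0.05)   # 1 / (1000 * 0.05)
print(weak_case, is_weak(1, 1000, 0.005))
print(strong_case, is_weak(1, 1000, 0.05))
```

Adding instruments that barely raise the partial `$R^2$` increases the numerator without much change in the denominator, so the relative bias grows.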

---