Endogeneity

Cards (30)

  • If OR holds, we say X1 and X2 are exogenous
  • If OR fails, we say at least one of the variables is endogenous
  • causes: omitted variable, measurement error, simultaneity/reverse causality
  • solutions to endogeneity: try to say something definite about the likely biases of OLS, include proxies for omitted variables, use RCTs/natural experiments, instrumental variables
  • No omitted variable bias if either the omitted variable is uncorrelated with the included regressors, or the omitted variable is irrelevant to the determination of Y
  • Proxying for omitted variables typically motivates the inclusion of demographic variables like family characteristics, and race and region dummies etc.
  • proxies cannot be given a causal interpretation as they are not included in the causal model
  • the coefficient on a mismeasured regressor is subject to attenuation bias (biased towards 0), coefficients on other regressors may be biased in either direction
  • RCT characterised by: 1. a sample of individuals under study who have elected or been compelled to participate 2. assignment of Xi being under the control of the researcher for the individuals in that sample
  • actual treatment equals assigned treatment if there is perfect compliance
  • RCTs make X and u independent by construction
  • Limitation of RCTs: needs to be feasible and ethical, so only possible for some kinds of 'treatment'
  • population linear regression coefficient is the difference of population group means, if treatment is binary
  • control variables used for RCTs must be pre-treatment characteristics
  • can test for balancedness: the treated and the untreated should have approximately similar pre-treatment characteristics
  • a shortcoming of the average treatment effect (ATE) is that it averages the effect of treatment over people who may never receive it in reality
  • internal validity: are inferences on causal effects credible for the population studied?
  • external validity: can inferences be credibly generalised to other populations?
  • threats to internal validity: imperfect compliance, small samples, attrition (people may drop out), Hawthorne effects
  • problems for external validity arise when: populations differ in a way that matters for the determination of Y and which is not accounted for by the model
  • a quasi-/natural experiment is an observational study where 'nature' partly replicates an RCT - X is 'as if' randomly assigned
  • instrumental variables must satisfy: 1. Z is correlated with X (relevance) 2. Z is uncorrelated with u (exogeneity) 3. Z does not enter the structural equation (exclusion)
  • exogeneity and exclusion of IVs mean changes in Z do not affect Y directly, relevance means a change in Z affects Y by 'shifting' X
  • source of endogeneity in X is not important for IVs, as long as we have a Z that fulfils the criteria it can be used as an IV
  • under homoskedasticity, 2SLS is less efficient than OLS
  • weak instruments arise when the relevance condition technically holds, but with a first-stage coefficient too small relative to the sample size for the normal approximation to be reliable
  • Can use an F test on the instruments in the first stage to detect weak instruments, using the larger rule-of-thumb critical value c = 10
  • If you have weak instruments, standard inferences cannot be drawn for Beta1 and usual confidence intervals may be misleadingly narrow
  • if random assignment fails, can randomly assign an inducement - if the inducement affects uptake of treatments and is independent of individual characteristics it is an instrumental variable
  • Local average treatment effect (LATE) = a population average of individual-level treatment effects, weighted by responsiveness to the instrument
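
Worked example (illustrative only). The cards on omitted variable bias, the IV conditions, and the weak-instrument F test can be checked with a small simulation. This is a minimal sketch under an assumed data-generating process: the coefficients (2, 3, 0.8), the omitted variable W, and the function names simulate, ols_slope, iv_slope, and first_stage_f are all hypothetical and not taken from the cards. With one instrument and one endogenous regressor, 2SLS reduces to the ratio cov(Z, Y) / cov(Z, X), which is what the sketch computes.

```python
import numpy as np


def simulate(n=20_000, seed=0):
    """Draw (z, x, y) from an assumed model with an omitted variable w.

    Assumed structural equation: y = 1 + 2*x + u, with u = 3*w + noise.
    x depends on w, so x is endogenous; z shifts x but is independent of u.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                      # instrument
    w = rng.normal(size=n)                      # omitted variable
    x = 0.8 * z + w + rng.normal(size=n)        # first stage
    y = 1.0 + 2.0 * x + 3.0 * w + rng.normal(size=n)
    return z, x, y


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


def iv_slope(z, x, y):
    """Just-identified IV estimate: cov(z, y) / cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


def first_stage_f(z, x):
    """F statistic for the single instrument in the regression of x on z."""
    n = len(x)
    r2 = np.corrcoef(z, x)[0, 1] ** 2
    return r2 / ((1.0 - r2) / (n - 2))


if __name__ == "__main__":
    z, x, y = simulate()
    print("OLS slope:    ", ols_slope(x, y))      # biased away from 2
    print("IV slope:     ", iv_slope(z, x, y))    # close to 2
    print("First-stage F:", first_stage_f(z, x))  # compare with c = 10
```

In this setup OLS overstates the true slope because W is positively correlated with X and raises Y, while the IV estimate stays close to the assumed value of 2. Shrinking the 0.8 first-stage coefficient towards 0 is one way to see the first-stage F fall towards the rule-of-thumb threshold and the IV estimate become erratic.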