We can always use regression to answer descriptive questions, but only sometimes to answer causal questions
Orthogonality: cov(X,u)=0, which is the same as E(Xu)=0 when E(u)=0 - all other determinants of Y (collected in u) should be uncorrelated with X - OR means E(Xu)=0 plus E(u)=0
If orthogonality holds, the population regression coefficients identify the coefficients of the causal model
If orthogonality holds, the OLS estimate for Beta1/Beta2 converges in probability to the population regression coefficient (is consistent), which then equals the causal coefficient, so causal interpretations based on OLS estimates are valid
If orthogonality is not satisfied in the causal model, the population regression coefficient rho1 no longer agrees with the causal model coefficient - OLS estimates then only have a descriptive interpretation in terms of rho1
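To see why the two coefficients agree under orthogonality and differ otherwise, here is a short derivation for the single-regressor case. It is a sketch under assumptions not written out in these notes: the causal model has intercept beta0 and slope beta1, rho1 is the population regression slope cov(X,Y)/var(X), and var(X)>0.

```latex
\begin{align*}
Y &= \beta_0 + \beta_1 X + u && \text{(causal model)} \\
\rho_1 &= \frac{\operatorname{cov}(X, Y)}{\operatorname{var}(X)}
        = \frac{\operatorname{cov}(X, \beta_0 + \beta_1 X + u)}{\operatorname{var}(X)}
        = \beta_1 + \frac{\operatorname{cov}(X, u)}{\operatorname{var}(X)}
\end{align*}
```

So rho1 = beta1 exactly when cov(X,u)=0, and the gap between them is cov(X,u)/var(X) otherwise.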
If OR fails, we can say 'on average, a unit increase in X is associated with a rho1 increase in Y'
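A small simulation can illustrate the two cases. This is only a sketch with made-up numbers: the unobserved variable W, the value beta1 = 2 and the helper names are assumptions for illustration. W drives both X and u, so cov(X,u) = delta and var(X) = 2, and the OLS slope should settle near rho1 = beta1 + delta/2 rather than beta1.

```python
import numpy as np

def ols_slope(x, y):
    """Sample analogue of cov(X, Y) / var(X)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

def simulate(delta, n=200_000, beta0=1.0, beta1=2.0, seed=0):
    """Causal model Y = beta0 + beta1*X + u with u = delta*W + e.

    W also drives X, so cov(X, u) = delta and var(X) = 2.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n)              # unobserved determinant of Y
    x = w + rng.normal(size=n)          # X is correlated with W
    u = delta * w + rng.normal(size=n)  # error term of the causal model
    y = beta0 + beta1 * x + u
    return ols_slope(x, y)

slope_or_holds = simulate(delta=0.0)    # cov(X, u) = 0
slope_or_fails = simulate(delta=1.0)    # cov(X, u) = 1

print(f"OR holds: slope = {slope_or_holds:.3f} (beta1 = 2)")
print(f"OR fails: slope = {slope_or_fails:.3f} (rho1 = 2 + 1/2)")
```

In the second case the estimate is still a perfectly good estimate of rho1, so the descriptive statement remains valid; it just is not beta1.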
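A causal interpretation can only come from a causal model, i.e. from economic theory, not from the regression itself
Mean independence, E(u|X)=E(u), implies cov(X,u)=0, i.e. orthogonality; the converse does not hold
Mean independence is needed only for unbiasedness of Beta1hat; orthogonality alone is enough for consistency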
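An example where orthogonality holds but mean independence fails may help separate the two conditions. This is a sketch with an assumed construction, not one taken from these notes: X is standard normal and u = X^2 - 1, so E(u)=0 and E(Xu)=E(X^3)-E(X)=0, yet E(u|X)=X^2-1 clearly depends on X. The example only shows that the conditions differ; it does not by itself demonstrate bias.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)
u = x**2 - 1.0                  # E(u) = 0 and E(Xu) = E(X^3) - E(X) = 0

mean_u = u.mean()               # sample analogue of E(u)
mean_xu = (x * u).mean()        # sample analogue of E(Xu)

tail = np.abs(x) > 1.5
mean_u_tail = u[tail].mean()    # sample analogue of E(u | |X| > 1.5)

print(f"E(u)  ~ {mean_u:.3f}")
print(f"E(Xu) ~ {mean_xu:.3f}")
print(f"E(u | |X| > 1.5) ~ {mean_u_tail:.3f}")
```

The first two averages are close to zero, while the conditional average is far from zero, which is what a failure of mean independence looks like.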
Multiple regression: if OR fails, rho1 can only be interpreted descriptively: 'on average, a unit increase in X1 is associated with a rho1 increase in Y, holding each of X2,...,Xk constant'
A good proxy is a 'close correlate' or 'good predictor' of an unobserved determinant of Y
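To make 'close correlate' concrete, the sketch below compares a good and a poor proxy. Everything in it is an assumption for illustration: the unobserved determinant A, the coefficient values, the noise levels, the helper `coef_on_x`, and the choice to use the proxy as an additional regressor alongside X.

```python
import numpy as np

def coef_on_x(y, *regressors):
    """OLS coefficient on the first regressor, with an intercept included."""
    design = np.column_stack([np.ones(len(y)), *regressors])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs[1]

rng = np.random.default_rng(2)
n = 200_000
a = rng.normal(size=n)                              # unobserved determinant of Y
x = a + rng.normal(size=n)                          # X is correlated with A
y = 1.0 + 2.0 * x + 1.0 * a + rng.normal(size=n)    # causal beta1 = 2

good_proxy = a + 0.3 * rng.normal(size=n)           # close correlate of A
poor_proxy = a + 2.0 * rng.normal(size=n)           # weak correlate of A

b_none = coef_on_x(y, x)                # A left in the error term
b_good = coef_on_x(y, x, good_proxy)    # A replaced by a good proxy
b_poor = coef_on_x(y, x, poor_proxy)    # A replaced by a poor proxy

print(f"no proxy:   {b_none:.3f}")
print(f"good proxy: {b_good:.3f}")
print(f"poor proxy: {b_poor:.3f}")
```

In this setup the coefficient on X moves most of the way towards 2 with the good proxy and only slightly with the poor one, because the poor proxy leaves most of A in the error term.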