Linear regression and statistical inference

Cards (30)

  • Frisch-Waugh-Lovell theorem: the coefficient on X1 can be recovered in a two-stage process - regress X1 on X2 and keep the residual Xtilde1 (the part of X1 uncorrelated with X2), then regress Y on Xtilde1 (see the FWL sketch after this list)
  • Perfect multicollinearity arises when one regressor is an exact linear function of the other regressor(s)
  • If you have m mutually exclusive and collectively exhaustive categories, include only m-1 membership dummies - including all m alongside an intercept creates perfect multicollinearity (the dummy variable trap)
  • Standard error of regression (SER) is the estimated standard deviation of u
  • SER uses the divisor n-k-1 (a degrees-of-freedom correction) rather than n, which corrects the downward bias of the residual variance estimate; large-sample properties are unaffected
  • TSS = ESS + SSR
  • R^2 = ESS/TSS = 1 - SSR/TSS
  • R^2 may increase even if the added regressor is irrelevant
  • adjusted R^2 = 1 - [(n-1)/(n-k-1)] x SSR/TSS (see the fit-statistics sketch after this list)
  • Least Squares Assumptions: u satisfies mean independence/orthogonality (E[u|X] = 0), (Yi, X1i, ..., Xki) are i.i.d., Y and the X's have finite fourth moments, and there is no perfect multicollinearity
  • having as little variability around Beta1 as possible = efficiency
  • under homoskedastic, normally distributed errors, OLS has the smallest variance amongst all unbiased estimators
  • Under the LSA, OLS is consistent
  • Homoskedastic: conditional variance of u does not depend on the regressors
  • there is rarely a good reason to assume homoskedasticity
  • under the null, the t-stat has (approximately, in large samples) a standard normal distribution
  • the t-stat diverges under the alternative hypothesis, so large absolute values of the t-stat are evidence against the null
  • type 1 error: reject the null when it is true
  • type 2 error: fail to reject the null when it is false
  • size of a test = type 1 error rate
  • power of a test: ability to detect a false null = 1 - type 2 error rate
  • p-value: the smallest significance level at which we would have rejected the null, on the basis of the sample OR the probability, under the null, of obtaining a value of the test statistic 'at least as unfavourable to the null' as the test statistic calculated
  • 95% of 95% confidence intervals will contain the true value of the parameter
  • use the F-statistic to test multiple hypotheses at once
  • Idea of the F test: estimate regression model with null imposed (restricted model) and without (unrestricted model) and measure how much worse the restricted model 'fits' the data
  • F = [(SSR[restricted] - SSR[unrestricted])/q] / [SSR[unrestricted]/(n-k-1)], where q is the number of restrictions under the null (see the worked F-test example after this list)
  • Can use an F-statistic to test the null of linearity against an rth degree polynomial
  • Linear-log model: 'a 1% increase in X changes Y by about 0.01 x Beta1'
  • Log-linear model: 'a 1-unit increase in X changes Y by about 100 x Beta1 %'
  • Log-Log model: Beta1 is the elasticity of Y w.r.t. X (the three specifications are written out after this list)
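A minimal numerical sketch of the FWL card above, using only NumPy. The simulated data, the variable names (x1, x2, y) and the helper ols are illustrative assumptions, not part of the cards; the point is that the two-stage route reproduces the full-regression coefficient on X1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# Simulated data: x1 and x2 are correlated, y depends on both.
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

def ols(X, y):
    """OLS coefficients of y on X (X already includes any constant)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)

# Full regression: y on (1, x1, x2).
beta_full = ols(np.column_stack([const, x1, x2]), y)

# FWL two-stage route:
# 1) regress x1 on (1, x2) and keep the residual x1_tilde
gamma = ols(np.column_stack([const, x2]), x1)
x1_tilde = x1 - np.column_stack([const, x2]) @ gamma
# 2) regress y on the residual; the slope equals the coefficient on x1 above
beta_fwl = ols(x1_tilde.reshape(-1, 1), y)

print(beta_full[1], beta_fwl[0])  # the two estimates coincide
```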
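A self-contained fit-statistics sketch tying together the TSS/ESS/SSR, R^2, adjusted R^2 and SER cards. The data-generating process and names (X, y, k) are assumptions made for illustration; the formulas follow the cards.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 500, 2                          # sample size, number of regressors (excl. constant)
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -1.5]) + rng.normal(size=n)

beta = np.linalg.lstsq(X, y, rcond=None)[0]
u_hat = y - X @ beta                   # residuals

SSR = np.sum(u_hat ** 2)
TSS = np.sum((y - y.mean()) ** 2)
ESS = TSS - SSR                        # TSS = ESS + SSR

r2 = ESS / TSS                         # = 1 - SSR/TSS
adj_r2 = 1 - (n - 1) / (n - k - 1) * SSR / TSS
ser = np.sqrt(SSR / (n - k - 1))       # SER: estimated std. dev. of u, divisor n-k-1

print(r2, adj_r2, ser)
```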
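A worked example of the (homoskedasticity-only) F-test idea: fit the restricted model with the null imposed and the unrestricted model, then compare SSRs. The simulated data, the null beta2 = beta3 = 0, and the helper ssr are illustrative assumptions; scipy is assumed available for the p-value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 500
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 2.0 * x1 + rng.normal(size=n)      # true beta2 = beta3 = 0

def ssr(X, y):
    """Sum of squared residuals from OLS of y on X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return np.sum((y - X @ beta) ** 2)

const = np.ones(n)
X_unrestricted = np.column_stack([const, x1, x2, x3])
X_restricted = np.column_stack([const, x1])  # null imposed: beta2 = beta3 = 0

k = 3                                        # regressors in the unrestricted model
q = 2                                        # number of restrictions under the null

ssr_r, ssr_u = ssr(X_restricted, y), ssr(X_unrestricted, y)
F = ((ssr_r - ssr_u) / q) / (ssr_u / (n - k - 1))
p_value = stats.f.sf(F, q, n - k - 1)        # how much worse does the restricted fit get?
print(F, p_value)
```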
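The three logarithmic specifications from the last cards, written out. The 'about' qualifiers reflect that the percentage interpretations are approximations valid for small changes.

```latex
\begin{aligned}
\text{linear-log:}\quad & Y_i = \beta_0 + \beta_1 \ln(X_i) + u_i,
  && \text{a 1\% increase in } X \text{ changes } Y \text{ by about } 0.01\,\beta_1 \\
\text{log-linear:}\quad & \ln(Y_i) = \beta_0 + \beta_1 X_i + u_i,
  && \text{a one-unit increase in } X \text{ changes } Y \text{ by about } 100\,\beta_1\% \\
\text{log-log:}\quad & \ln(Y_i) = \beta_0 + \beta_1 \ln(X_i) + u_i,
  && \beta_1 \text{ is the elasticity of } Y \text{ with respect to } X
\end{aligned}
```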