Regression Analysis

Created by

xena

Cards (72)

Regression 
Analysis using correlation to make predictions
Explanatory and criterion variables
Explanatory (predictor, independent) variable
Criterion (outcome, dependent) variable
The linear model with one predictor 
Criterion variable (Y)
Explanatory variable (X)
Linear Regression 
A method by which we fit a straight line to the data
Regression line 
The line of best fit
As x increases by 1
y increases by 10 (a slope of b = 10)
Regression Equation 
y = a + bx
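A minimal sketch of the equation in use. The values a = 5 and b = 10 are hypothetical, chosen only so that a one-unit increase in x raises y by 10:

```python
def predict(x, a, b):
    """Return the predicted y for a given x using y = a + b*x."""
    return a + b * x

# Hypothetical values: intercept a = 5, slope b = 10.
a, b = 5.0, 10.0
y_at_0 = predict(0, a, b)           # at x = 0 the prediction is the intercept
y_at_1 = predict(1, a, b)
change_per_unit = y_at_1 - y_at_0   # a one-unit step in x changes y by the slope
```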
Linear Equations 
Y = bX + a
$Y_i = (\beta_1 X_i + \beta_0) + \varepsilon_i$
Linear relationship between X and Y
Slope ($\beta_1$ or b) - the gradient of the line
Intercept ($\beta_0$ or a) - the point at which the line crosses the vertical axis of the graph
Regression Equation 
Shows how y changes as a result of x changing
The steeper the slope, the more y changes for each one-unit change in x
As x increases
y decreases (a negative slope, b < 0)
Intercept 
The point at which the line crosses the y-axis
Which regression line gives the better prediction?
The linear model with several predictors 
Second predictor (X2) and the associated parameter (b2)
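A sketch of how the one-predictor equation extends to two predictors. The label b0 for the intercept is an assumption here (the one-predictor cards call it a):

```latex
Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \varepsilon_i
```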
What do we do in regression?
1. Estimate the model
2. Determine how well a line fits the data points by defining the distance between the line and each data point
Deviations 
The vertical distances between what the model predicted and each data point that was actually observed
Residuals 
The differences between what the model predicts and the observed data
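A small illustration of residuals on made-up data. The scores, the intercept and slope, and the sign convention (observed minus predicted) are all assumptions for the example:

```python
# Hypothetical data and an assumed fitted line y = 2.2 + 0.6x
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6

predicted = [a + b * xi for xi in x]

# Residual = observed value minus predicted value (one common sign convention)
residuals = [yi - pi for yi, pi in zip(y, predicted)]
```

Positive residuals mark cases the line under-predicts; negative residuals mark cases it over-predicts.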
Residual sum of squares (SSR) 
The sum of the squared residuals; a gauge of how well a linear model fits the data
Estimating the model: Method of Least Squares
The best-fitting line is the one that has the smallest total squared error
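A sketch of least squares for one predictor, using the standard closed-form estimates for the slope and intercept. The data and the helper names fit_line and squared_error are hypothetical:

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx             # slope
    a = mean_y - b * mean_x   # intercept: the line passes through the means
    return a, b

def squared_error(x, y, a, b):
    """Total squared error (sum of squared residuals) for the line y = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

x = [1, 2, 3, 4, 5]  # hypothetical predictor scores
y = [2, 4, 5, 4, 5]  # hypothetical outcome scores
a, b = fit_line(x, y)
best = squared_error(x, y, a, b)

# Nudging the slope or intercept away from the estimates makes the error larger
worse_slope = squared_error(x, y, a, b + 0.1)
worse_intercept = squared_error(x, y, a + 0.1, b)
```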
Standard error of estimate 
The standard distance between the predicted Y values on the regression line and the actual Y values in the data
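One common way to compute this for a single predictor is the square root of SSR divided by n - 2. The sketch below assumes that formula and uses hypothetical data with its least-squares line:

```python
import math

x = [1, 2, 3, 4, 5]  # hypothetical predictor scores
y = [2, 4, 5, 4, 5]  # hypothetical outcome scores
a, b = 2.2, 0.6      # least-squares intercept and slope for this toy data

ssr = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
df = len(x) - 2      # two parameters (a and b) were estimated
standard_error = math.sqrt(ssr / df)
```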
SST (Total sum of squares) 
Represents how good the mean is as a model of the observed outcome scores
SSR (residual sum of squares)
Used together with SST to calculate how much better the linear model is than the baseline model of "no relationship"
SSM (model sum of squares)
If the value is large, the linear model is very different from using the mean to predict the outcome variable
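A sketch of the three sums of squares on hypothetical data, assuming a least-squares line with an intercept (the case in which SST = SSM + SSR holds):

```python
x = [1, 2, 3, 4, 5]  # hypothetical predictor scores
y = [2, 4, 5, 4, 5]  # hypothetical outcome scores
a, b = 2.2, 0.6      # least-squares intercept and slope for this toy data

mean_y = sum(y) / len(y)
predicted = [a + b * xi for xi in x]

sst = sum((yi - mean_y) ** 2 for yi in y)                  # error when the mean is the model
ssr = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))  # error left over by the line
ssm = sum((pi - mean_y) ** 2 for pi in predicted)          # improvement of the line over the mean
```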
R²
The proportion of improvement due to the model (SSM / SST), i.e. the share of variance in the outcome that the model explains; often expressed as a percentage
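A short worked example, with hypothetical sums of squares, of turning SST and SSR into this proportion:

```python
# Hypothetical sums of squares for a fitted model
sst = 6.0   # total sum of squares
ssr = 2.4   # residual sum of squares

ssm = sst - ssr             # model sum of squares
r_squared = ssm / sst       # proportion of variation explained by the model
percent_explained = 100 * r_squared
```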
F-test
Based upon the ratio of the improvement due to the model (SSM) to the error in the model (SSR), each converted to a mean square by dividing by its degrees of freedom
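A sketch of the ratio as it is usually computed, with each sum of squares first divided by its degrees of freedom to give a mean square. The numbers are hypothetical:

```python
# Hypothetical values for a model with one predictor
ssm = 3.6   # model sum of squares
ssr = 2.4   # residual sum of squares
n = 5       # number of cases
k = 1       # number of predictors

ms_model = ssm / k               # mean square for the model
ms_residual = ssr / (n - k - 1)  # mean square for the residuals
f_ratio = ms_model / ms_residual
```

A model that improves prediction a lot relative to its error gives a large ratio.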
Outliers 
Cases that differ substantially from the main trend in the data
Standardized residuals 
Residuals converted to z-scores (mean of 0, sd of 1)
Studentized residuals 
Unstandardized residual divided by an estimate of its standard deviation that varies point by point
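A sketch contrasting standardized and studentized residuals for one predictor. It assumes one common convention: standardized residuals divide by the overall standard error of estimate, and studentized residuals additionally use each case's leverage. The data are hypothetical:

```python
import math

x = [1, 2, 3, 4, 5]  # hypothetical predictor scores
y = [2, 4, 5, 4, 5]  # hypothetical outcome scores
a, b = 2.2, 0.6      # least-squares intercept and slope for this toy data

n = len(x)
mean_x = sum(x) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))  # standard error of estimate

# Leverage (hat value) of each case in simple regression
leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]

# Standardized: every residual divided by the same overall estimate
standardized = [e / s for e in residuals]

# Studentized: the divisor varies point by point through the leverage
studentized = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]
```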
Adjusted predicted value 
The predicted value of the outcome for a case when the model is estimated with that case excluded
Deleted Residual 
The difference between the adjusted predicted value and the original observed value
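A sketch of adjusted predicted values and deleted residuals, obtained by refitting the line with each case left out in turn. The data, the helper fit_line, and the sign convention (observed minus adjusted predicted) are assumptions:

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    return mean_y - b * mean_x, b

x = [1, 2, 3, 4, 5]  # hypothetical predictor scores
y = [2, 4, 5, 4, 5]  # hypothetical outcome scores

adjusted_predicted = []
deleted_residuals = []
for i in range(len(x)):
    # Fit the model on every case except case i
    a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
    adjusted = a + b * x[i]                    # prediction for the excluded case
    adjusted_predicted.append(adjusted)
    deleted_residuals.append(y[i] - adjusted)  # observed minus adjusted predicted
```

A case whose deleted residual is much larger than its ordinary residual is pulling the fitted line toward itself.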
Studentized Deleted Residual 
Deleted residual divided by its standard error
Cook's Distance 
A measure of the overall influence of a case on the model
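A sketch of Cook's distance for one predictor, assuming the usual formula that combines each case's squared residual with its leverage. The data are hypothetical:

```python
x = [1, 2, 3, 4, 5]  # hypothetical predictor scores
y = [2, 4, 5, 4, 5]  # hypothetical outcome scores
a, b = 2.2, 0.6      # least-squares intercept and slope for this toy data

n = len(x)
p = 2                # estimated parameters: intercept and one slope
mean_x = sum(x) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
mse = sum(e ** 2 for e in residuals) / (n - p)  # mean squared error
leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]

# Cook's distance: large when a case has both a big residual and high leverage
cooks = [(e ** 2 / (p * mse)) * (h / (1 - h) ** 2)
         for e, h in zip(residuals, leverage)]
```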
Leverage (hat values) 
Gauges the influence of the observed value of the outcome variable over the predicted values
Mahalanobis Distance 
Measures the distance of cases from the mean(s) of the predictor variable(s)
Studentized Deleted Residual
The deleted residual divided by its standard error
Cook's Distance
A measure of the overall influence of a case on the model
Leverage (hat values)
Gauges the influence of the observed value of the outcome variable over the predicted values
Leverage (hat values) 
If there are no influential cases, all leverage values should be equal to the average value
Investigate cases with values greater than twice or three times the average
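A sketch of this check for one predictor, assuming the simple-regression hat-value formula and an average leverage of (k + 1) / n. The scores are hypothetical, with one deliberately extreme case:

```python
x = [1, 2, 3, 4, 20]  # hypothetical predictor scores; the last case is extreme
n = len(x)
k = 1                 # number of predictors

mean_x = sum(x) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)

# Hat value in simple regression: 1/n + (x - mean)^2 / Sxx
leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
average_leverage = (k + 1) / n

# Indices of cases above twice the average leverage
flagged = [i for i, h in enumerate(leverage) if h > 2 * average_leverage]
```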
Mahalanobis Distance 
Measures the distance of cases from the mean(s) of the predictor variable(s); the values have a chi-square distribution, with degrees of freedom equal to the number of predictors
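A sketch for the simplest case of a single predictor, where the squared Mahalanobis distance reduces to the squared deviation from the mean divided by the sample variance. The scores are hypothetical, and reporting the squared distance is an assumption about convention:

```python
x = [1, 2, 3, 4, 20]  # hypothetical predictor scores; the last case is extreme
n = len(x)

mean_x = sum(x) / n
variance = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)  # sample variance

# Squared Mahalanobis distance of each case from the predictor mean
d_squared = [(xi - mean_x) ** 2 / variance for xi in x]
```

With several predictors the same idea uses the covariance matrix of the predictors instead of a single variance.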
Mahalanobis Distance 
Cut-off points are established by looking up the critical value for the desired alpha level
For larger samples (e.g. N = 500) with 5 predictors, values > 25 are a major concern
For smaller samples (e.g. N = 100) with fewer predictors (e.g. 3), values > 15 are problematic
For very small samples (e.g. N = 30) with 2 predictors, values > 11 should be examined