Regression

Created by Norhafida

Cards (79)

Regression 
Analysis using correlation to make predictions
In this lesson 
1. Learn how to assess the relationship between a dependent variable and one or more explanatory variables
2. Learn how to predict a person's score on the criterion variable from their scores on one or more explanatory variables
3. Learn how to use confidence limits when analyzing data with multiple regression
Agenda 
10.1 An Introduction to the linear model (regression)
10.2 Bias in linear models
10.3 Generalizing the model
10.4 Sample Size and the linear model
10.5 Fitting Linear Model: The General Procedure
10.6 Assumptions of regression analysis
10.7 Simple linear regression
10.8 Multiple regression
10.9 Reporting Linear Regression
Explanatory and criterion variables 
Explanatory (predictor, independent) variable
Criterion (outcome, dependent) variable
The linear model with one predictor 
Criterion variable (Y)
Explanatory variable (X)
Linear Regression 
A method by which we fit a straight line to the data
Regression line 
The line of best fit
As x increases by 1 
y increases by 10
Regression Equation 
y = a + bx
Linear Equations 
Y = bX + a
Yᵢ = (β₁Xᵢ + β₀) + εᵢ
Linear relationship between X and Y
Slope (β₁ or b) - the gradient of the line
Intercept (β₀ or a) - the point at which the line crosses the vertical axis of the graph
Regression Equation 
Shows how y changes as a result of x changing
The steeper the slope, the more y changes for each one-unit change in x
What is someone's predicted score on y when their score on x = 20? Assume a = 5 and b = 2.
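The question can be answered by substituting the given values into y = a + bx. A minimal Python sketch (the function name `predict` is illustrative and not part of the deck):

```python
def predict(a, b, x):
    """Predicted score on y from the regression equation y = a + b*x."""
    return a + b * x


# a = 5 (intercept), b = 2 (slope), x = 20, as stated in the question
print(predict(a=5, b=2, x=20))  # 45
```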
As x increases 
y decreases
As x increases by 1 
y decreases by 3
Predict the score of a person who watched 3.5 hours of TV per night, given y = 18 - 3x
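Substituting x = 3.5 into the equation given on the card works out as follows (writing the predicted score as ŷ is a notational assumption, not the deck's):

```latex
\hat{y} = 18 - 3x = 18 - 3(3.5) = 18 - 10.5 = 7.5
```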
For every one-unit increase in x
y increases by 5
Non-Perfect Relationships 
Draw the line in the best place possible: the place where, taken together, the dots lie as close to the line as possible → best fit
How do you know the values of a and b?
The linear model with several predictors 
Notice the second predictor (X2) and the associated parameter (b2)
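The equation this card points to is not reproduced in the deck. Assuming the usual form of the linear model with two predictors, written in the b notation the card uses, it would read:

```latex
Y_i = (b_0 + b_1 X_{1i} + b_2 X_{2i}) + \varepsilon_i
```

Here b_0 is the intercept, b_1 and b_2 are the parameters attached to the first and second predictors, and \varepsilon_i is the error for case i.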
What do we do in regression? 
1. We estimate the model
2. To determine how well a line fits the data points, the first step is to define mathematically the distance between the line and each data point
3. We can assess the fit of a model by looking at the deviations between the model and the data collected
Residuals 
The differences between the observed (actual) scores and the scores the model predicts.
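As a rough illustration, residuals can be computed by subtracting each predicted score from the matching observed score. The data below are made up for the example, and the intercept and slope are the least-squares values for those made-up data:

```python
x = [1, 2, 3, 4, 5]   # illustrative predictor scores (not from the deck)
y = [2, 4, 5, 4, 5]   # illustrative observed outcome scores
a, b = 2.2, 0.6       # least-squares intercept and slope for these data

predicted = [a + b * xi for xi in x]
# residual = observed score minus predicted score
residuals = [yi - pi for yi, pi in zip(y, predicted)]

print([round(e, 2) for e in residuals])
```

For a least-squares line, the residuals sum to zero (up to floating-point rounding).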
Residual sum of squares (SSR)
The sum of the squared residuals; a gauge of how well a linear model fits the data
Estimating the model: Method of Least Squares 
1. The best-fitting line is the one that has the smallest total squared error
2. This line is called the least-squared-error solution
3. For each value of X in the data, this equation determines the point on the line that gives the best prediction of Y
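For a single predictor, the least-squares line has a closed-form solution: the slope is the sum of cross-products of deviations divided by the sum of squared deviations of x, and the intercept follows from the two means. A minimal sketch, with `fit_line` as a hypothetical helper name and made-up data:

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # sum of cross-products and sum of squared deviations of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept: the line passes through the means
    return a, b


x = [1, 2, 3, 4, 5]   # illustrative data, not from the deck
y = [2, 4, 5, 4, 5]
a, b = fit_line(x, y)
print(round(a, 3), round(b, 3))
```

For these made-up data the intercept is about 2.2 and the slope about 0.6.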
Standard error of estimate 
The standard distance between the predicted Y values on the regression line and the actual Y values in the data
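One common way to compute this quantity, assuming a single predictor, is the square root of the residual sum of squares divided by n - 2 degrees of freedom. A sketch with made-up data:

```python
import math

x = [1, 2, 3, 4, 5]   # illustrative data, not from the deck
y = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6       # least-squares intercept and slope for these data

ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
df = len(x) - 2       # two parameters (a and b) were estimated
standard_error = math.sqrt(ss_residual / df)

print(round(standard_error, 3))
```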
Assessing the goodness of fit, sum of squares, R and R²
1. SSR tells us how much error there is in a model, but it does not tell us whether using the model is better than nothing
2. We need to compare the model against a baseline to see whether it "improves" how well we can predict the outcome
3. We fit the baseline model, using the mean of the outcome
4. Then we fit the best model, and calculate the error, SSR
5. If the model is good, it should have significantly less error than the baseline model
SST (Total sum of squares) 
Represents how good the mean is as a model of the observed outcome scores
SSR (residual sum of squares)
Can be used to calculate how much better the linear model is than the baseline model of "no relationship"
SSM (model sum of squares)
If the value is large, the linear model is very different from using the mean to predict the outcome variable
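The three sums of squares can be computed side by side to see how they relate. The sketch below uses made-up data and the least-squares line for those data; the variable names are illustrative:

```python
x = [1, 2, 3, 4, 5]   # illustrative data, not from the deck
y = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6       # least-squares intercept and slope for these data

mean_y = sum(y) / len(y)
predicted = [a + b * xi for xi in x]

# SST: error when the mean is used as the (baseline) model
ss_total = sum((yi - mean_y) ** 2 for yi in y)
# SSR: error that remains when the linear model is used
ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
# SSM: difference between the linear model and the mean
ss_model = sum((pi - mean_y) ** 2 for pi in predicted)

print(round(ss_total, 3), round(ss_residual, 3), round(ss_model, 3))
```

For a least-squares line, SST equals SSM plus SSR, so the improvement due to the model can also be obtained as SST - SSR.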
R² 
The proportion of improvement due to the model (SSM relative to SST); multiplied by 100, it is expressed as a percentage
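Written as a formula, using the sums of squares defined on the neighbouring cards (the numbers in the second line are hypothetical, chosen only to show the arithmetic):

```latex
R^2 = \frac{SS_M}{SS_T}
\qquad \text{e.g. } SS_M = 3.6,\; SS_T = 6 \;\Rightarrow\; R^2 = \frac{3.6}{6} = 0.6 = 60\%
```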
F-test 
Based upon the ratio of the improvement due to the model (SSM) to the error in the model (SSR), each expressed as a mean square
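A sketch of the ratio, assuming the usual degrees of freedom: the number of predictors for the model, and the number of cases minus the number of predictors minus 1 for the residual. The function name and input values are hypothetical:

```python
def f_ratio(ss_model, ss_residual, n_cases, n_predictors):
    """F = mean square for the model / mean square for the residual."""
    ms_model = ss_model / n_predictors
    ms_residual = ss_residual / (n_cases - n_predictors - 1)
    return ms_model / ms_residual


# hypothetical values: SSM = 3.6, SSR = 2.4, 5 cases, 1 predictor
f_value = f_ratio(ss_model=3.6, ss_residual=2.4, n_cases=5, n_predictors=1)
print(round(f_value, 3))
```

With these inputs the ratio is about 4.5. Judging whether that is significant requires comparing it with the F distribution for 1 and 3 degrees of freedom, which is not shown here.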
Bias in Linear Models 
Is the model influenced by a small number of cases?
Does the model generalize to other samples?
Outliers 
Cases that differ substantially from the main trend in the data
Standardized residuals 
Residuals converted to z-scores (mean of 0, sd of 1)
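A minimal sketch of the conversion, assuming the standard deviation of the residuals is estimated by the standard error of estimate (residual sum of squares over n - 2 for one predictor). The data are made up:

```python
import math

x = [1, 2, 3, 4, 5]   # illustrative data, not from the deck
y = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6       # least-squares intercept and slope for these data

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
# estimated standard deviation of the residuals (assumption: df = n - 2)
s = math.sqrt(sum(e ** 2 for e in residuals) / (len(x) - 2))
standardized = [e / s for e in residuals]

print([round(z, 3) for z in standardized])
```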
Studentized residuals 
Unstandardized residual divided by an estimate of its standard deviation that varies point by point
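One common version (the internally studentized residual) divides each residual by s·sqrt(1 - h), where h is that case's leverage, so the divisor changes from point to point. The sketch below assumes a single predictor and uses made-up data:

```python
import math

x = [1, 2, 3, 4, 5]   # illustrative data, not from the deck
y = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6       # least-squares intercept and slope for these data

n = len(x)
mean_x = sum(x) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
# leverage of each case in simple regression
leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
# each residual is divided by its own estimated standard deviation
studentized = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]

print([round(t, 3) for t in studentized])
```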
Adjusted predicted value 
The predicted value of the outcome for a case, taken from a model estimated with that case excluded
Deleted Residual 
The difference between the adjusted predicted value and the original observed value
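Both the adjusted predicted value and the deleted residual can be obtained by refitting the line once per case with that case left out. In the sketch below, `fit_line` is a hypothetical helper, the data are made up, and the sign convention (observed minus adjusted predicted) is an assumption:

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    return mean_y - b * mean_x, b


x = [1, 2, 3, 4, 5]   # illustrative data, not from the deck
y = [2, 4, 5, 4, 5]

adjusted_predicted = []
deleted_residuals = []
for i in range(len(x)):
    # refit the model without case i
    a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
    adjusted = a + b * x[i]                    # prediction for the excluded case
    adjusted_predicted.append(adjusted)
    deleted_residuals.append(y[i] - adjusted)  # observed minus adjusted predicted

print([round(v, 3) for v in adjusted_predicted])
print([round(v, 3) for v in deleted_residuals])
```

A deleted residual that is much larger than the ordinary residual for the same case suggests that the case is pulling the fitted line toward itself.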
Studentized Deleted Residual 
Deleted residual divided by its standard error
Cook's Distance 
A measure of the overall influence of a case on the model
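A sketch of one standard formula for Cook's distance, D = e² · h / (p · MSE · (1 - h)²), where e is the residual, h the leverage, p the number of estimated parameters and MSE the residual mean square. It assumes a single predictor (so p = 2) and uses made-up data:

```python
x = [1, 2, 3, 4, 5]   # illustrative data, not from the deck
y = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6       # least-squares intercept and slope for these data

n = len(x)
p = 2                 # parameters estimated: intercept and slope
mean_x = sum(x) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
mse = sum(e ** 2 for e in residuals) / (n - p)             # residual mean square
leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]  # hat values

cooks = [e ** 2 * h / (p * mse * (1 - h) ** 2)
         for e, h in zip(residuals, leverage)]

print([round(d, 3) for d in cooks])
```

The first case combines a sizeable residual with high leverage, so it has the largest distance in this example.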
Leverage (hat values) 
Gauges the influence of the observed value of the outcome variable over the predicted values
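With a single predictor, the hat value of a case can be computed directly as 1/n plus its squared distance from the mean of x divided by the sum of squared deviations of x. A sketch under that assumption, with made-up predictor scores:

```python
x = [1, 2, 3, 4, 5]   # illustrative predictor scores, not from the deck

n = len(x)
mean_x = sum(x) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)

# hat value for each case: larger when x is far from the mean of x
leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]

print([round(h, 3) for h in leverage])
```

The hat values sum to the number of estimated parameters (2 here), so the average leverage is 2/n.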