the ultimate psy101 reviewer (not rlly Bismillah)
Regression Analysis
Created by xena
Cards (72)
Regression: Analysis using correlation to make predictions
Explanatory and criterion variables
Explanatory (predictor, independent) variable
Criterion (outcome, dependent) variable
The linear model with one predictor
Criterion variable (Y)
Explanatory variable (X)
Linear Regression: A method by which we fit a straight line to the data
Regression line: The line of best fit
As x increases by 1, y increases by 10
Regression Equation: y = a + bx
Linear Equations
Y = bX + a
Yᵢ = (β₁Xᵢ + β₀) + εᵢ
Linear relationship between X and Y
Slope (β₁ or b): the gradient of the line
Intercept (β₀ or a): the point at which the line crosses the vertical axis of the graph
Regression Equation
Shows how y changes as a result of x changing
The steeper the slope, the more y changes as a result of x
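To make the slope and intercept concrete, here is a minimal sketch of the regression equation y = a + bx as a function. The coefficient values (a = 5, b = 10) are made up for illustration; the slope of 10 simply mirrors the "as x increases by 1, y increases by 10" card.

```python
def predict(x, a, b):
    """Return the predicted y for a given x using y = a + b*x."""
    return a + b * x

# Hypothetical coefficients, not taken from any real data set.
a, b = 5.0, 10.0

# The slope is the change in predicted y when x goes up by one unit.
change_per_unit = predict(3, a, b) - predict(2, a, b)

# The intercept is the predicted y when x is 0 (where the line crosses the y-axis).
at_zero = predict(0, a, b)
```

A negative b would give the opposite pattern: as x increases, the predicted y decreases.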
As x increases, y decreases
Intercept: The point at which the line crosses the y-axis
Which regression line gives the better prediction?
The linear model with several predictors
Second predictor (X2) and the associated parameter (b2)
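With several predictors, each one gets its own parameter and the contributions are summed. The sketch below assumes the usual form Y = b0 + b1·X1 + b2·X2; the coefficient and predictor values are hypothetical.

```python
def predict(xs, intercept, slopes):
    """Predicted outcome for one case: b0 + b1*x1 + b2*x2 + ..."""
    return intercept + sum(b * x for b, x in zip(slopes, xs))

# Hypothetical parameters: b0 = 1, b1 = 2, b2 = 0.5.
b0 = 1.0
slopes = [2.0, 0.5]

# One case with X1 = 3 and X2 = 4.
y_hat = predict([3.0, 4.0], b0, slopes)
```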
What do we do in regression?
1. Estimate the model
2. Determine how well a line fits the data points by defining the distance between the line and each data point
Deviations: The vertical distances between what the model predicted and each data point that was observed
Residuals: The differences between what the model predicts and the observed data
Residual sum of squares (SSR): A gauge of how well a linear model fits the data
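One way to see how residuals and SSR answer "which regression line gives the better prediction?" is to compute SSR for two candidate lines on the same data. The data set and both candidate lines below are invented for illustration.

```python
def residual_sum_of_squares(xs, ys, a, b):
    """Sum of squared residuals for the line y = a + b*x."""
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    return sum(e ** 2 for e in residuals)

# Made-up data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Two hypothetical candidate lines: the smaller SSR is the better fit.
ssr_line_1 = residual_sum_of_squares(xs, ys, a=2.2, b=0.6)
ssr_line_2 = residual_sum_of_squares(xs, ys, a=1.0, b=1.0)
better = "line 1" if ssr_line_1 < ssr_line_2 else "line 2"
```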
Estimating the model: Method of Least Squares
The best-fitting line is the one that has the smallest total squared error
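For a single predictor, the least-squares slope and intercept have a closed form: b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and a = ȳ − b·x̄. The sketch below applies those formulas to a small made-up data set; the function name `fit_least_squares` is just an illustrative choice.

```python
def fit_least_squares(xs, ys):
    """Return (a, b) for the line y = a + b*x with the smallest total squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b

# Made-up data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_least_squares(xs, ys)
```

For this data the fitted line is approximately y = 2.2 + 0.6x.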
Standard error of estimate: The standard distance between the predicted Y values on the regression line and the actual Y values in the data
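A common formula for the standard error of estimate with one predictor is √(SSR / (n − 2)). The sketch below assumes that formula and a made-up data set; `fit_line` is a small helper written only for this example.

```python
import math

def fit_line(xs, ys):
    """Least-squares intercept and slope for one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - b * mean_x, b

def standard_error_of_estimate(xs, ys):
    """sqrt(SSR / (n - 2)): typical distance between observed and predicted Y."""
    a, b = fit_line(xs, ys)
    ssr = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(ssr / (len(xs) - 2))

# Made-up data.
see = standard_error_of_estimate([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```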
SST (Total sum of squares): Represents how good the mean is as a model of the observed outcome scores
SSR (residual sum of squares): Can be used to calculate how much better the linear model is than the baseline model of "no relationship"
SSM (model sum of squares): If the value is large, the linear model is very different from using the mean to predict the outcome variable
R²: The proportion of improvement due to the model (SSM relative to SST), which can be expressed as a percentage
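The three sums of squares fit together as SST = SSM + SSR, and R² = SSM / SST. The sketch below works this through on a made-up data set, under the assumption of a single predictor fitted by least squares.

```python
def sums_of_squares(xs, ys):
    """Return (SST, SSR, SSM) for a one-predictor least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    a = mean_y - b * mean_x
    sst = sum((y - mean_y) ** 2 for y in ys)                    # error of the mean as a model
    ssr = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))   # error of the line
    ssm = sst - ssr                                             # improvement due to the line
    return sst, ssr, ssm

# Made-up data.
sst, ssr, ssm = sums_of_squares([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
r_squared = ssm / sst
percent_explained = 100 * r_squared
```

Here the line accounts for about 60% of the variation in the outcome.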
F-test: Based upon the ratio of the improvement due to the model (SSM) to the error in the model (SSR)
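In practice the ratio is taken between mean squares rather than raw sums: F = (SSM / k) / (SSR / (n − k − 1)), where k is the number of predictors. The sketch below assumes one predictor (k = 1) and a made-up data set.

```python
def f_ratio(xs, ys):
    """F = mean square for the model / mean square for the residuals (one predictor)."""
    n, k = len(xs), 1
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    a = mean_y - b * mean_x
    predicted = [a + b * x for x in xs]
    ssm = sum((p - mean_y) ** 2 for p in predicted)
    ssr = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ms_model = ssm / k
    ms_residual = ssr / (n - k - 1)
    return ms_model / ms_residual

# Made-up data.
f_value = f_ratio([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```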
Outliers: Cases that differ substantially from the main trend in the data
Standardized residuals: Residuals converted to z-scores (mean of 0, SD of 1)
Studentized residuals: Unstandardized residual divided by an estimate of its standard deviation that varies point by point
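The difference between the two kinds of residual can be sketched as follows. This assumes one common convention: a standardized residual divides by the standard error of estimate s, while a studentized residual divides by s·√(1 − hᵢ), where hᵢ is the leverage of case i. The data are made up.

```python
import math

def scaled_residuals(xs, ys):
    """Return (standardized, studentized) residuals for a one-predictor model."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    a = mean_y - b * mean_x
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    # Leverage of each case in simple regression.
    leverage = [1 / n + (x - mean_x) ** 2 / sxx for x in xs]
    standardized = [e / s for e in residuals]
    studentized = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]
    return standardized, studentized

# Made-up data.
standardized, studentized = scaled_residuals([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

The divisor for the studentized version changes from case to case because it depends on each case's leverage.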
Adjusted predicted value: The predicted value of the outcome for a case when that case is excluded from the estimation of the model
Deleted Residual: The difference between the adjusted predicted value and the original observed value
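Both quantities can be computed directly from their definitions by refitting the model without the case in question. The sketch below does this for one predictor on a made-up data set; taking the deleted residual as observed minus adjusted predicted is an assumed sign convention.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - b * mean_x, b

def adjusted_predicted_value(xs, ys, i):
    """Predict case i from a model estimated without case i."""
    a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
    return a + b * xs[i]

# Made-up data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

adjusted = adjusted_predicted_value(xs, ys, 0)
deleted_residual = ys[0] - adjusted  # assumed sign: observed minus adjusted predicted
```

A large deleted residual suggests the case was pulling the line towards itself when it was included.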
Studentized Deleted Residual: The deleted residual divided by its standard error
Cook's Distance: A measure of the overall influence of a case on the model
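A standard formula for Cook's distance is Dᵢ = eᵢ²·hᵢ / (p·s²·(1 − hᵢ)²), where eᵢ is the residual, hᵢ the leverage, s² the residual mean square, and p the number of estimated parameters (2 for one predictor plus an intercept). The sketch below assumes that formula and a made-up data set.

```python
def cooks_distances(xs, ys):
    """Cook's distance for every case in a one-predictor model."""
    n, p = len(xs), 2  # p = intercept + one slope
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    a = mean_y - b * mean_x
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    s2 = sum(e ** 2 for e in residuals) / (n - p)  # residual mean square
    leverage = [1 / n + (x - mean_x) ** 2 / sxx for x in xs]
    return [
        (e ** 2 * h) / (p * s2 * (1 - h) ** 2)
        for e, h in zip(residuals, leverage)
    ]

# Made-up data.
distances = cooks_distances([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
most_influential = distances.index(max(distances))
```

The formula combines the size of a case's residual with its leverage, which is why it works as an overall influence measure.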
Leverage (hat values): Gauges the influence of the observed value of the outcome variable over the predicted values
Mahalanobis Distance: Measures the distance of cases from the mean(s) of the predictor variable(s)
Leverage (hat values)
If there are no influential cases, all leverage values should be close to the average value
Investigate cases with values greater than twice or three times the average
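The rule of thumb above can be sketched in code. For one predictor, the leverage of case i is hᵢ = 1/n + (xᵢ − x̄)² / Σ(x − x̄)², and the average leverage is (k + 1)/n with k predictors. The data set and the `multiplier` parameter name are illustrative assumptions.

```python
def leverage_values(xs):
    """Hat values for a one-predictor model."""
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return [1 / n + (x - mean_x) ** 2 / sxx for x in xs]

def flag_high_leverage(xs, multiplier=2, k=1):
    """Indices of cases whose leverage exceeds multiplier * average leverage."""
    average = (k + 1) / len(xs)
    hats = leverage_values(xs)
    return [i for i, h in enumerate(hats) if h > multiplier * average]

# Made-up predictor scores with one extreme value.
xs = [1, 2, 3, 4, 20]
hats = leverage_values(xs)
flagged_twice = flag_high_leverage(xs, multiplier=2)
flagged_thrice = flag_high_leverage(xs, multiplier=3)
```

With only five cases the extreme score is flagged at twice the average (0.8) but not at three times the average (1.2), since leverage cannot exceed 1.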
Mahalanobis Distance: Measures the distance of cases from the mean(s) of the predictor variable(s). These distances have a chi-square distribution
Mahalanobis Distance
Cut-off points are established by looking up the critical value for the desired alpha level
For larger samples (e.g. N=500) with 5 predictors, values >25 are a major concern
For smaller samples (e.g. N=100) with fewer predictors (e.g. 3), values >15 are problematic
For very small samples (e.g. N=30) with 2 predictors, values >11 should be examined
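As a rough illustration of how these distances are obtained and screened, the sketch below computes the squared Mahalanobis distance for two predictors by inverting the 2×2 sample covariance matrix by hand, then compares each value with a cut-off. The data are made up, and the cut-off of 11 is borrowed from the two-predictor guideline above purely for demonstration (the example has far fewer than 30 cases).

```python
def mahalanobis_squared(x1, x2):
    """Squared Mahalanobis distance of each case from the predictor means."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    # Sample variances and covariance (n - 1 in the denominator).
    s11 = sum(p * p for p in d1) / (n - 1)
    s22 = sum(q * q for q in d2) / (n - 1)
    s12 = sum(p * q for p, q in zip(d1, d2)) / (n - 1)
    det = s11 * s22 - s12 ** 2
    # d' S^-1 d written out for the 2x2 case.
    return [(s22 * p * p - 2 * s12 * p * q + s11 * q * q) / det
            for p, q in zip(d1, d2)]

# Made-up scores on two predictors.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

distances = mahalanobis_squared(x1, x2)
cutoff = 11  # illustrative only; see the guidelines above
flagged = [i for i, d in enumerate(distances) if d > cutoff]
```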