PSY 101


  • Regression
    Analysis using correlation to make predictions
  • In this lesson

     1. Learn how to assess the relationship between a dependent variable and one or more explanatory variables
     2. Learn how to predict a person's score on the criterion variable from their scores on one or more explanatory variables
     3. Learn how to use confidence limits when analyzing data with multiple regression
  • Agenda
    • 10.1 An introduction to the linear model (regression)
    • 10.2 Bias in linear models
    • 10.3 Generalizing the model
    • 10.4 Sample size and the linear model
    • 10.5 Fitting linear models: the general procedure
    • 10.6 Assumptions of regression analysis
    • 10.7 Simple linear regression
    • 10.8 Multiple regression
    • 10.9 Reporting linear regression
  • Explanatory and criterion variables

    • Explanatory (predictor, independent) variable
    • Criterion (outcome, dependent) variable
  • The linear model with one predictor

    • Criterion variable (Y)
    • Explanatory variable (X)
  • Linear Regression
    A method by which we fit a straight line to the data
  • Regression line
    The line of best fit
  • As x increases by 1
    y increases by 10
  • Regression Equation
    y = a + bx
  • Linear Equations

    • Y = bX + a
    • Yi = (𝛽1Xi + 𝛽0) + εi
  • Linear relationship between variables X and Y
    • Slope (𝛽1 or b) - gradient of the line
    • Intercept (𝛽0 or a) - The point at which the line crosses the vertical axis of the graph
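
A concrete illustration of slope and intercept; a minimal Python sketch in which the parameter values (b0 = 5, b1 = 2) are made up:

```python
import numpy as np

def predict(x, b0, b1):
    """Predicted outcome from the linear model Y = b1*X + b0."""
    return b1 * x + b0

# Made-up parameters: intercept b0 = 5, slope b1 = 2.
x = np.array([0, 1, 2, 3])
print(predict(x, b0=5, b1=2))  # [ 5  7  9 11]: y rises by the slope (2) per 1-unit step in x
```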
  • Regression Equation

    • Shows how y changes as x changes
    • The steeper the slope, the more y changes for a given change in x
  • What is someone's predicted score on y when their score on x = 20? Assume a = 5 and b = 2.
    y = a + bx = 5 + 2(20) = 45
  • As x increases
    y decreases
  • As x increases by 1

    y decreases by 3
  • Predict the score of a person who watched 3.5 hours of TV per night, given y = 18 - 3x.
    y = 18 - 3(3.5) = 7.5
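
A quick sanity check of the two worked predictions above, using the numbers straight from the cards:

```python
# Positive slope: a = 5, b = 2, x = 20  ->  y = 45
print(5 + 2 * 20)

# Negative slope: y = 18 - 3x, x = 3.5 hours of TV  ->  y = 7.5
print(18 - 3 * 3.5)
```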
  • Intercept
    The point at which the line crosses the y-axis
  • Non-Perfect Relationships

    • The straight line must be drawn so that it is as near as possible to the data points
  • How do you know the values of a and b?
    They are estimated from the data: ordinary least squares (below) picks the values that make the line fit the data points as closely as possible
  • The linear model with several predictors

    • Yi = 𝛽0 + 𝛽1X1i + 𝛽2X2i + εi
    • Notice the second predictor (X2) and the associated parameter (b2)
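
A sketch of estimating a two-predictor model by least squares with numpy; the data values here are invented for illustration:

```python
import numpy as np

# Invented data: outcome y and two predictors x1, x2.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
y = np.array([4.1, 5.9, 9.2, 9.8, 13.1])

# Design matrix with a column of 1s for the intercept b0.
X = np.column_stack([np.ones_like(x1), x1, x2])
(b0, b1, b2), *_ = np.linalg.lstsq(X, y, rcond=None)
print(b0, b1, b2)  # estimated intercept and the two slope parameters
```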
  • What do we do in regression?

    1. Estimate the model
     2. Determine how well the line fits the data points by mathematically defining the distance between the line and each data point
  • Deviations
    The vertical distances between what the model predicts and each observed data point
  • Residuals
    The differences between what the model predicts and the observed data
  • Residual sum of squares (SSR)

    A gauge of how well a linear model fits the data
  • Ordinary Least Squares (OLS) regression

    The line with the smallest SSR is the line of best fit
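
A minimal sketch of how OLS finds a and b for one predictor, using the standard textbook formulas (the data values are made up):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Slope: covariance of x and y divided by the variance of x.
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
# Intercept: the line passes through the point of means (x-bar, y-bar).
a = y.mean() - b * x.mean()

residuals = y - (a + b * x)    # observed minus predicted
ss_r = np.sum(residuals ** 2)  # residual sum of squares (SSR)
print(a, b, ss_r)              # any other line would give a larger SSR
```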
  • Standard error of estimate

    The standard distance between the predicted Y values on the regression line and the actual Y values in the data
  • SST (Total sum of squares)

    Represents how good the mean is as a model of the observed outcome scores
  • SSR (residual sum of squares)
    Can be used to calculate how much better the linear model is than the baseline model of "no relationship"
  • SSM (model sum of squares)
    If the value is large, the linear model is very different from using the mean to predict the outcome variable
  • R2
    The proportion of total variation in the outcome explained by the model (SSM divided by SST), often expressed as a percentage
  • F-test
    Based on the ratio of the improvement due to the model (SSM) to the error remaining in the model (SSR)
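
How the three sums of squares, R2, the F-test, and the standard error of estimate fit together for a one-predictor model; a sketch with made-up data (with one predictor, the model has 1 degree of freedom and the residuals have n - 2):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(y)

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
y_hat = a + b * x

ss_t = np.sum((y - y.mean()) ** 2)  # SST: the mean as a baseline model
ss_r = np.sum((y - y_hat) ** 2)     # SSR: error left over in the linear model
ss_m = ss_t - ss_r                  # SSM: improvement over the mean

r2 = ss_m / ss_t                    # proportion of variation explained
f = (ss_m / 1) / (ss_r / (n - 2))   # F = MS_model / MS_residual
se_est = np.sqrt(ss_r / (n - 2))    # standard error of estimate
print(r2, f, se_est)
```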
  • Bias in Linear Models

    • Is the model influenced by a small number of cases?
    • Does the model generalize to other samples?
  • Outliers
    Cases that differ substantially from the main trend in the data
  • Standardized residuals

    Residuals converted to z-scores (mean of 0, sd of 1)
  • Studentized residuals

    Unstandardized residual divided by an estimate of its standard deviation that varies point by point
  • Adjusted predicted value

    The predicted value of the outcome for a case when the model is estimated with that case excluded
  • Deleted Residual

    The difference between the adjusted predicted value and the original observed value
  • Studentized Deleted Residual

    Deleted residual divided by standard error
  • Cook's Distance

    A measure of the overall influence of a case on the model
  • Leverage (hat values)

    Gauges the influence of the observed value of the outcome variable over the predicted values
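
These case-level diagnostics are tedious by hand; a sketch of getting them from statsmodels (the data are invented, with one deliberately unusual case at the end):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import OLSInfluence

# Invented data; the last case departs from the trend.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 10.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8, 30.0])

model = sm.OLS(y, sm.add_constant(x)).fit()
infl = OLSInfluence(model)

print(infl.resid_studentized_internal)  # studentized residuals
print(infl.resid_studentized_external)  # studentized deleted residuals
print(infl.cooks_distance[0])           # Cook's distance for each case
print(infl.hat_matrix_diag)             # leverage (hat values)
```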