Cards (15)

    • What is the line of best fit defined by?
      y = ax + b
    • What are the fitted values?
      The prediction for y based on the observed value of x
    • What are the residuals?
      The vertical difference from the regression line to the recorded data point
    • When can the outcome be misleading?
      When data doesn't have a linear relationship or contains anomalies
    • When is a line of best fit not useful?
      Typically, the more complex a graph is, the less useful a line of best fit is.
    • What are the 5 steps in linear regression
      1. Plot the data
      2. Consider assumptions
      3. Fit the regression
      4. Diagnostic plots
      5. Plot the results with uncertainties
    • Step 1 in more detail
      When considering regression studies, it's important to consider third variables that may impact bout explanatory and response variables.
    • Step 2 in more detail
      Linearity of expected value, constant variance, independence, normally distributed residuals
    • What is the correlation coefficient?
      R, it is between -1 and 1
      1 means a perfect correlation with a positive gradient and -1 is a negative gradient
    • If y = 2x, what is the correlation between x and y?
      1
    • if y = -0.1, what is the correlation between x and y?
      -1
    • if R=0 what does that mean?
      This means that there is no correlation and no linear relationship between x and y
    • what does R^2 mean?
      This is the fraction of variance explained. If R^2 = 0.8, then 80% of the variance in y is explained by variance in x
    • When is one way analysis of variance (ANOVA) useful?
      When two or more levels in categorical explanatory variables.
      When there are two categories, ANOVA contains an f-test analogous to the two sample t-test (it is not identical).
    • What are the assumptions we make in ANOVA analysis?
      The validity of our data depends on the assumptions we make about our data
      • normally distributed residuals
      • independent errors (if errors are correlated)
      • random sampling - closely related to independent errors
      • homogeneity if variance (i.e residuals could all be sampled from same normal distribution)
    See similar decks