Measures of dispersion

Cards (7)

  • The range
    • Definition: the spread of scores between the highest + lowest value
    • What it means: tells us how spread out/dispersed our data is
    • How to calculate: subtract the lowest score from the highest score (some textbooks then add one, to allow for scores having been rounded to whole numbers)
    • How to interpret the result: large range = data is very spread out, small range = data is very close together
  • Variance
    • definition = the average of the squared distances between each score and the mean
    • tells us about the spread of scores around the mean (because the variance is calculated from the squared distance between each score in the data set and the mean)
    • small variance = scores are similar and close to the mean, large variance = scores are very different and far from the mean
  • How to calculate the variance (population variance + sample variance)

    1. calculate the mean
    2. subtract the mean from each number in your sample
    3. square the results of each of these calculations
    4. add the squared numbers together
    5. divide the sum of the squares by the number of participants (Ps) for the population variance, or by the number of Ps - 1 for the sample variance
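
The five steps above can be sketched in Python. This is only an illustration: the function name `variance`, the `sample` flag and the scores are made up for the example, not taken from the cards.

```python
def variance(scores, sample=False):
    """Follow the five steps: mean, deviations, squares, sum, divide."""
    n = len(scores)
    mean = sum(scores) / n                    # step 1: calculate the mean
    deviations = [x - mean for x in scores]   # step 2: subtract the mean from each score
    squared = [d ** 2 for d in deviations]    # step 3: square each result
    sum_of_squares = sum(squared)             # step 4: add the squared numbers together
    divisor = n - 1 if sample else n          # step 5: Ps - 1 for a sample, Ps for a population
    return sum_of_squares / divisor


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean = 5
population_variance = variance(scores)
sample_variance = variance(scores, sample=True)
print(population_variance, sample_variance)
```

For these made-up scores the sum of squares is 32, so the population variance is 32 / 8 and the sample variance is 32 / 7, which is slightly larger.
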
  • Standard deviation
    • definition = the average amount a score is spread from the mean
    • square root of the variance so tells us the average amount a number differs from the mean
    • calculate by square rooting the variance
    • small SD = scores are very similar + consistent
    • large SD = scores are very different + inconsistent
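
As a rough illustration of small versus large SD, the sketch below square roots the variance for two made-up data sets that share the same mean. The data and variable names are assumptions for the example; `statistics.pvariance` is the standard-library population variance.

```python
import math
import statistics

# Two hypothetical data sets, both with a mean of 5
consistent = [4, 5, 5, 5, 6]
inconsistent = [1, 3, 5, 7, 9]

# SD = square root of the (population) variance
sd_consistent = math.sqrt(statistics.pvariance(consistent))
sd_inconsistent = math.sqrt(statistics.pvariance(inconsistent))

print(round(sd_consistent, 2), round(sd_inconsistent, 2))
```

The first set has the smaller SD because its scores sit close to the mean; the second has the larger SD because its scores are spread further from the mean.
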
  • Strength of variance and standard deviation
    • both take all scores into account (unlike the range) = so they are a more precise and representative measure of dispersion
    • SD only: returns the value to the same units as the original scores and the mean = making it easier to make direct judgements about data sets
  • Weakness of variance + standard deviation
    • may hide some of the characteristics of the data (especially if there are outliers, which could skew the data) = because the calculation is based on the mean, it is less useful when data are not normally distributed
    • SD only: because the calculation is based on the mean, it's not useful when data are not normally distributed or have outliers + it is quite difficult to calculate
  • Why is SD better than the variance (incl context)

    P = returns the value to a scale similar to the data/mean
    E = because we are square rooting the variance, our SD is back in the same units as the scores (and is a smaller value than the variance whenever the variance is above 1)
    C = this is a strength because we are able to make more direct comparisons between the data itself, for example see how many Ps are 1 SD away from the mean to see how consistent data is
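
The comparison in C (seeing how many Ps fall within 1 SD of the mean) could be checked with something like the sketch below. The scores are invented for the example, and the population SD (`statistics.pstdev`) is an assumed choice.

```python
import statistics

scores = [3, 5, 5, 6, 6, 7, 10]  # hypothetical scores for 7 Ps

mean = statistics.mean(scores)
sd = statistics.pstdev(scores)   # population SD: square root of the population variance

# Keep the scores that lie no more than 1 SD above or below the mean
within_one_sd = [x for x in scores if abs(x - mean) <= sd]
proportion = len(within_one_sd) / len(scores)

print(mean, sd, within_one_sd, round(proportion, 2))
```

The higher the proportion of Ps within 1 SD of the mean, the more the scores cluster around the mean.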