Data analysis: Descriptive statistics

  • Descriptive statistics are used to identify trends and analyse sets of data.
    • They do not allow conclusions about cause and effect, but give an overview of what the data looks like.
  • Measures of central tendency are 'averages' which describe the typical value in a data set. These include the mean, median and mode.
  • Mean (average)
    • Add up all the values and divide by the number of values.
    • ✅ Advantage: Uses all values in the data set → sensitive and more representative of the data.
    • ❌ Disadvantage: Affected by extreme values (outliers).
  • Median
    • The middle value when scores are ranked in order.
    • If there is an even number of scores, take the mean of the middle two.
    • ✅ Not affected by extreme scores.
    • ❌ Doesn’t consider all values → less sensitive and representative of the data.
  • Mode
    • The most frequent value in a set.
    • ✅ Can be used with nominal data.
    • ❌ May not be representative, especially if there are multiple modes (bimodal/multimodal) or no mode at all.
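
The three measures of central tendency above can be sketched with Python's standard `statistics` module. The scores and the pet names are hypothetical, invented only for illustration, and `multimode` assumes Python 3.8 or later.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 3, 3, 5, 7, 8]

# Mean: add up all the values and divide by the number of values.
mean = statistics.mean(scores)

# Median: middle value of the ranked scores; with an even number of
# scores this is the mean of the middle two (here 3 and 5).
median = statistics.median(scores)

# Mode: the most frequent value; multimode returns every mode as a list.
modes = statistics.multimode(scores)

# The mode also works with nominal (category) data, where a mean or
# median would be meaningless.
pets = ["dog", "cat", "dog", "bird"]  # hypothetical categories
pet_mode = statistics.mode(pets)

# A bimodal data set has two modes, which makes the mode less useful.
bimodal_modes = statistics.multimode([1, 1, 2, 2, 3])

print(mean, median, modes, pet_mode, bimodal_modes)
```

Here the median is 4.0, the only mode of `scores` is 3, and the bimodal example returns both 1 and 2.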
  • Measures of dispersion measure the spread of data. These include the range and standard deviation.
  • Range
    • Highest value − lowest value.
    • Often presented with 1 added (highest − lowest + 1) to account for rounding errors.
    • ✅ Easy to calculate.
    • ❌ Affected by extreme values → may be unrepresentative of the data.
  • Standard Deviation (SD)
    • Shows how far scores deviate from the mean.
    • Larger SD = more spread out.
    • ✅ More precise than the range because it uses all the data.
    • ❌ More complex to calculate and still influenced by outliers.
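
As a rough sketch, the two measures of dispersion above can be calculated as follows. The scores are hypothetical, and `population_sd` is a helper written for this example rather than a library function. The cards do not say whether the standard deviation divides by n (population) or n − 1 (sample), so both are shown as assumptions.

```python
import math
import statistics

# Hypothetical scores, invented for illustration only.
scores = [4, 6, 7, 9, 14]

# Range: highest value minus lowest value.
data_range = max(scores) - min(scores)

# Variant with 1 added, as some textbooks present it.
range_plus_one = data_range + 1


def population_sd(values):
    """Standard deviation dividing by n (hypothetical helper)."""
    m = sum(values) / len(values)
    squared_deviations = [(x - m) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / len(values))


sd_population = population_sd(scores)   # divides by n
sd_sample = statistics.stdev(scores)    # divides by n - 1

print(data_range, range_plus_one, sd_population, sd_sample)
```

The range uses only the two end scores (14 and 4), whereas every score contributes a squared deviation to the standard deviation, which is why it is the more precise of the two.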
  • If there are no outliers, the best measure of central tendency to use is the mean, and the best measure of dispersion is the standard deviation.
  • If there are outliers present in the data, the best measure of central tendency is the median, and the best measure of dispersion is the range (although the range is itself affected by extreme values).
  • If the data is nominal, the best measure of central tendency to use is the mode.
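
The effect of a single outlier on these measures can be illustrated with a short sketch. The scores and the `summarise` helper are hypothetical, made up for this example.

```python
import statistics


def summarise(values):
    """Return the mean, median and range of a list of numeric scores."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
    }


# Hypothetical scores, invented for illustration only.
typical = [10, 11, 12, 13, 14]
with_outlier = typical + [60]  # the same scores plus one extreme value

before = summarise(typical)
after = summarise(with_outlier)

print(before)
print(after)
```

With these numbers, adding the outlier moves the mean from 12 to 20 and the range from 4 to 50, while the median only shifts from 12 to 12.5.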