Representations of Data

Cards (12)

  • Estimating how many students took between 36 and 45 minutes to complete their homework
    1. The number of students is directly proportional to the area under the histogram between 36 and 45 minutes
    2. Area: (40 - 36) x 13.6 + (45 - 40) x 3.2 = 70.4, so approximately 70 students
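
    The area calculation above can be sketched in code. This is a minimal illustration, assuming the homework-time classes from Example 3 and that frequency density = frequency / class width, so area equals frequency. The name estimate_frequency is hypothetical.

```python
# Classes from the homework example: (lower bound, upper bound, frequency)
classes = [(25, 30, 55), (30, 35, 39), (35, 40, 68), (40, 50, 32), (50, 80, 6)]

def estimate_frequency(classes, start, end):
    """Estimate the frequency between start and end as area under the histogram."""
    total = 0.0
    for lower, upper, freq in classes:
        density = freq / (upper - lower)  # frequency density = frequency / class width
        # Width of the part of this bar that lies inside [start, end]
        overlap = max(0, min(end, upper) - max(start, lower))
        total += overlap * density
    return total

estimate = estimate_frequency(classes, 36, 45)
print(round(estimate, 1))  # 70.4
```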
  • Comparing data

    • You can comment on a measure of location and a measure of spread
    • You can use the mean and standard deviation or median and interquartile range (suitable for data sets with extreme values)
    • Median should not be used with standard deviation and mean should not be used with interquartile range
  • Drawing a boxplot and labelling the axis

    When a value is an outlier and the next most extreme value is not given, the end of the whisker is plotted at the outlier boundary, since the actual figure is not known
  • Cumulative Frequency

    You can use a cumulative frequency diagram to help find estimates for the median, quartiles and percentiles in a grouped frequency table
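
    Reading a value off a cumulative frequency diagram is roughly equivalent to linear interpolation within a class. The sketch below assumes the homework-time classes from Example 3, the n/2 convention for the median position, and data spread evenly within each class. The name estimate_quantile is hypothetical.

```python
# Classes from the homework example: (lower bound, upper bound, frequency)
classes = [(25, 30, 55), (30, 35, 39), (35, 40, 68), (40, 50, 32), (50, 80, 6)]

def estimate_quantile(classes, fraction):
    """Estimate a quantile from grouped data by linear interpolation."""
    n = sum(freq for _, _, freq in classes)
    target = fraction * n  # position of the required value, e.g. n/2 for the median
    cumulative = 0
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            # Move through the class in proportion to how far the target lies into it
            return lower + (target - cumulative) / freq * (upper - lower)
        cumulative += freq
    return classes[-1][1]

median = estimate_quantile(classes, 0.5)
print(round(median, 2))  # 35.44
```

    Calling estimate_quantile(classes, 0.25) or estimate_quantile(classes, 0.75) gives estimates for the quartiles in the same way.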
  • Histograms
    • Grouped continuous data can be presented using histograms
    • Histograms show the rough location and general shape of the data, and how spread out the data is
    • The area of the bar is proportional to the frequency of each class
    • Frequency density = frequency / class width
    • Joining the middle of the top of each bar in a histogram forms a frequency polygon
  • Example 3: A random sample of 200 students was asked how long it took them to complete their homework

    • Time, t (min): Frequency
    • 25 ≤ t < 30: 55
    • 30 ≤ t < 35: 39
    • 35 ≤ t < 40: 68
    • 40 ≤ t < 50: 32
    • 50 ≤ t < 80: 6
  • Drawing a histogram and frequency polygon to present the data

    1. Find the class width and frequency density of each class
    2. Draw the histogram using class width as the width of each bar and frequency density as the height
    3. To draw the frequency polygon, join the middle of the top of each bar of the histogram
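
    Step 1 and the points needed for step 3 can be computed as follows. This is a sketch using the homework-time classes from Example 3. The function names are hypothetical, and plotting is left out.

```python
# Classes from the homework example: (lower bound, upper bound, frequency)
classes = [(25, 30, 55), (30, 35, 39), (35, 40, 68), (40, 50, 32), (50, 80, 6)]

def frequency_densities(classes):
    """Return (class width, frequency density) for each class."""
    return [(upper - lower, freq / (upper - lower)) for lower, upper, freq in classes]

def polygon_points(classes):
    """Return (class midpoint, frequency density): the middle of the top of each bar."""
    return [((lower + upper) / 2, freq / (upper - lower)) for lower, upper, freq in classes]

for (lower, upper, freq), (width, density) in zip(classes, frequency_densities(classes)):
    print(f"{lower} <= t < {upper}: width {width}, frequency density {round(density, 1)}")
```

    The bar heights come out as 11, 7.8, 13.6, 3.2 and 0.2, and the frequency polygon joins the points at t = 27.5, 32.5, 37.5, 45 and 65.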
  • Outliers
    • An outlier is commonly any value which is greater than Q3 + k(Q3 - Q1) or less than Q1 - k(Q3 - Q1)
    • Some questions have other ways of identifying the outliers. In the exam, you will be told which method to use
  • Example 1: Some data is collected. Q1 = 46 and Q3 = 68. A value greater than Q3 + k(Q3 - Q1) or less than Q1 - k(Q3 - Q1) is defined as an outlier, where k = 1.5. Work out whether a) 7, b) 88 and c) 105 are outliers.
    • 68 + 1.5(68 - 46) = 101
    • 46 - 1.5(68 - 46) = 13
    • 7 < 13 and 105 > 101, so 7 and 105 are outliers. 88 lies between 13 and 101, so it is not an outlier
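
    The same check can be written as a short function. This is a minimal sketch of the Q1 - k(Q3 - Q1) and Q3 + k(Q3 - Q1) rule with k = 1.5 as the default. The function names are hypothetical.

```python
def outlier_bounds(q1, q3, k=1.5):
    """Return the (lower, upper) outlier boundaries."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_outlier(value, q1, q3, k=1.5):
    """A value is an outlier if it lies strictly outside the boundaries."""
    lower, upper = outlier_bounds(q1, q3, k)
    return value < lower or value > upper

print(outlier_bounds(46, 68))  # (13.0, 101.0)
for value in (7, 88, 105):
    print(value, is_outlier(value, 46, 68))
```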
  • Boxplots
    • A boxplot shows the quartiles, maximum and minimum values and any outliers in a data set
    • Two sets of data can be compared using boxplots
  • Example 2: The blood glucose level of 30 males is recorded. The results, in mmol/litre, are summarised below:

    • Lower quartile: 3.6
    • Upper quartile: 4.7
    • Median: 4.0
    • Lowest value: 1.4
    • Highest value: 5.2
    • An outlier is an observation that falls more than 1.5 x interquartile range above the upper quartile or more than 1.5 x interquartile range below the lower quartile
  • Drawing a boxplot for the blood glucose level data
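    Calculate the outlier boundaries: interquartile range = 4.7 - 3.6 = 1.1, lower boundary = 3.6 - 1.5 x 1.1 = 1.95, upper boundary = 4.7 + 1.5 x 1.1 = 6.35. Since 1.4 < 1.95, the lowest value 1.4 is an outlier. Since 5.2 < 6.35, the highest value is not an outlier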
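
    The values needed to draw this boxplot can be worked out as below. This is a sketch that assumes only the lowest and highest values are known, so a whisker stops at the outlier boundary whenever the extreme value on that side is an outlier. The name boxplot_whiskers is hypothetical.

```python
def boxplot_whiskers(q1, q3, lowest, highest, k=1.5):
    """Return (lower whisker end, upper whisker end, outliers) for a boxplot."""
    iqr = q3 - q1
    lower_boundary = q1 - k * iqr
    upper_boundary = q3 + k * iqr
    # If an extreme value is an outlier, the whisker ends at the boundary instead
    lower_whisker = lowest if lowest >= lower_boundary else lower_boundary
    upper_whisker = highest if highest <= upper_boundary else upper_boundary
    outliers = [v for v in (lowest, highest) if v < lower_boundary or v > upper_boundary]
    return lower_whisker, upper_whisker, outliers

lower_whisker, upper_whisker, outliers = boxplot_whiskers(3.6, 4.7, 1.4, 5.2)
print(round(lower_whisker, 2), upper_whisker, outliers)  # 1.95 5.2 [1.4]
```

    The box itself runs from 3.6 to 4.7 with the median line at 4.0, and 1.4 is marked separately as an outlier.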