Representations of Data

Created by

Daevonn Oladipo

Estimating how many students took between 36 and 45 minutes to complete their homework
1. The number of students is directly proportional to the area under the histogram between 36 and 45 minutes
2. Area: (40 - 36) x 13.6 + (45 - 40) x 3.2 = 54.4 + 16 = 70.4, so about 70 students
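The same area calculation can be sketched in Python. This is a minimal sketch, assuming the class boundaries and frequencies from the homework example (Example 3) and assuming the values are spread evenly within each class; the function name estimate_count is hypothetical.

```python
# Classes from the homework example as (lower bound, upper bound, frequency).
classes = [(25, 30, 55), (30, 35, 39), (35, 40, 68), (40, 50, 32), (50, 80, 6)]


def estimate_count(classes, a, b):
    """Estimate how many values lie between a and b from a grouped table.

    Assumes values are spread evenly within each class, so the estimate is
    frequency density multiplied by the width of the overlap with [a, b].
    """
    total = 0.0
    for lower, upper, freq in classes:
        density = freq / (upper - lower)  # frequency density of this class
        overlap = max(0.0, min(b, upper) - max(a, lower))  # part of class inside [a, b]
        total += density * overlap
    return total


estimate = estimate_count(classes, 36, 45)
print(round(estimate, 1))
```

Only the 35 ≤ t < 40 and 40 ≤ t < 50 classes overlap the interval, so the loop reproduces the two terms in the area sum above.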
Comparing data 
You can comment on a measure of location and a measure of spread
You can use the mean with the standard deviation, or the median with the interquartile range (the median and interquartile range are more suitable for data sets with extreme values)
Median should not be used with standard deviation and mean should not be used with interquartile range
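A short sketch of why the median and interquartile range suit data with extreme values. The data values below are hypothetical and chosen only for illustration. The quartiles come from Python's statistics.quantiles, whose default interpolation method may differ slightly from the quartile convention used in your course.

```python
import statistics

# Hypothetical data set, and the same data with one extreme value added.
data = [12, 14, 15, 15, 16, 17, 18]
data_with_extreme = data + [60]


def summary(values):
    """Return (mean, standard deviation, median, interquartile range)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # default 'exclusive' method
    return (
        statistics.mean(values),
        statistics.pstdev(values),  # population standard deviation
        statistics.median(values),
        q3 - q1,
    )


mean_a, sd_a, median_a, iqr_a = summary(data)
mean_b, sd_b, median_b, iqr_b = summary(data_with_extreme)

print(f"mean:   {mean_a:.2f} -> {mean_b:.2f}")
print(f"sd:     {sd_a:.2f} -> {sd_b:.2f}")
print(f"median: {median_a:.2f} -> {median_b:.2f}")
print(f"IQR:    {iqr_a:.2f} -> {iqr_b:.2f}")
```

The extreme value moves the mean and standard deviation far more than the median and interquartile range.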
Drawing a boxplot and labelling the axis 
When an extreme value is an outlier, the end of the whisker is plotted at the outlier boundary, since the actual lowest (or highest) non-outlier value is not known
Cumulative Frequency 
You can use a cumulative frequency diagram to help find estimates for the median, quartiles and percentiles in a grouped frequency table
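Reading an estimate off a cumulative frequency diagram drawn with straight line segments is equivalent to linear interpolation within a class. A minimal sketch, assuming the homework data from Example 3 and a hypothetical helper named estimate_quantile:

```python
# Classes from the homework example as (lower bound, upper bound, frequency).
classes = [(25, 30, 55), (30, 35, 39), (35, 40, 68), (40, 50, 32), (50, 80, 6)]


def estimate_quantile(classes, p):
    """Estimate the value below which a proportion p of the data lies.

    Walks up the cumulative frequencies and interpolates linearly inside
    the class that contains the target cumulative frequency.
    """
    n = sum(freq for _, _, freq in classes)
    target = p * n  # cumulative frequency to read across from
    cumulative = 0
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            fraction = (target - cumulative) / freq  # how far into this class
            return lower + fraction * (upper - lower)
        cumulative += freq
    return classes[-1][1]


q1 = estimate_quantile(classes, 0.25)
median = estimate_quantile(classes, 0.5)
q3 = estimate_quantile(classes, 0.75)
print(round(q1, 2), round(median, 2), round(q3, 2))
```

Percentiles work the same way, for example estimate_quantile(classes, 0.9) for the 90th percentile.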
Histograms 
Grouped continuous data can be presented using histograms
Histograms show the rough location and general shape of the data, and how spread out the data is
The area of each bar is proportional to the frequency of its class
Frequency density = frequency / class width
Joining the middle of the top of each bar in a histogram forms a frequency polygon
Example 3: A random sample of 200 students was asked how long it took them to complete their homework 
Time, t (min)    Frequency
25 ≤ t < 30      55
30 ≤ t < 35      39
35 ≤ t < 40      68
40 ≤ t < 50      32
50 ≤ t < 80      6
Drawing a histogram and frequency polygon to present the data 
1. Find the class width and frequency density of each class
2. Draw the histogram using class width as the width of each bar and frequency density as the height
3. To draw the frequency polygon, join the middle of the top of each bar of the histogram
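Step 1 and the points needed for step 3 can be worked out as follows. This is a sketch using the Example 3 table; it only computes the numbers to plot and does not draw the chart.

```python
# Classes from Example 3 as (lower bound, upper bound, frequency).
classes = [(25, 30, 55), (30, 35, 39), (35, 40, 68), (40, 50, 32), (50, 80, 6)]

rows = []
for lower, upper, freq in classes:
    width = upper - lower            # class width = bar width
    density = freq / width           # frequency density = bar height
    midpoint = (lower + upper) / 2   # x-coordinate of the frequency polygon point
    rows.append((lower, upper, width, density, midpoint))

for lower, upper, width, density, midpoint in rows:
    print(
        f"{lower} <= t < {upper}: width {width}, "
        f"frequency density {density:.1f}, "
        f"polygon point ({midpoint}, {density:.1f})"
    )
```

The frequency densities come out as 11, 7.8, 13.6, 3.2 and 0.2, and the polygon joins the points at the class midpoints 27.5, 32.5, 37.5, 45 and 65.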
Outliers 
An outlier is commonly any value which is greater than Q3 + k(Q3 - Q1) or less than Q1 - k(Q3 - Q1)
Some questions have other ways of identifying the outliers. In the exam, you will be told which method to use
Example 1: Some data is collected. Q1 = 46 and Q3 = 68. A value greater than Q3 + k(Q3 - Q1) or less than Q1 - k(Q3 - Q1) is defined as an outlier. Work out if a) 7, b) 88 and c) 105 are outliers. The value of k is 1.5.
Upper boundary: 68 + 1.5(68 - 46) = 101
Lower boundary: 46 - 1.5(68 - 46) = 13
7 < 13 and 105 > 101, so 7 and 105 are outliers; 13 ≤ 88 ≤ 101, so 88 is not an outlier
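The check in Example 1 can be sketched as a small function. The name outlier_bounds is hypothetical; the rule itself is the Q1 - k(Q3 - Q1) and Q3 + k(Q3 - Q1) definition given in the question.

```python
def outlier_bounds(q1, q3, k=1.5):
    """Return the (lower, upper) outlier boundaries for the given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


low, high = outlier_bounds(46, 68, k=1.5)

# A value is an outlier if it lies strictly outside the boundaries.
results = {x: (x < low or x > high) for x in (7, 88, 105)}

print(low, high)
for value, is_outlier in results.items():
    print(value, "outlier" if is_outlier else "not an outlier")
```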
Boxplots 
A boxplot shows the quartiles, maximum and minimum values and any outliers in a data set
Two sets of data can be compared using boxplots
Example 2: The blood glucose level of 30 males is recorded. The results, in mmol/litre, are summarised below: 
Lower quartile: 3.6
Upper quartile: 4.7
Median: 4.0
Lowest value: 1.4
Highest value: 5.2
An outlier is an observation that falls more than 1.5 x interquartile range above the upper quartile or more than 1.5 x interquartile range below the lower quartile
Drawing a boxplot for the blood glucose level data
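Calculate the outlier boundaries: interquartile range = 4.7 - 3.6 = 1.1, lower boundary = 3.6 - 1.5 x 1.1 = 1.95, upper boundary = 4.7 + 1.5 x 1.1 = 6.35. Since 1.4 < 1.95, the lowest value 1.4 is an outlier; since 5.2 < 6.35, the highest value 5.2 is not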
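A sketch of where the whiskers end for the blood glucose data, assuming only the five summary values are known. In that case a whisker stops at the outlier boundary whenever the extreme value lies beyond it. The function name whisker_ends is hypothetical.

```python
def whisker_ends(lowest, q1, q3, highest, k=1.5):
    """Return (lower whisker end, upper whisker end) from summary values.

    If an extreme value lies beyond its outlier boundary, the whisker is
    drawn to the boundary instead, because the next value inside the
    boundary is not known from the summary alone.
    """
    iqr = q3 - q1
    lower_bound = q1 - k * iqr
    upper_bound = q3 + k * iqr
    lower_end = max(lowest, lower_bound)
    upper_end = min(highest, upper_bound)
    return lower_end, upper_end


lower_end, upper_end = whisker_ends(lowest=1.4, q1=3.6, q3=4.7, highest=5.2)
print(round(lower_end, 2), round(upper_end, 2))
```

Here the lower whisker ends at the boundary 1.95, with 1.4 marked separately as an outlier, and the upper whisker ends at the highest value 5.2.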