Measures of central tendency and data distribution

Created by

Hiri P

The point of any average is to present the most typical score within a set of data
It enables us to:
Describe/summarise a data set
Establish norms for a sample
Compare data sets
3 key measures:
MEAN
MODE
MEDIAN
The (Arithmetic) Mean:
The sum of all the scores in the sample divided by the total number of scores in the sample
The Median:
The middle value in the sample once the scores are placed in rank order
If there is an even number of data points, take the mean of the two middle values
The Mode:
The most commonly occurring value in the sample
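Worked example of the three measures (a minimal sketch using Python's standard statistics module; the scores are made up for illustration and are not from the original cards):

```python
import statistics

# Hypothetical sample of six scores (an even number of data points).
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)      # sum of scores (30) / number of scores (6)
median = statistics.median(scores)  # even count, so the mean of the two middle values (3 and 5)
mode = statistics.mode(scores)      # most commonly occurring value

print(mean, median, mode)  # 5 4.0 3
```

Note that the three measures need not agree: here the mean (5) sits above the median (4) and the mode (3).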
Distribution of Data:
A certain number of assumptions need to be met in order to use certain descriptive statistics and statistical tests
How the data are distributed is one important aspect
Need to know how data are distributed in order to decide:
how to describe the data (i.e. descriptive statistics)
what statistical tests to use (i.e. inferential statistics)
Statistical variation and data distribution:
For continuous data (i.e. interval or ratio data) you can assess the distribution within a sample by plotting a histogram
Expect most scores to cluster around a central value
Expect the numbers to decrease in more or less a symmetrical manner in each direction
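A rough text histogram can show this clustering (a sketch only; the reaction times and the bin width of 50 are assumptions chosen for illustration, and a plotting library would normally be used instead):

```python
from collections import Counter

# Hypothetical reaction times in milliseconds (continuous, ratio-level data).
times = [412, 438, 455, 468, 471, 480, 489, 495, 502, 507,
         511, 518, 523, 530, 538, 547, 561, 575, 598]

bin_width = 50  # assumed bin width; a different width changes the picture

# Assign each score to the lower edge of its bin, e.g. 468 -> 450.
counts = Counter((t // bin_width) * bin_width for t in times)

# Print one row per bin, with one '#' per score in that bin.
for lower in range(min(counts), max(counts) + bin_width, bin_width):
    print(f"{lower}-{lower + bin_width - 1}: {'#' * counts[lower]}")
```

Most of the made-up scores fall in the two middle bins, with fewer in the bins on either side.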
The Normal Curve:
For many variables, a histogram plotted for the whole population would look like a bell-shaped curve
A smooth, symmetrical bell-shaped curve indicates that the data are normally distributed
The larger the sample, the smoother the bell-shaped curve
Central Limit Theorem:
Given sufficiently large random samples from a population, the means of those samples will be approximately normally distributed around the mean of the population
Often our samples are neither large nor randomly chosen
So there is usually a fair risk that our data are not normally distributed
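The effect of sample size can be explored with a small simulation (a sketch under stated assumptions: the skewed exponential population, the seed, the sample sizes and the function name sample_means are all hypothetical choices, not part of the original cards):

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical, deliberately skewed population (exponential, mean near 10).
population = [random.expovariate(1 / 10) for _ in range(100_000)]
population_mean = statistics.fmean(population)

def sample_means(sample_size, n_samples=2_000):
    """Return the means of repeated random samples drawn from the population."""
    return [statistics.fmean(random.sample(population, sample_size))
            for _ in range(n_samples)]

for n in (5, 50):
    means = sample_means(n)
    # Centre and spread of the sample means for this sample size.
    print(n, round(statistics.fmean(means), 2), round(statistics.stdev(means), 2))

print("population mean:", round(population_mean, 2))
```

With larger random samples the sample means should sit closer to the population mean and vary less from sample to sample, even though the population itself is skewed.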
Skewed Distribution:
Mean can be distorted by extreme values, particularly where samples are small
e.g. very low values or very high values
Results in a skewed distribution
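A small illustration of the distortion (a sketch with made-up scores; the single value of 95 stands in for one very high extreme value):

```python
import statistics

scores = [21, 22, 23, 24, 25]   # hypothetical small sample
with_extreme = scores + [95]    # same sample plus one very high value

print(statistics.mean(scores), statistics.median(scores))              # 23 23
print(statistics.mean(with_extreme), statistics.median(with_extreme))  # 35 23.5
```

One extreme score moves the mean from 23 to 35, while the median barely changes, which is why the median is often the safer summary for small, skewed samples.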
Outliers:
An observation that is distant from the other observations
May be due to variability within the sample
May be due to error
May result in a skewed distribution
May be real
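Skewed results (figure):
A) Negative skew: the long tail extends towards the low values
B) Positive skew: the long tail extends towards the high values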