Any data set can be characterized by measuring its central tendency
Measure of central tendency
A single value that represents a data set by locating its center
The most common measures of central tendency are the mean, median and mode
Mean
The arithmetic average: the sum of the data values divided by the number of observations; the mathematical center of the distribution
Mean
A set of data has only one mean
The mean can be applied to interval and ratio data
All values in the data set are included in computing the mean
The mean is very useful in comparing two or more data sets
The mean is affected by extremely small or large values in a data set
The mean cannot be computed for data in a frequency distribution with an open-ended class
The mean is most appropriate for symmetrical data
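The definition and the sensitivity to extreme values can be illustrated with a minimal Python sketch. The function name `arithmetic_mean` and the sample values are made up for illustration only.

```python
def arithmetic_mean(values):
    """Sum of the data values divided by the number of observations."""
    return sum(values) / len(values)

scores = [4, 5, 6, 7, 8]        # hypothetical data
with_outlier = scores + [90]    # same data plus one extremely large value

print(arithmetic_mean(scores))        # 6.0
print(arithmetic_mean(with_outlier))  # 20.0
```

A single extreme value moves the mean from 6.0 to 20.0, which is why the mean suits symmetrical data better than data with outliers.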
Median
The midpoint of the data array; it divides the data into two equal parts
Median
The median is unique: a data set has only one median
The median is found by arranging the data from lowest to highest and taking the value of the middle observation (with an even number of observations, the average of the two middle values)
The median is not affected by extremely small or large values
The median can be computed for an open-ended frequency distribution
The median can be applied to ordinal, interval, and ratio data
The median is most appropriate for skewed data
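The procedure for finding the median can be sketched as follows. The name `median_value` and the data are hypothetical, and averaging the two middle values for an even count is a common convention assumed here.

```python
def median_value(values):
    """Sort the data and return the middle observation."""
    ordered = sorted(values)          # arrange from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: one middle value
        return ordered[mid]
    # even count (assumed convention): average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_value([7, 3, 9, 1, 5]))    # 5
print(median_value([7, 3, 9, 1]))       # 5.0
print(median_value([7, 3, 9, 1, 500]))  # 7
```

In the last call the extreme value 500 only occupies the top position in the ordering; its size does not pull the median upward.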
Mode
The value in a data set that appears most frequently
Mode
The mode is found by locating the most frequently occurring value
The mode is the easiest average to compute
There can be more than one mode or even no mode in any given data set
The mode is not affected by extremely small or large values
Mode
Extreme values in a data set do not affect the mode
A data set may not contain any mode if none of the values is "most typical", that is, if no value occurs more often than the others
Properties of mode
The mode can be applied to nominal, ordinal, interval, and ratio data
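A short Python sketch of locating the mode, covering the one-mode, several-modes, and no-mode cases. The function name `modes` is hypothetical, and treating "no value repeats" as "no mode" is an assumed convention; the input is assumed to be non-empty.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs most frequently, or [] if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:                  # assumed convention: no repeats means no mode
        return []
    return [value for value, count in counts.items() if count == highest]

print(modes([2, 3, 3, 5, 7]))          # [3]
print(modes([1, 1, 2, 2, 9]))          # [1, 2]
print(modes([4, 5, 6]))                # []
print(modes(["red", "blue", "red"]))   # ['red']
```

The last call uses nominal data, where the mean and median are not defined but the mode still is.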
Midrange
The average of the lowest and highest value in a data set
Properties of midrange
The midrange is easy to compute
The midrange gives the midpoint between the lowest and highest values
The midrange is unique
The midrange is greatly influenced by extreme or outlying values
The midrange can be applied to interval and ratio data
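As a minimal illustration, the midrange needs only the smallest and largest observations. The function name `midrange` and the data are made up for this sketch.

```python
def midrange(values):
    """Average of the lowest and highest value in the data set."""
    return (min(values) + max(values)) / 2

print(midrange([2, 4, 6, 8, 10]))    # 6.0
print(midrange([2, 4, 6, 8, 100]))   # 51.0
```

Because only the two extremes enter the calculation, replacing 10 with 100 shifts the midrange from 6.0 to 51.0.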
Types of Distribution
Symmetrical Distribution
Positively Skewed Distribution or Right-Skewed Distribution
Negatively Skewed Distribution or Left-Skewed Distribution
Coefficient of Variation (CV)
The standard deviation divided by the mean, expressed as a percentage
Comparing the variations of commissions and sales
The coefficient of variation is larger for sales, so sales are more variable than commissions
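A comparison of this kind can be sketched in Python. The figures below are invented purely for illustration and are not the data behind the card above; using the sample standard deviation (`statistics.stdev`) is also an assumption, since the notes do not say whether the sample or population form is meant.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Standard deviation divided by the mean, expressed as a percentage."""
    return stdev(values) / mean(values) * 100   # sample standard deviation (assumed)

hypothetical_sales = [100, 150, 200]        # invented values
hypothetical_commissions = [10, 12, 14]     # invented values

cv_sales = coefficient_of_variation(hypothetical_sales)
cv_commissions = coefficient_of_variation(hypothetical_commissions)

print(round(cv_sales, 1))         # 33.3
print(round(cv_commissions, 1))   # 16.7
print(cv_sales > cv_commissions)  # True
```

Because the CV is a unitless percentage, it allows data sets measured on very different scales to be compared directly.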
Kurtosis
A statistical measure used to describe the distribution of observed data around the mean; it measures the relative peakedness or flatness of a distribution compared with the normal distribution
Three types of kurtosis
Leptokurtic (positive excess kurtosis, high degree of peakedness)
Mesokurtic (excess kurtosis of zero, intermediate distribution, as in the normal distribution)
Platykurtic (negative excess kurtosis, low degree of peakedness)
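The three types can be told apart by the sign of the excess kurtosis. The sketch below assumes the population moment formula (fourth central moment divided by the squared variance, minus 3, so that a normal distribution scores zero); the function names and data are hypothetical.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (assumed formula)."""
    n = len(values)
    center = sum(values) / n
    m2 = sum((x - center) ** 2 for x in values) / n   # second central moment
    m4 = sum((x - center) ** 4 for x in values) / n   # fourth central moment
    return m4 / m2 ** 2 - 3    # undefined (division by zero) for constant data

def kurtosis_type(values, tolerance=1e-9):
    k = excess_kurtosis(values)
    if k > tolerance:
        return "leptokurtic"
    if k < -tolerance:
        return "platykurtic"
    return "mesokurtic"

flat = [1, 2, 3, 4, 5]                     # evenly spread values
peaked = [0, 5, 5, 5, 5, 5, 5, 5, 10]      # most values at the center

print(round(excess_kurtosis(flat), 2), kurtosis_type(flat))      # -1.3 platykurtic
print(round(excess_kurtosis(peaked), 2), kurtosis_type(peaked))  # 1.5 leptokurtic
```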
Simple event
An event that includes one and only one of the outcomes
Compound event
A collection of more than one outcome for an experiment
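As a small illustration, events can be modeled as sets of outcomes. The experiment (one roll of a six-sided die) and the function name `classify_event` are assumptions chosen for this sketch.

```python
sample_space = {1, 2, 3, 4, 5, 6}   # outcomes of one die roll (assumed experiment)

def classify_event(event):
    """Classify an event by how many outcomes of the sample space it contains."""
    if not event <= sample_space:
        raise ValueError("event contains outcomes outside the sample space")
    if len(event) == 1:
        return "simple"
    if len(event) > 1:
        return "compound"
    return "empty"

print(classify_event({4}))         # simple
print(classify_event({2, 4, 6}))   # compound
```

Rolling a 4 is a simple event because it contains exactly one outcome; rolling an even number is a compound event because it collects three outcomes.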
Normal distribution
A continuous probability distribution that describes data that cluster around a mean
Normal distribution
The graph of the associated probability density function is bell-shaped, with a peak at the mean
Also known as the Gaussian distribution or bell curve
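The bell shape can be checked numerically with the standard normal density formula, f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi)). The function name `normal_pdf` and the default parameters are choices made for this sketch.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

print(round(normal_pdf(0.0), 4))              # 0.3989
print(normal_pdf(0.0) > normal_pdf(1.0))      # True
print(normal_pdf(-1.5) == normal_pdf(1.5))    # True
```

The density is highest at the mean and takes the same value at equal distances on either side of it, which gives the curve its symmetric bell shape.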
The normal curve was developed mathematically by Abraham de Moivre in 1733
Pierre-Simon Laplace used the normal curve to describe the distribution of errors