Sample statistics are estimates of population parameters
Samples are limited to measuring characteristics of a portion of the population
Inferential statistics
Used to make convincing statements & confident conclusions based on results
Statistical tests calculate the probability (p) of obtaining effects/relationships at least as large as those observed through random chance alone
High probability = effects/relationships are likely due to natural variability
Low probability = effects/relationships are large enough to be considered real
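One way to see what such a probability means is a permutation test, sketched here in Python. The two groups of measurements, the function name `permutation_p_value` and the use of a difference in means as the test statistic are illustrative assumptions, not something these notes prescribe.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_shuffles=5_000, seed=0):
    """Estimate p as the share of random relabellings whose difference in
    means is at least as large as the difference actually observed."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend group labels were assigned by chance
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_shuffles

# Hypothetical measurements for two groups
control = [4.1, 3.9, 4.3, 4.0, 4.2]
treated = [5.0, 5.2, 4.9, 5.1, 5.3]
p = permutation_p_value(control, treated)
print(f"p = {p:.4f}")
```

A small p here means that chance relabelling rarely produces a difference as large as the observed one; two identical groups would give p = 1.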
Mean: Sum of all observations divided by the number of observations
Median: The middle measurement in an ordered set of data, less sensitive to extreme values than mean - more robust measure in some cases
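A minimal Python sketch of the difference, using made-up numbers: adding a single extreme value moves the mean a long way but leaves the median where it was.

```python
from statistics import mean, median

# Hypothetical data: five typical values, then the same values plus one extreme value
typical = [10, 12, 11, 13, 12]
with_outlier = typical + [100]

print(mean(typical), median(typical))            # mean 11.6, median 12
print(mean(with_outlier), median(with_outlier))  # mean rises above 26, median stays at 12
```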
Applications of statistics
Advertising campaigns
Trends
Evaluating crime patterns
GDP
Descriptive statistics
Used to summarise data
Measure of central tendency: Mean, median, mode
Measure of spread or dispersion: Standard deviation, variance, standard error, range
Help describe a sample from a population
Population parameters describe characteristics of a statistical population
As sample size (n) increases, sample statistics become more accurate estimates of the true population parameters
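The effect of sample size can be illustrated with a small simulation. This is only a sketch: the population (10,000 normally distributed values), the sample sizes and the helper name `spread_of_sample_means` are all assumptions chosen for illustration.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(42)

# Hypothetical population of 10,000 values
population = [rng.gauss(50, 10) for _ in range(10_000)]

def spread_of_sample_means(n, repeats=500):
    """Draw `repeats` random samples of size n and return the SD of their
    means, i.e. how far sample means typically scatter."""
    means = [fmean(rng.sample(population, n)) for _ in range(repeats)]
    return pstdev(means)

spread_small = spread_of_sample_means(10)
spread_large = spread_of_sample_means(250)
print(f"n = 10:  sample means scatter by about {spread_small:.2f}")
print(f"n = 250: sample means scatter by about {spread_large:.2f}")
```

The means of the larger samples cluster much more tightly around the population mean than the means of the smaller samples.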
Mean is sensitive to extreme values in a population
Mode can also be used as a measure of central tendency for nominal and ordinal data
Sample variance (s²)
Based on the sum of squares (SS), which is the sum of the squared deviations of each observation from the sample mean
Population variance (σ²) is SS divided by N, where N is the population size
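Written out as formulas, with the notation assumed here being x_i for an observation, x̄ for the sample mean and μ for the population mean:

```latex
SS = \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
s^2 = \frac{SS}{n - 1}, \qquad
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}
```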
SD is the most commonly reported measure of variability for the mean when describing a sample, reported as mean ± SD
Median is the middle measurement in an ordered set of data, having an equal number of observations on either side and is less sensitive to extreme values than mean
Mode
Most common value in a dataset, not sensitive to extreme values
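A short Python sketch with invented data: `statistics.mode` works on nominal categories, and `statistics.multimode` returns every tied value when there is more than one mode.

```python
from statistics import mode, multimode

# Hypothetical nominal data
eye_colours = ["brown", "blue", "brown", "green", "brown", "blue"]
print(mode(eye_colours))   # brown

# Hypothetical scores with one extreme value; the extreme value does not affect the mode
scores = [1, 2, 2, 3, 3, 50]
print(multimode(scores))   # [2, 3]
```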
When reporting the mean, one must also report one of these measures of variability: Sample variance (s²), Standard deviation of the sample (SD), Standard error of the sample mean (SE), 95% confidence limits
Sample variance (s²) is SS divided by the degrees of freedom (n − 1), also known as the mean square
Sample standard deviation (SD) is the square root of the sample variance; it describes the typical deviation of observations from the sample mean and has the same units as the original measurements
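The chain from sum of squares to variance to SD can be followed step by step in Python. The sample values are hypothetical; the results are checked against the standard library's `variance` and `stdev`, which also divide by n − 1.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical sample of measurements (e.g. lengths in cm)
sample = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(sample)
x_bar = mean(sample)                         # sample mean
ss = sum((x - x_bar) ** 2 for x in sample)   # sum of squares (SS)
s2 = ss / (n - 1)                            # sample variance, d.f. = n - 1
sd = sqrt(s2)                                # SD, in the same units as the data

print(ss, round(s2, 3), round(sd, 3))
print(round(variance(sample), 3), round(stdev(sample), 3))  # same values from the library
```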
Standard error (SE) calculation
SE = SD/√n, where n is the sample size
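As a minimal Python sketch of this calculation (the function name `standard_error` and the data are assumptions for illustration):

```python
from math import sqrt
from statistics import stdev

def standard_error(sample):
    """Standard error of the mean: sample SD divided by the square root of n."""
    return stdev(sample) / sqrt(len(sample))

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical measurements
se = standard_error(sample)
print(round(se, 3))
```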
Coefficient of Variation (V)
Expresses SD as a percentage of the mean (V = SD/mean × 100), so variability can be compared between samples that have different means
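A sketch of why this is useful, assuming the usual definition V = SD / mean × 100 and two invented samples whose means differ tenfold: the SDs differ by the same factor, but the coefficient of variation shows the relative variability is identical.

```python
from statistics import mean, stdev

def coefficient_of_variation(sample):
    """V as a percentage: SD expressed relative to the mean."""
    return stdev(sample) / mean(sample) * 100

# Hypothetical body masses (g) for a small and a large species
small_species = [20, 22, 19, 21, 23]
large_species = [200, 220, 190, 210, 230]

print(round(stdev(small_species), 2), round(stdev(large_species), 2))
print(round(coefficient_of_variation(small_species), 2),
      round(coefficient_of_variation(large_species), 2))
```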
Degrees of freedom (d.f.) = n − 1 when the sample variance is calculated
Range is the difference between the smallest and largest value observed
SD
Square root of the variance
Measures of variability for the median
Range
Interquartile range
Interquartile Range is the range of values in the 2nd and 3rd quartiles (middle 50% of the values)
Standard error (SE) is a measure of the precision or uncertainty around the estimate of the population mean
95% Confidence limits of the Mean are a measure of the precision of the estimate of the mean
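Approximate 95% confidence limits can be sketched as mean ± z × SE. The version here assumes a normal approximation (z of about 1.96), which suits large samples; for small samples a t-value with n − 1 degrees of freedom would normally replace z. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_limits_95(sample):
    """Approximate 95% confidence limits of the mean (normal approximation)."""
    z = NormalDist().inv_cdf(0.975)           # about 1.96
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))    # standard error of the mean
    return m - z * se, m + z * se

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical measurements
lower, upper = confidence_limits_95(sample)
print(f"mean = {mean(sample)}, 95% limits = {lower:.2f} to {upper:.2f}")
```

Narrower limits indicate a more precise estimate of the population mean.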
Interquartile range calculation
1. Lowest 25% of the values = 1st quartile
2. Next 25% = 2nd quartile
3. Next 25% = 3rd quartile
4. Last 25% = 4th quartile
Interquartile range
Range of values in the 2nd and 3rd quartiles (middle 50% of the values)
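The quartile steps can be sketched with Python's `statistics.quantiles`, which returns the three cut points separating the four quartiles. The data are invented, and textbooks differ slightly in how they interpolate quartile boundaries, so treat the exact cut points as one convention among several.

```python
from statistics import median, quantiles

# Hypothetical observations (order does not matter; quantiles() sorts internally)
data = [40, 7, 49, 36, 15, 42, 39, 50, 41, 47, 43]

q1, q2, q3 = quantiles(data, n=4)   # boundaries between the four quartiles
iqr = q3 - q1                       # spread of the middle 50% of the values

print(q1, q2, q3, iqr)
```

The middle cut point `q2` is the median, the boundary between the 2nd and 3rd quartiles.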
If a firm increases advertising, their demand curve shifts right, increasing the equilibrium price and quantity
Frequency or number of each observed value/category in a dataset is used to estimate the probability of events happening
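A minimal sketch of turning observed frequencies into estimated probabilities, using invented categorical observations:

```python
from collections import Counter

# Hypothetical observations of a categorical variable
observations = ["red", "blue", "red", "green", "red",
                "blue", "red", "green", "blue", "red"]

counts = Counter(observations)      # frequency of each category
total = sum(counts.values())
probabilities = {category: count / total for category, count in counts.items()}

print(counts)
print(probabilities)
```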
Mean, Median & Mode are similar in value or approach the same value in large datasets that are symmetrically distributed
If you add up marginal utility for each unit, you get total utility
Marginal utility
Additional utility (satisfaction) gained from the consumption of one additional unit of a product
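The link between marginal and total utility is a running sum, sketched here with hypothetical utility values for five successive units:

```python
from itertools import accumulate

# Hypothetical marginal utility of each successive unit consumed
marginal_utility = [10, 8, 5, 2, 0]

# Total utility after each unit is the running sum of marginal utilities
total_utility = list(accumulate(marginal_utility))
print(total_utility)   # [10, 18, 23, 25, 25]
```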
Median
Middle-most value & forms the boundary between 2nd and 3rd quartiles
Frequency distributions of ratio/interval data often take the shape of a normal distribution, approximating to a symmetrical and bell-shaped curve