1. Fundamentals of statistical testing

Created by

Esme Aeschlimann

Cards (20)

Mean 
Sum of all the values in a set, divided by the number of values
Standard deviation (SD)
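Spread of the data around the mean: roughly the typical distance of a value from the mean (the square root of the average squared deviation). Symbols in the formula:
x = a value from the data
x̄ (x with a bar over it) = the mean
n = number of values
Σ (capital sigma) = sum over all values (of the calculation done with each value)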
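The calculation can be worked through step by step in R. This is a minimal sketch: the `scores` vector is made-up illustration data, and the n - 1 denominator is an assumption. It is what R's `sd()` uses (the sample SD), whereas the formula on the card may divide by n instead.

```r
# Hypothetical data, for illustration only
scores <- c(2, 4, 4, 4, 5, 5, 7, 9)

n <- length(scores)               # n = number of values
x_bar <- sum(scores) / n          # mean: sum of all values divided by n

deviations <- scores - x_bar      # difference of each value from the mean
# Sample SD: square root of the summed squared deviations over n - 1
# (assumption: n - 1 denominator, matching R's sd())
sd_manual <- sqrt(sum(deviations^2) / (n - 1))

x_bar       # same as mean(scores)
sd_manual   # same as sd(scores)
```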
Greek, Latin and hat symbols
Greek letters (e.g. μ, σ) = population parameters
Latin letters (e.g. x̄, s) = sample statistics
Hat (e.g. μ̂) = estimates of population parameters
Known distributions
Some distribution shapes are 'algebraically tractable' (i.e. there is a mathematical formula that draws the curve). Y axis = density (worked out by that formula)
Normal distribution 
Continuous, unimodal (only one peak in the distribution, i.e. one mode), symmetrical and bell-shaped. The normal distribution has fixed proportions and is a function of two parameters (mean and SD). Note that not every symmetrical, bell-shaped distribution is a normal distribution (the t distribution, for example, is bell-shaped but has heavier tails)
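To illustrate that the curve is fully determined by the mean and SD, the sketch below compares R's built-in `dnorm()` with the normal density formula written out by hand. The helper name `normal_density` and the values mean = 127 and SD = 40 are illustrative assumptions.

```r
# Normal density written out from its formula (hypothetical helper name)
normal_density <- function(x, mean, sd) {
  1 / (sd * sqrt(2 * pi)) * exp(-(x - mean)^2 / (2 * sd^2))
}

x <- c(47, 87, 127, 167, 207)   # points at -2, -1, 0, +1, +2 SDs from the mean

by_hand  <- normal_density(x, mean = 127, sd = 40)
built_in <- dnorm(x, mean = 127, sd = 40)

all.equal(by_hand, built_in)   # the two agree up to floating-point error
```

The density is highest at the mean and identical at equal distances either side of it, which is the symmetry described on the card.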
Chi-square distribution
t distribution
Beta distribution
Uniform distribution
Area below the normal curve
Approx. 68% is within ±1 SD of the mean
Approx. 95% is within ±1.96 SD of the mean
Approx. 99% is within ±2.58 SD of the mean
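These proportions can be checked with `pnorm()`. A minimal sketch follows; the helper name `within_k_sd` is made up for illustration, and it relies on the standard normal defaults (mean = 0, sd = 1).

```r
# Proportion of a normal distribution within +/- k SDs of the mean
within_k_sd <- function(k) pnorm(k) - pnorm(-k)

within_k_sd(1)      # roughly 0.68
within_k_sd(1.96)   # roughly 0.95
within_k_sd(2.58)   # roughly 0.99
```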
Proportions to probability 
Proportions are always the same in a normal distribution. If we know something is normally distributed, we can work out the probability of observing values in any given range
Working out proportions - standardisation
Transforming any distribution to one with mean = 0 and SD = 1, i.e. transforming scores into z-scores. Do this by subtracting the mean from each score and dividing by the standard deviation: z = (x - mean) / SD
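A short sketch of standardisation in R, using the numbers from the Charlie social events example in these cards (score 57, mean 127, SD 40). The names `pop_mean`, `pop_sd`, `scores` and `z_scores` are illustrative, and the `scores` vector is made-up data.

```r
charlie_events <- 57
pop_mean <- 127
pop_sd   <- 40

# z-score: subtract the mean from the score, then divide by the SD
charlie_z <- (charlie_events - pop_mean) / pop_sd
charlie_z   # -1.75

# Standardising a whole set of scores (made-up data) gives mean 0 and SD 1
scores   <- c(57, 90, 127, 150, 200)
z_scores <- (scores - mean(scores)) / sd(scores)
mean(z_scores)   # 0, up to floating-point error
sd(z_scores)     # 1
```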
Working out proportions - probability of z-score
Use a z-table or R to find the probability associated with a z-score, e.g. a z-score of -1.75 has a cumulative probability of around 4%. So around 4% of values are below this score, and around 96% are above
Using R to work out proportions (Charlie social events e.g.)
Using original scores and distribution properties:
charlie_events <- 57
# lower.tail = FALSE returns the proportion above the score (about 0.96)
pnorm(charlie_events, mean = 127, sd = 40, lower.tail = FALSE)
Using z-scores and the standard normal distribution:
charlie_z <- -1.75
pnorm(charlie_z, mean = 0, sd = 1, lower.tail = FALSE)
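As a quick check, the sketch below compares the two routes: both calls should return the same proportion, and the default `lower.tail = TRUE` returns the complementary proportion below the score. The variable names `p_above_raw`, `p_above_z` and `p_below` are illustrative.

```r
charlie_events <- 57
charlie_z <- (charlie_events - 127) / 40   # -1.75

p_above_raw <- pnorm(charlie_events, mean = 127, sd = 40, lower.tail = FALSE)
p_above_z   <- pnorm(charlie_z, lower.tail = FALSE)  # mean = 0, sd = 1 are the defaults
p_below     <- pnorm(charlie_z)                      # lower.tail = TRUE is the default

all.equal(p_above_raw, p_above_z)   # both routes agree
p_below + p_above_z                 # the two tails sum to 1
```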
Critical value 
A value that cuts off a specific proportion of a distribution, e.g. the top 5%
Working out critical values - Charlie social events e.g. 
Work backwards: find the z-score corresponding to a cumulative probability of 0.95, then transform the z-score back to the original score
Using R:
qnorm(p = 0.95, mean = 127, sd = 40)
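The "work backwards" route can be spelled out in two steps and compared with the one-step call. A minimal sketch; the names `z_crit`, `crit_value` and `crit_direct` are illustrative.

```r
# Step 1: z-score that cuts off the lowest 95% (i.e. the top 5%)
z_crit <- qnorm(p = 0.95)          # roughly 1.645

# Step 2: transform the z-score back to the original scale
crit_value <- z_crit * 40 + 127    # z * SD + mean

# Same answer in one step
crit_direct <- qnorm(p = 0.95, mean = 127, sd = 40)
all.equal(crit_value, crit_direct)
```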
Sampling from distributions 
Collecting data on a variable amounts to randomly sampling from a distribution. Many variables come from a normal distribution; some come from other types, e.g.
Reaction times - log-normal distribution
Annual casualties due to horse kicks - Poisson distribution (only deals with integers, i.e. can't have 0.4 of a death)
Passes/fails on an exam - binomial distribution
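R has a random-draw function for each of these distributions. The sketch below draws five values from each; all parameter values (`meanlog`, `sdlog`, `lambda`, `prob`) are made up for illustration and are not taken from real data.

```r
set.seed(1)  # arbitrary seed so the draws are reproducible

# All parameter values below are made up for illustration
reaction_times <- rlnorm(5, meanlog = 6, sdlog = 0.3)   # log-normal: positive, right-skewed
horse_kicks    <- rpois(5, lambda = 0.7)                # Poisson: non-negative whole numbers
exam_passes    <- rbinom(5, size = 1, prob = 0.8)       # binomial: 1 = pass, 0 = fail

reaction_times
horse_kicks
exam_passes
```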
Sampling more people 
Samples from the same population will differ from each other. If we took many samples and calculated the mean each time, those means would have their own distribution.
Sampling distribution (of the mean) 
Distribution of the means of many samples of a particular size. For large enough samples the distribution is approximately normal, and it is centred on the true population mean.
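A sampling distribution can be simulated by drawing many samples and keeping each mean. This is a sketch under stated assumptions: the population is taken to be normal with mean 127 and SD 40 (borrowed from the Charlie example), and `n_samples` and `sample_size` are arbitrary illustrative choices. The SD of the sample means should come out close to the population SD divided by the square root of the sample size.

```r
set.seed(42)  # arbitrary seed for reproducibility

n_samples   <- 10000   # number of repeated samples (illustrative)
sample_size <- 25      # size of each sample (illustrative)

# Draw many samples from a normal population (mean 127, SD 40)
# and keep the mean of each one
sample_means <- replicate(n_samples, mean(rnorm(sample_size, mean = 127, sd = 40)))

mean(sample_means)   # close to the population mean of 127
sd(sample_means)     # close to 40 / sqrt(25) = 8
# hist(sample_means) # uncomment to plot the roughly bell-shaped histogram
```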
Central Limit Theorem
As n gets larger, the sampling distribution of the mean tends towards a normal distribution centred on the population mean, even when the population itself is not normally distributed
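The theorem can be illustrated by sampling from a clearly non-normal population. In this sketch the population is assumed to be exponential with rate 1 (strongly right-skewed, population mean 1); `skewness` and `means_for_n` are hypothetical helper names, and the sample sizes 2 and 100 are arbitrary.

```r
set.seed(123)  # arbitrary seed for reproducibility

# Simple skewness measure (hypothetical helper): 0 for a symmetric distribution
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

# Means of 10,000 samples of size n from an exponential population
# (rate = 1, so the population mean is 1 and the shape is right-skewed)
means_for_n <- function(n) replicate(10000, mean(rexp(n, rate = 1)))

means_small <- means_for_n(2)
means_large <- means_for_n(100)

skewness(means_small)   # clearly positive: still skewed
skewness(means_large)   # much closer to 0: nearly symmetric
mean(means_large)       # close to the population mean of 1
```

With the larger sample size the distribution of means loses most of the population's skew, which is the behaviour the theorem describes.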