A non-parametric test (sometimes called a “distribution-free test”) does not assume that the data follow a particular underlying distribution, such as the normal distribution
Typically, we use non-parametric tests when we do not have normally distributed data
When analysing markets, a range of assumptions are made about the rationality of economic agents involved in the transactions
The Wealth of Nations was written in 1776
Examples of Non-Parametric Tests
Sign
Mann-Whitney U
Wilcoxon signed-rank
Kruskal-Wallis
Friedman
Different experiments have different numbers of conditions
Sign, Mann-Whitney U, and Wilcoxon signed-rank tests can only examine two conditions
Kruskal-Wallis and Friedman tests can examine three or more conditions but can only tell you that at least two conditions differ (not which two)
Different designs require different tests as different assumptions can be made
Wilcoxon signed-rank, sign, and Friedman tests make use of the fact that a repeated measures design has been used, so they can be more powerful
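The choice of test described above can be sketched as a small lookup. This is only an illustration: the function name `candidate_tests` and the design labels "repeated" and "between" are hypothetical, and the table encodes nothing beyond the design and number-of-conditions rules listed in these notes.

```python
def candidate_tests(design, n_conditions):
    """Return the non-parametric tests that fit a design.

    design: "repeated" (repeated measures / matched pairs) or "between"
    (independent groups). These labels are hypothetical names for this sketch.
    """
    if n_conditions < 2:
        raise ValueError("need at least two conditions to compare")
    table = {
        ("repeated", 2): ["Sign", "Wilcoxon signed-rank"],
        ("repeated", 3): ["Friedman"],
        ("between", 2): ["Mann-Whitney U"],
        ("between", 3): ["Kruskal-Wallis"],
    }
    # Three or more conditions all map onto the same row of the table.
    return table[(design, min(n_conditions, 3))]


print(candidate_tests("repeated", 2))
print(candidate_tests("between", 4))
```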
A sign test is a binomial test that is used to determine if there is a median difference between paired or matched observations
A sign test is similar to a t-test, but instead of mean scores, median scores are used
A sign test is used as an alternative to the paired-samples t-test or Wilcoxon signed-rank test when the sample distribution is neither normal nor symmetrical
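A minimal sketch of a two-sided sign test, assuming paired scores held in two lists. The function name `sign_test` and the sample scores are made up for illustration. Pairs with no difference are dropped, and the remaining signs are compared against a Binomial(n, 0.5) distribution.

```python
from math import comb


def sign_test(before, after):
    """Two-sided sign test on paired observations (tied pairs are dropped)."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    n_pos = sum(d > 0 for d in diffs)
    k = min(n_pos, n - n_pos)
    # Under the null hypothesis each sign is a fair coin flip,
    # so the count of the rarer sign follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return n_pos, n, p_value


# Hypothetical scores for eight participants measured twice.
before = [10, 12, 9, 14, 11, 13, 8, 15]
after = [12, 15, 11, 13, 14, 16, 10, 18]
n_pos, n, p = sign_test(before, after)
print(n_pos, n, p)
```

Only the direction of each difference is used here, which is why the test ignores how large the differences are.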
A Wilcoxon signed rank test is another popular non-parametric test of a difference for matched or paired data
A Wilcoxon signed rank test is more powerful than a sign test as it also takes into account the magnitude of the observed difference
A Wilcoxon signed rank test is used as an alternative to the paired t-test when parametric assumptions have not been met
Wilcoxon Signed Rank Test
Non-parametric test of a difference for matched or paired data
More powerful than a sign test as it also takes into account the magnitude of the observed difference
Used as an alternative to the paired t-test when parametric assumptions have not been met
Hypotheses
Assumptions
Ranking Data
1. Smallest score gets the smallest rank
2. Assign temporary rank to ties, then work out an average rank
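The two ranking steps above can be sketched as follows. The function name `average_ranks` is hypothetical; it gives the smallest score rank 1 and gives tied scores the average of the temporary ranks they would otherwise occupy.

```python
def average_ranks(scores):
    """Rank scores from smallest (rank 1) upward, averaging ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Find the end of the run of tied scores starting at pos.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        # Temporary ranks are pos+1 .. end+1; every tied score gets their mean.
        mean_rank = (pos + 1 + end + 1) / 2
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


print(average_ranks([3, 1, 4, 1, 5]))  # [3.0, 1.5, 4.0, 1.5, 5.0]
```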
This is our p value for determining if there is a significant difference between our groups
The Standardized Test Statistic is our z value
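As a rough sketch of where a z value and p value for a Wilcoxon signed rank test can come from, the code below ranks the absolute differences, sums the ranks of the positive differences, and standardizes that sum with the usual normal approximation. It is an assumption that statistics software computes the output this way; this version applies no tie or continuity correction, and the approximation suits larger samples better than the small made-up sample used here. The function name `wilcoxon_signed_rank` and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilcoxon_signed_rank(before, after):
    """Sum of positive ranks, z value and two-sided p (normal approximation)."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    abs_sorted = sorted(abs(d) for d in diffs)

    def rank_of(value):
        # Average rank for ties: first position plus half the run length.
        first = abs_sorted.index(value) + 1
        count = abs_sorted.count(value)
        return first + (count - 1) / 2

    w_plus = sum(rank_of(abs(d)) for d in diffs if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)  # no tie correction
    z = (w_plus - mean_w) / sd_w
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, z, p_value


# Hypothetical paired scores.
before = [10, 12, 9, 14, 11, 13, 8, 15]
after = [11, 14, 12, 10, 16, 19, 15, 23]
w_plus, z, p = wilcoxon_signed_rank(before, after)
print(w_plus, round(z, 3), round(p, 3))
```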
Mann-Whitney U Test
Non-parametric test of a difference for between-groups data
Similar to a t-test, but instead of mean scores we are using median scores
Used as an alternative to the independent t-test
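A minimal sketch of the U statistic for two independent groups, using the pair-counting definition: for every pairing of a score from one group with a score from the other, count which score is larger, with ties counting as half. The function name `mann_whitney_u` and the scores are hypothetical. Turning U into a p value (from a critical-value table or a normal approximation) is not shown.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U for group A, U for group B, the smaller of the two)."""
    u_a = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )
    # The two U values always sum to the number of pairs.
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b, min(u_a, u_b)


# Hypothetical scores from two separate groups of participants.
group_a = [3, 5, 8, 9]
group_b = [1, 2, 4, 6]
result = mann_whitney_u(group_a, group_b)
print(result)  # (13.0, 3.0, 3.0)
```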
Significance tests are used to help us decide between the null and alternative hypotheses
Coolican (2004: 313): “Significance tests are used to help us decide between the null and alternative hypotheses.”
We are trying to find out whether the IV caused the change in the DV, or whether the results are just a fluke
p value
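The probability of obtaining results at least as extreme as those observed if the null hypothesis is true; a small p value means that rejecting the null hypothesis is unlikely to be a mistake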
Conventional cut-offs for reporting significance
0.05 (less than 5% chance of a type-1 error)
0.01 (less than 1% chance of a type-1 error)
0.001 (less than 0.1% chance of a type-1 error)
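The cut-offs above can be applied mechanically. The helper below is a hypothetical illustration, not a standard function: it reports the strictest conventional cut-off that a p value falls under.

```python
def reporting_level(p_value, cutoffs=(0.001, 0.01, 0.05)):
    """Return the strictest conventional cut-off that p_value falls below."""
    for cutoff in cutoffs:  # ordered from strictest to most lenient
        if p_value < cutoff:
            return f"p < {cutoff}"
    return "not significant"


print(reporting_level(0.03))    # p < 0.05
print(reporting_level(0.0004))  # p < 0.001
print(reporting_level(0.2))     # not significant
```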
Example of a Type-1 Error: The effects of metal eating on memory
Type-2 Error: Accepting the null hypothesis when the alternative is true
Probability of a type-2 error is referred to as beta (β)
Example of a Type-2 Error: A type-2 error resulting from too few observations
We can guard against type-2 errors by working with large samples and taking large numbers of observations from each person
Statistical Power: Increase your sample size
Having lots of trials in an experimental task or lots of items on a questionnaire increases the number of responses from each person
The probability that our test will identify an effect is known as Statistical Power
Importance of Statistical Power
Increase sample size
Less susceptibility to biasing influence of extreme scores
Results in a 'narrower' distribution of scores
Reduce variance in data
Reduce number of extreme scores
Increase significance threshold (alpha)
Increase effect size
Increasing Statistical Power
1. Increase sample size
2. Increase significance threshold (alpha), for example from 0.01 to 0.05
3. Increase effect size
Increasing Statistical Power reduces the risk of a type-2 error, but doing so by raising the significance threshold (alpha) increases the risk of a type-1 error
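To see how sample size, alpha and effect size each change power, here is a sketch that uses a two-sided one-sample z-test as a stand-in. That is an assumption made for simplicity: the non-parametric tests in these notes have no such neat formula. The function name `z_test_power` is hypothetical, and `effect_size` is the true mean shift expressed in standard deviation units.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n)  # how far the true effect moves the z statistic
    # Probability that the z statistic lands in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


baseline = z_test_power(0.5, 20, alpha=0.05)
print("baseline      ", round(baseline, 3))
print("larger sample ", round(z_test_power(0.5, 40, alpha=0.05), 3))
print("stricter alpha", round(z_test_power(0.5, 20, alpha=0.01), 3))
print("larger effect ", round(z_test_power(0.8, 20, alpha=0.05), 3))
```

With an effect size of zero the function returns alpha itself, which is the type-1 error rate: the only "detections" left are false alarms.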
Effect Size
Do we care about a 1% mean improvement in maths ability by forcing fish oil on children?
Effect sizes
Give additional information about the magnitude of an effect
Provide information about type-2 errors
Effect sizes are necessary to fully understand the value of results
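One common way to express the magnitude of a difference between two groups is Cohen's d, the difference in means divided by the pooled standard deviation. The notes do not name a specific effect size measure, so the choice of Cohen's d is an assumption made for illustration, as are the function name `cohens_d` and the scores.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scores for a treatment group and a control group.
treatment = [2, 4, 6, 8]
control = [1, 3, 5, 7]
d = cohens_d(treatment, control)
print(round(d, 3))
```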