Statistical testing

  • Descriptive statistics
    mean, median, mode, range, standard deviation, graphs
  • INFERential statistics
    • making INFERences from data collected to make generalisations to the population
    • E.g., a newspaper reports that “women are better at driving than men”. If these results are significant according to INFERential tests, then we can INFER that the difference applies to women and men in the wider population, not only to those sampled.
  • Example experiment
    • Our experiment – impact of mindfulness on levels of anxiety
    • Hypothesis – there will be a difference in the 25 participants’ anxiety scores before and after a 6-week mindfulness course – is this directional or non-directional?
    • Remember: every study has 2 hypotheses. The original (also called alternate or experimental) hypothesis: there will be a difference in the 25 participants’ anxiety scores before and after a 6-week mindfulness course
    • Null hypothesis: there will be no difference in anxiety scores before and after the 6-week mindfulness course
  • Statistical testing tells us whether our results are significant or whether they are down to chance
  • Probability (1)
    • We determine significance by working out the probability that the results are down to chance rather than the thing we have changed, e.g. manipulation of the IV.
    • All studies employ a significance level in order to check for significant differences or relationships
    • The most commonly accepted significance level is p ≤ 0.05. If the results are significant at this level, the alternate hypothesis is accepted and the null hypothesis is rejected.
  • Probability (2)
    • It means that there is a 5% or lower probability that the results occurred by chance, so we can be 95% confident that they occurred because of manipulation of the IV (p ≤ 0.05)
    • Some studies employ 0.01 (1%), particularly drug trials, where the margin of error needs to be smaller.
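
The decision rule on the two Probability cards can be sketched in a few lines of Python. This is only an illustration: the function name `decide` and the example probability of 0.03 are assumptions made for the sketch, not part of the original cards.

```python
def decide(p_value, alpha=0.05):
    """Say which hypothesis is accepted at the chosen significance level."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value <= alpha:
        # Results this unlikely to be down to chance count as significant.
        return "significant: reject the null hypothesis, accept the alternate"
    return "not significant: accept the null hypothesis"


# Hypothetical result with a 3% probability of being down to chance.
print(decide(0.03))        # judged at the usual 0.05 level
print(decide(0.03, 0.01))  # judged at the stricter 0.01 level
```

The same result can be significant at 0.05 but not at 0.01, which is why the level has to be chosen before looking at the data.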
  • Probability example:
    • Pregnancy tests are not 100% reliable and so women are often encouraged to take more than one to be more certain.
    • If a woman took 100 tests and 95 of these were positive and 5 of these were negative we could say that there is a 95% certainty that she is pregnant!
    • This would be an acceptable margin of error if we’ve decided a significance level of 0.05 is appropriate.
    • This would be statistically significant: p is equal to or less than 0.05 (p ≤ 0.05).
  • A Type I error
    • If the significance level is too lenient, e.g. 0.1 (10%), the null hypothesis can be rejected when it is in fact true (an optimistic error or false positive): the researcher claims to have found a significant difference/correlation when one does not exist.
  • A Type II error
    • If the significance level is too stringent, e.g. 0.01 (1%), the null hypothesis can be accepted when it is in fact false (a pessimistic error or false negative).
  • Importance of choosing the right significance level
    • Because researchers can never be 100% certain that they have found statistical significance, it is possible that the wrong hypothesis might be accepted.
    • The 0.05 (5%) level of significance balances the risk of making a Type I or a Type II error.
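
The trade-off between the two error types can be illustrated with a rough Python sketch. It assumes a two-tailed sign test on the 25 participants from the example experiment and works out exact binomial probabilities. The helper names and the `TRUE_EFFECT` value (75% of participants genuinely improving) are hypothetical choices for the illustration; the Type II figure depends entirely on that assumed effect.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def critical_s(n, alpha):
    """Largest S that is significant in a two-tailed sign test, or None."""
    best = None
    cumulative = 0.0
    for s in range(n // 2 + 1):
        cumulative += binom_pmf(s, n, 0.5)
        if 2 * cumulative <= alpha:
            best = s
        else:
            break
    return best


def rejection_rate(n, crit, p_plus):
    """Chance that S = min(pluses, minuses) is <= crit when each
    participant shows a plus with probability p_plus."""
    if crit is None:
        return 0.0
    return sum(
        binom_pmf(k, n, p_plus) for k in range(n + 1) if min(k, n - k) <= crit
    )


N = 25              # participants in the example experiment
TRUE_EFFECT = 0.75  # hypothetical: 75% of participants really improve


def error_rates(alpha, n=N, p_effect=TRUE_EFFECT):
    crit = critical_s(n, alpha)
    type_1 = rejection_rate(n, crit, 0.5)           # null true, but rejected
    type_2 = 1 - rejection_rate(n, crit, p_effect)  # null false, but accepted
    return type_1, type_2


for alpha in (0.10, 0.05, 0.01):
    t1, t2 = error_rates(alpha)
    print(f"alpha={alpha:.2f}  Type I={t1:.3f}  Type II={t2:.3f}")
```

Moving to a stricter level never raises the Type I rate and never lowers the Type II rate. With a small N, neighbouring levels can share the same critical value, so the rates may not change at every step.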
  • To establish significance, we need to be able to read a critical values table – these will ALWAYS be given in the exam
    • To use the critical value tables, you will need several bits of information:
    1. ‘N’ value – number of participants
    2. The calculated value – comes from the statistical test (nearly always given to you, and represented by different letters depending on the test used – S, M, T, U, etc.)
    3. The significance level – (0.05 unless told otherwise)
    4. Whether the test is one-tailed or two-tailed – (directional or non-directional hypothesis)
    5. The critical value – you will find in the table
  • Sign test: the calculated value of S must be ≤ the critical value from the table to be significant
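
As a worked illustration of the rule for S, here is a minimal Python sketch of the sign test. The anxiety scores are made-up data for 10 participants, and the critical value of 1 (N = 10, two-tailed, p ≤ 0.05) is an assumed table value; always check it against the table you are given.

```python
def sign_test(before, after):
    """Return (S, N) for paired scores using the sign test."""
    if len(before) != len(after):
        raise ValueError("each participant needs a before and an after score")
    pluses = sum(1 for b, a in zip(before, after) if a > b)
    minuses = sum(1 for b, a in zip(before, after) if a < b)
    n = pluses + minuses      # participants with no change are left out of N
    s = min(pluses, minuses)  # S is the count of the less frequent sign
    return s, n


def is_significant(s, critical_value):
    """For the sign test, S must be equal to or less than the critical value."""
    return s <= critical_value


# Hypothetical anxiety scores for 10 participants (illustrative data only).
before = [14, 12, 16, 11, 15, 13, 17, 12, 14, 16]
after = [11, 10, 13, 12, 12, 10, 14, 9, 11, 13]

s, n = sign_test(before, after)
critical_value = 1  # assumed table value: N = 10, two-tailed, p <= 0.05
print(f"S = {s}, N = {n}, significant: {is_significant(s, critical_value)}")
```

Only the direction of each change is used, not its size, which is why the sign test works with nominal data.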
  • Chi-Squared
    Test of association with nominal data using an unrelated design.
  • Sign test
    Test of difference with nominal data using a related design.
  • Mann–Whitney
    Test of difference with ordinal data using an unrelated design.
  • Spearman’s rho
    Test for correlation with ordinal data
  • t-test (unrelated and related)
    Test of difference with interval data with related or unrelated design.
  • Wilcoxon
    Test of difference with ordinal data using a related design.
  • Pearson’s r
    Test of correlation with interval data.
  • Nominal - Data is allocated to mutually exclusive categories and is discrete as it can only appear in one category.
  • Interval - Like ordinal data but based on numerical scales that include units of equal, precisely defined size
  • Ordinal - Data is ordered in some way, often in the form of a scale which consists of ratings/rankings.
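  • C S C M W S U R P test table
    The letters are the test initials, reading each row left to right:
                 Difference (unrelated)   Difference (related)   Correlation/association
    Nominal      Chi-squared              Sign test              Chi-squared
    Ordinal      Mann–Whitney             Wilcoxon               Spearman’s rho
    Interval     Unrelated t-test         Related t-test         Pearson’s r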
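
Choosing a test from the cards above comes down to three questions: the level of data, whether you are testing for a difference or a correlation/association, and (for a difference) whether the design is related or unrelated. A small Python lookup can sketch this; the names `TESTS` and `choose_test` are hypothetical, and the entries simply restate the test cards.

```python
# Keys are (level of data, aim, design); design is None for correlations.
TESTS = {
    ("nominal", "difference", "unrelated"): "Chi-squared",
    ("nominal", "difference", "related"): "Sign test",
    ("nominal", "correlation", None): "Chi-squared",
    ("ordinal", "difference", "unrelated"): "Mann-Whitney",
    ("ordinal", "difference", "related"): "Wilcoxon",
    ("ordinal", "correlation", None): "Spearman's rho",
    ("interval", "difference", "unrelated"): "Unrelated t-test",
    ("interval", "difference", "related"): "Related t-test",
    ("interval", "correlation", None): "Pearson's r",
}


def choose_test(level, aim, design=None):
    """Look up the statistical test for a level of data, aim and design."""
    if aim == "association":
        aim = "correlation"  # both share one column in this sketch
    if aim == "correlation":
        design = None        # design only matters for tests of difference
    try:
        return TESTS[(level, aim, design)]
    except KeyError:
        raise ValueError(f"no test for {level!r}, {aim!r}, {design!r}") from None


# Before/after scores from the same participants, treated as better or worse.
print(choose_test("nominal", "difference", "related"))
print(choose_test("ordinal", "correlation"))
```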
  • The critical value of (*symbol for test) when N = (*number of participants) for a (*1 or 2) tailed test at p ≤ (*significance level) is (*critical value based on look up tables).
    As the calculated value of (*symbol for test) is (*greater or less than) the critical value, we can (*accept or reject) the null hypothesis.
    Therefore, we can conclude that the results are significant/not significant
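
The statement template above can be filled in mechanically, as this rough Python sketch shows. The function name, its parameters and the example numbers are hypothetical. The `lower_is_significant` flag is an assumption added for the sketch: the cards only give the rule for S (calculated value ≤ critical value), and other tests may need the calculated value to be equal to or greater than the critical value instead.

```python
def write_conclusion(symbol, calculated, critical, n, tails, alpha=0.05,
                     lower_is_significant=True):
    """Fill in the significance statement for a calculated test value."""
    if calculated == critical:
        comparison = "equal to"
    elif calculated < critical:
        comparison = "less than"
    else:
        comparison = "greater than"

    # Rule direction depends on the test; S must be <= the critical value.
    if lower_is_significant:
        significant = calculated <= critical
    else:
        significant = calculated >= critical

    decision = "reject" if significant else "accept"
    outcome = "significant" if significant else "not significant"
    tail_word = {1: "one", 2: "two"}[tails]
    return (
        f"The critical value of {symbol} when N = {n} for a "
        f"{tail_word}-tailed test at p ≤ {alpha} is {critical}. "
        f"As the calculated value of {symbol} is {comparison} the critical "
        f"value, we can {decision} the null hypothesis. "
        f"Therefore, we can conclude that the results are {outcome}."
    )


# Hypothetical sign test result: S = 0, assumed critical value 1 for N = 10.
print(write_conclusion("S", calculated=0, critical=1, n=10, tails=2))
```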