chi-square

  • Measures of Association
    Statistical techniques used to determine the strength of the relationship between two categorical variables
  • Chi-Square
    A statistical test used to determine if there is a significant relationship between two categorical variables
  • Pearson's Chi-Square test or χ2
    1. Measures the relationship (association) between two categorical variables
    2. Shows whether the observed frequency counts differ significantly from the frequency counts expected by chance
  • Pearson's Chi-Square test or χ2
    • Has a distribution with known properties called the chi-square distribution
    • Shape determined by the degrees of freedom, df = (r - 1)(c - 1)
    • r is the number of rows, c is the number of columns
  • Pearson's Chi-Square test or χ2 (by hand)
    1. Find the critical value of the chi-square distribution for the specified df and significance level (e.g. α = .05)
    2. If the observed chi-square statistic is larger than the critical value, conclude there is a significant relationship between the two variables
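
The by-hand decision rule can be sketched in a few lines of Python. This is only an illustration: the critical values are the standard table values for α = .05 (df 1 to 5), and the function name `is_significant` and the test statistic 4.50 are made up for the example.

```python
# Upper-tail critical values of the chi-square distribution at alpha = .05,
# copied from standard statistical tables (df 1 to 5 only).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def is_significant(chi_square: float, rows: int, cols: int) -> bool:
    """Return True if the statistic exceeds the alpha = .05 critical value."""
    df = (rows - 1) * (cols - 1)
    return chi_square > CRITICAL_05[df]


# The same (hypothetical) statistic leads to different conclusions
# depending on the size of the table, because the df change.
print(is_significant(4.50, rows=2, cols=2))  # compared with 3.841
print(is_significant(4.50, rows=2, cols=3))  # compared with 5.991
```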
  • Fisher's Exact Test
    Used when the expected frequencies are too low, as the chi-square approximation is not good enough
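
As a rough sketch of what Fisher's exact test computes for a 2 × 2 table, the p-value can be obtained from the hypergeometric distribution with the row and column totals held fixed. The function name `fisher_exact_2x2` and the counts are hypothetical; the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention and may not match every package.

```python
from math import comb


def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Add up every table that is no more likely than the one observed
    # (small tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Invented counts with very low expected frequencies (all equal to 2).
p_value = fisher_exact_2x2(3, 1, 1, 3)
print(f"p = {p_value:.4f}")
```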
  • Likelihood Ratio
    • An alternative to Pearson's chi-square, based on maximum-likelihood theory
    • Preferred when samples are small
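
A minimal sketch of the likelihood ratio statistic, assuming the usual formula 2 × Σ O × ln(O / E) summed over all cells, with expected counts computed as row total × column total / N. The table and function names are invented for illustration.

```python
from math import log


def expected_counts(table):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def likelihood_ratio(table):
    """Likelihood ratio statistic: 2 * sum of O * ln(O / E)."""
    expected = expected_counts(table)
    # Cells with an observed count of 0 contribute nothing to the sum.
    return 2 * sum(o * log(o / e)
                   for obs_row, exp_row in zip(table, expected)
                   for o, e in zip(obs_row, exp_row) if o > 0)


table = [[10, 20], [20, 10]]  # hypothetical counts
print(f"likelihood ratio = {likelihood_ratio(table):.3f}")
```

Like Pearson's statistic, the result is compared against the chi-square distribution with (r - 1)(c - 1) degrees of freedom.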
  • Yates' Correction for Continuity
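    Adjustment to Pearson's chi-square formula for 2 × 2 contingency tables, typically applied when expected frequencies are small: 0.5 is subtracted from each absolute difference between observed and expected frequencies before squaring, which lowers the chi-square value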
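
To make the adjustment concrete, here is a small sketch assuming the standard form of Yates' correction, Σ (|O - E| - 0.5)² / E, for a 2 × 2 table. The counts and function names are hypothetical.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_2x2(table, yates=False):
    """Pearson's chi-square for a 2 x 2 table, optionally Yates-corrected."""
    correction = 0.5 if yates else 0.0
    expected = expected_counts(table)
    return sum((abs(o - e) - correction) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))


table = [[10, 20], [20, 10]]  # hypothetical counts
uncorrected = chi_square_2x2(table)
corrected = chi_square_2x2(table, yates=True)
print(f"Pearson = {uncorrected:.3f}, with Yates = {corrected:.3f}")
```

The corrected value is always the smaller of the two, so the corrected test is the more conservative one.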
  • Phi
    Measure of association accurate for 2 × 2 contingency tables
  • Contingency Coefficient
    Measure of association that ensures a value between 0 and 1
  • Cramer's V
    Measure of association that can attain a maximum of 1, most useful when variables have more than two categories
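
The three measures of association above can all be derived from the chi-square value and the sample size N. The sketch below assumes the standard formulas (phi = √(χ2 / N), contingency coefficient = √(χ2 / (χ2 + N)), Cramér's V = √(χ2 / (N × (k - 1))) where k is the smaller of the number of rows and columns); the input values are invented.

```python
from math import sqrt


def phi(chi_square: float, n: int) -> float:
    """Phi coefficient, intended for 2 x 2 tables."""
    return sqrt(chi_square / n)


def contingency_coefficient(chi_square: float, n: int) -> float:
    """Contingency coefficient; always falls between 0 and 1."""
    return sqrt(chi_square / (chi_square + n))


def cramers_v(chi_square: float, n: int, rows: int, cols: int) -> float:
    """Cramér's V; scaled by the smaller table dimension so it can reach 1."""
    return sqrt(chi_square / (n * (min(rows, cols) - 1)))


# Hypothetical result: chi-square = 20/3 from a 2 x 2 table with N = 60.
chi_square, n = 20 / 3, 60
print(f"phi = {phi(chi_square, n):.3f}")
print(f"C   = {contingency_coefficient(chi_square, n):.3f}")
print(f"V   = {cramers_v(chi_square, n, rows=2, cols=2):.3f}")
```

For a 2 × 2 table phi and Cramér's V coincide, because min(rows, cols) - 1 equals 1.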
  • Loglinear Analysis
    Technique for analyzing associations between several categorical variables
  • Assumptions when analyzing categorical data
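    • Independence: each case contributes to only one cell of the contingency table
    • Expected frequencies: the expected count in each cell should not be too small (below 5 is the usual warning sign)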
  • Small differences in cell frequencies can result in statistically significant associations between variables if the sample is large enough
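
The sample-size point can be checked with a short sketch. Both tables below are invented and have the same 55/45 split in each row; the second simply has ten times as many cases. The function name `pearson_chi_square` is hypothetical.

```python
def pearson_chi_square(table):
    """Pearson's chi-square for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            total += (observed - expected) ** 2 / expected
    return total


small = [[55, 45], [45, 55]]          # N = 200
large = [[550, 450], [450, 550]]      # same proportions, N = 2000

chi_small = pearson_chi_square(small)
chi_large = pearson_chi_square(large)
print(chi_small, chi_large)
```

With df = 1 and a critical value of 3.841 at α = .05, the smaller table is not significant while the larger one is, even though the pattern of proportions is identical.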
  • Types of Chi-Square Tests
    • One-variable χ2 (goodness-of-fit test)
    • χ2 test for independence: 2 × 2
    • χ2 test for independence: r × c
  • One-Variable χ2 or goodness-of-fit test
    Enables us to discover whether a set of obtained frequencies differs from an expected set of frequencies
  • Example: preference for cat breeds
    • Bengal
    • Ragdoll
    • Siamese
    • Sphynx
  • Observed frequencies
    The numbers that are found in the various categories
  • Expected frequencies
    The numbers that are expected to be found in the categories, if the null hypothesis is true
  • Calculating Chi-Square
    1. Take the expected frequency away from the observed frequency in each category
    2. Square each of these differences
    3. Divide each squared difference by a measure of variance (the expected frequency for that category)
    4. Add the resulting figures together to obtain χ2
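
The four steps can be followed line by line in a short sketch. The counts for the cat-breed example are invented, and the null hypothesis assumed here is that all four breeds are equally preferred; the df for a one-variable test are the number of categories minus 1.

```python
# Hypothetical observed counts for the cat-breed example (invented data).
observed = {"Bengal": 20, "Ragdoll": 30, "Siamese": 10, "Sphynx": 40}

# Assumed null hypothesis: every breed is equally likely to be preferred.
total = sum(observed.values())
expected = total / len(observed)

# Step 1: take the expected frequency away from each observed frequency.
differences = [count - expected for count in observed.values()]
# Step 2: square each difference.
squared = [diff ** 2 for diff in differences]
# Step 3: divide each squared difference by the expected frequency.
scaled = [sq / expected for sq in squared]
# Step 4: add the figures to get the chi-square statistic.
chi_square = sum(scaled)

df = len(observed) - 1  # categories minus 1 for a one-variable test
print(f"chi-square = {chi_square:.2f}, df = {df}")
```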
  • Degrees of freedom (DF) are usually reported along with the chi-square value and associated probability level
  • When data are entered in SPSS as frequency counts (one row per category rather than one row per case), cases must be weighted by the frequency count before performing chi-square
  • χ2
    Compares the observed frequencies with the expected frequencies
  • Finding out whether observed and expected frequencies are similar
    1. Take the expected frequency away from the observed frequency in each category
    2. Square each of these differences
    3. Divide each squared difference by a measure of variance (the expected frequency for that category)
    4. Add the resulting figures together to obtain χ2
  • Degrees of freedom (DF)
    It is usual to report the DF along with the chi-square value and associated probability level
  • Doing it in SPSS
    1. Weight cases by the frequency count: go to Data, then Weight Cases
    2. Select "Weight cases by" and move Frequency from the left box to the Frequency Variable box
    3. Click on Analyze, Nonparametric Tests, Legacy Dialogs and Chi-square
  • χ2 test for independence: 2 × 2
    Discovers whether there is a relationship or association between 2 categorical variables
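
For a test of independence the expected frequencies are not given in advance; they are worked out from the table margins as row total × column total / N. The sketch below uses invented counts for two yes/no variables (labelled drink and smoke purely for illustration).

```python
# Hypothetical 2 x 2 counts (invented): rows = drink yes/no, columns = smoke yes/no.
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Same recipe as before: (observed - expected) squared, divided by expected, summed.
chi_square = sum((o - e) ** 2 / e
                 for obs_row, exp_row in zip(observed, expected)
                 for o, e in zip(obs_row, exp_row))

df = (len(observed) - 1) * (len(observed[0]) - 1)
print(expected)
print(f"chi-square = {chi_square:.3f}, df = {df}")
```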
  • Doing it in SPSS for 2 × 2 table
    1. Weight cases by the frequency count first
    2. Choose Analyze, Descriptive Statistics and Crosstabs
    3. Move 'drink' variable to the row box and 'smoke' variable to the column box
    4. Click Statistics and check Chi-square and Phi and Cramer's V
    5. Click Cells and check Observed and Expected in Cell Display
  • If the assumption of χ2 is broken (>25% of cells have expected frequency <5), Fisher's Exact Probability Test is calculated instead
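
A quick way to check this assumption before interpreting χ2 is to compute the expected counts and see what share of cells falls below 5. The sketch below applies the 25% rule stated above; the table and the function name `share_of_low_expected` are hypothetical.

```python
def share_of_low_expected(table, threshold=5):
    """Proportion of cells whose expected count is below the threshold."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [r * c / n for r in row_totals for c in col_totals]
    low = sum(1 for e in expected if e < threshold)
    return low / len(expected)


table = [[3, 2], [4, 11]]  # invented counts, N = 20
share = share_of_low_expected(table)

# Rule from the card above: more than 25% of cells below 5 breaks the assumption.
use_fisher = share > 0.25
print(f"{share:.0%} of cells have expected frequency < 5; use Fisher: {use_fisher}")
```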
  • Degrees of Freedom in χ2
    • DF = (number of rows minus 1) × (number of columns minus 1)
    • A 2 × 2 table will always have a DF of 1
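  • χ2 test for independence: r × c
    Works the same way as the 2 × 2 test; the DF are still (rows minus 1) × (columns minus 1), but they are greater than 1 once there are more than two rows or columns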
  • Effect Size
    • Cramér's V is an adequate effect size measure
    • Odds ratio is a more common and useful measure of effect size for categorical data
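
For a 2 × 2 table the odds ratio is simple to compute by hand: work out the odds of the outcome in each row and divide one by the other. The counts and the function name below are invented for illustration.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for the 2 x 2 table [[a, b], [c, d]]."""
    odds_row1 = a / b  # odds of the first column in row 1
    odds_row2 = c / d  # odds of the first column in row 2
    return odds_row1 / odds_row2


# Hypothetical counts: row 1 = 30 vs 10, row 2 = 20 vs 40.
ratio = odds_ratio(30, 10, 20, 40)
print(f"odds ratio = {ratio:.1f}")
```

An odds ratio of 1 means the odds are the same in both rows; the further it is from 1, the stronger the association. Note that this simple version fails if any cell is 0.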
  • Assumptions for χ2:
  • χ2 test
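    • χ2 values are always positive because the statistic is a sum of squared differences, so the test uses only the upper tail of the sampling distribution (one-tailed)
    • The hypothesis itself can still be two-tailed (no direction predicted) or one-tailed (direction predicted based on previous theory)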
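
The link between the one-tailed distribution and a two-tailed hypothesis can be illustrated for df = 1, where χ2 is the square of a standard normal z score. Under that assumption the upper-tail probability of χ2 equals the two-tailed probability of z, which the standard library can compute with `erfc`. This sketch covers df = 1 only; the function name is hypothetical.

```python
from math import erfc, sqrt


def p_value_df1(chi_square: float) -> float:
    """Upper-tail probability of a chi-square value with 1 degree of freedom.

    For df = 1, chi-square = z squared, so P(chi2 > x) = P(|z| > sqrt(x)),
    which equals erfc(sqrt(x / 2)).
    """
    return erfc(sqrt(chi_square / 2))


# The tabled critical value for df = 1 at alpha = .05 is 3.841.
print(f"p = {p_value_df1(3.841):.3f}")
```

Because squaring removes the sign of z, departures in either direction land in the same upper tail of χ2.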