2.3 Statistics for Two Categorical Variables

    Cards (60)

    • Categorical variables are variables whose values can be grouped into distinct categories or classes
    • What data is needed to create a contingency table?
      Two categorical variables
    • Conditional probability is the probability of an event occurring given that another event has already occurred
    • What is a contingency table used for?
      Organize data on categorical variables
    • A contingency table can only have two categories for each variable
      False
    • What is the first step in testing for statistical independence using a contingency table?
      Calculate conditional probabilities
    • Steps to calculate expected frequencies
      1️⃣ Identify the row and column totals
      2️⃣ Multiply the row and column totals
      3️⃣ Divide the result by the grand total
    • Steps to calculate the expected frequency for males who like math in the given example:
      1️⃣ Identify the row total for males
      2️⃣ Identify the column total for likes math
      3️⃣ Identify the grand total
      4️⃣ Apply the formula: (Row Total × Column Total) / Grand Total
    • What is the purpose of a contingency table?
      Organize categorical data
    • The probability that a student is male given they prefer math is calculated as P(Male|Math) = 45 / 95
    • What is the probability that a male likes math, given the data in the contingency table?
      0.45
    • What do expected frequencies represent in a contingency table?
      Frequencies under independence
    • Expected frequencies are calculated under the assumption that categorical variables have no relationship.

      True
    • The Chi-Square test for independence assesses whether two categorical variables are statistically independent.
      True
    • How are degrees of freedom calculated in the Chi-Square test?
      (Rows - 1) × (Columns - 1)
    • With \( df = 2 \) and a p-value < 0.05, we conclude that gender and preferred subject are dependent.

      True
    • Match the type of categorical variable with its characteristic:
      Nominal ↔️ No inherent order
      Ordinal ↔️ Categories have a specific order
    • What is the formula for conditional probability denoted as?
      P(A|B)
    • What is the purpose of a contingency table?
      Organize categorical data
    • Steps to create a contingency table
      1️⃣ Identify the two categorical variables
      2️⃣ List the categories for rows and columns
      3️⃣ Record the count in each cell
      4️⃣ Include row and column totals
    • Two categorical variables are statistically independent if P(A|B) = P(A)
      True
    • The formula to calculate expected frequency for a cell is (Row Total × Column Total) / Grand Total
    • Expected frequencies represent the hypothetical number of observations in a contingency table if the variables are statistically independent.

      True
    • Match the type of categorical variable with its characteristic:
      Nominal ↔️ No inherent order
      Ordinal ↔️ Specific order or ranking
    • What does the notation P(A|B) represent in conditional probability?
      Probability of A given B
    • In the example provided, gender and math preference are statistically dependent
    • Gender and math preference are statistically independent based on the given probabilities.
      False
    • The denominator in the formula for calculating expected frequencies is the Grand Total.
    • The expected frequency for females who like English is 35.
    • Observed frequencies in the Chi-Square test are denoted as \( O_{ij} \).
      True
    • What is the approximate value of the Chi-Square test statistic in the example provided?
      10.54
    • Nominal variables have categories that are distinct but have no inherent order
    • A contingency table shows how many male and female students prefer subjects
    • Match the categorical variable type with its characteristic:
      Nominal ↔️ No inherent order
      Ordinal ↔️ Specific order or ranking
    • What is the first step in creating a contingency table?
      Identify the two categorical variables
    • To compute conditional probabilities from a contingency table, you divide the cell count by the total count of the condition
    • What do expected frequencies represent in a contingency table?
      Frequencies under independence
    • The formula to calculate the expected frequency for a cell is (Row Total × Column Total) / Grand Total
    • Nominal variables have categories with no inherent order
    • Steps to create a contingency table:
      1️⃣ Identify the two categorical variables
      2️⃣ List the categories for each variable as rows and columns
      3️⃣ Record the count for each cell
      4️⃣ Include row and column totals
    See similar decks