Stats

Cards (42)

  • Hypothesis Testing
    is a process wherein we make decisions in evaluating claims about the population based on the characteristics of a sample taken from the same population.
  • hypothesis
    is a proposed explanation, assertion, or assumption about a population parameter or about the distribution of a random variable.
  • 6 Steps of Hypothesis Testing
    1. State the null and alternative hypotheses. 2. Select the level of significance. 3. Select the statistical tool (z-test, t-test, or z-test via the CLT). 4. Formulate the decision rule. 5. Compute the value of the test statistic. 6. Make a decision.
  • Null Hypothesis

    is a statement about the value of a population parameter formulated with the hope of it being rejected. It is usually denoted by Ho.
  • One Tailed Test
    A test whose alternative hypothesis is one-sided: "greater than" (>) or "less than" (<)
  • Null Hypothesis
    always involves an equality symbol
  • alternative hypothesis

    contains the inequality symbol
  • Type I Error
    Rejecting the null hypothesis when it is true
  • Type II Error

    Failing to reject the null hypothesis when it is false
  • Correct Decision
    Fail to Reject Ho, when it is true
  • Correct Decision
    Reject Ho, when it is false
  • level of significance
    the probability of rejecting the null hypothesis when it is true.
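A small simulation can make this definition concrete. The sketch below assumes a hypothetical normal population for which the null hypothesis is actually true, runs many two-tailed z-tests at α = 0.05, and counts how often the null is (wrongly) rejected. The function name `type_i_error_rate` and all parameter values are illustrative assumptions, not part of the original cards.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, trials=20_000, seed=0):
    """Estimate how often a two-tailed z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    mu0, sigma = 0.0, 1.0  # hypothetical population; H0: mu = mu0 is true here
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu0) / (sigma / n ** 0.5)
        if abs(z) > z_crit:  # z fell in the rejection region: a Type I error
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"estimated Type I error rate: {rate:.3f}")
```

Under these assumptions the estimated rate should land close to the chosen α, which is what "the probability of rejecting the null hypothesis when it is true" means in practice.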
  • Test Statistic
    Any function of the observed data whose numerical value dictates whether the null hypothesis is accepted or rejected
  • Decision Rule
    A hypothesis test where the alternative hypothesis is one-sided is called a one-tailed test. If the alternative hypothesis is two-sided, then we call it a two-tailed test.
    The set of values of the test statistic that result in the rejection of the null hypothesis is called the critical region or the region of rejection. The point that separates the rejection region from the acceptance region is called the critical value.
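As a rough illustration of how the critical value depends on the number of tails, the sketch below looks up z critical values from the standard normal distribution using Python's standard library. The helper name `z_critical` and its `tail` labels are assumptions made for this example.

```python
from statistics import NormalDist

def z_critical(alpha, tail):
    """Return the critical value(s) of a z-test as a tuple.

    tail: "right" (H1 uses >), "left" (H1 uses <), or "two" (H1 uses !=).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)
    if tail == "two":
        # alpha is split evenly between the two tails
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'right', 'left', or 'two'")

for tail in ("right", "left", "two"):
    print(tail, [round(c, 3) for c in z_critical(0.05, tail)])
```

At α = 0.05 this gives roughly 1.645 for a right-tailed test and ±1.96 for a two-tailed test, matching the values in a standard z table.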
  • Population
    A population includes all the elements from a set of data
  • Population Mean
    The mean of all values in the population. If the sample is randomly selected and the sample size is large, then the sample mean is a good estimate of the population mean
  • A sample
    consists of one or more observations drawn from the population.
  • Population Standard Deviation

    A parameter: a measure of variability with a fixed value, calculated from every individual in the population.
  • Population Variance
    indicates how the population data points are spread out. It is the average of the squared distances from each data point in the population to the mean
  • Sample Standard Deviation
    A statistic, meaning that this measure of variability is calculated from only some of the individuals in a population.
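The difference between the population and sample versions of these measures can be seen with a few numbers. This is a minimal sketch using Python's `statistics` module; the data values are made up for illustration, and the sample is simply a hand-picked subset rather than a true random draw.

```python
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical population of 8 values (mean = 5)
population = [2, 4, 4, 4, 5, 5, 7, 9]

# Population parameters: divide the sum of squared deviations by N
print("population variance:", pvariance(population))  # 4
print("population std dev: ", pstdev(population))     # 2.0

# A sample uses only some of the individuals; the sample
# variance divides by n - 1 instead of n
sample = [2, 4, 5, 9]
print("sample variance:", variance(sample))
print("sample std dev: ", stdev(sample))
```

The population values are fixed, while the sample values would change if a different subset were drawn.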
  • z-test

    used when the data are normally distributed, the sample size is greater than or equal to 30 (n ≥ 30), and the population variance is known. $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
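To see the z formula in action, here is a short worked computation. It assumes the numerator uses the sample mean, and the numbers (sample mean 52, hypothesized mean 50, population standard deviation 6, sample size 36) are hypothetical.

```python
from math import sqrt

def z_statistic(sample_mean, mu0, sigma, n):
    """z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error

z = z_statistic(sample_mean=52.0, mu0=50.0, sigma=6.0, n=36)
print(z)  # 2.0
```

Here the standard error is 6 / √36 = 1, so the sample mean lies 2 standard errors above the hypothesized mean.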
  • t-test

    used when the population variance (or standard deviation) is unknown and the sample size is less than 30 (n < 30).
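A sketch of the corresponding computation, assuming the usual one-sample form t = (x̄ − μ₀) / (s / √n), where s is the sample standard deviation. The sample values and the hypothesized mean of 12 are invented for illustration; the critical value 2.262 is the standard t-table entry for 9 degrees of freedom at α = 0.05, two-tailed.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """One-sample t statistic using the sample standard deviation."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))

# Hypothetical small sample (n = 10 < 30), population variance unknown
sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
t = t_statistic(sample, mu0=12)

T_CRITICAL = 2.262  # t table: df = n - 1 = 9, alpha = 0.05, two-tailed
decision = "reject H0" if abs(t) > T_CRITICAL else "fail to reject H0"
print(round(t, 3), decision)
```

With these numbers the sample mean is 13.5 and the standard error is 0.5, so t is about 3, which is beyond the critical value.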
  • central limit theorem
    states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population, then the distribution of the sample means will be approximately normal. Hence, as the sample size gets larger, the distribution of the sample mean approaches a normal distribution, even when the data themselves are not normally distributed. In this case, we can use the z-test, and if the population variance or standard deviation is unknown, we can use the sample variance instead.
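The theorem can be checked informally by simulation. The sketch below assumes a deliberately skewed population (exponential with mean 1 and standard deviation 1), draws many samples of size 40, and looks at how the sample means behave. The sample size, number of samples, and seed are arbitrary choices for this example.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(42)
n = 40               # size of each sample
num_samples = 5000   # how many samples to draw
mu, sigma = 1.0, 1.0  # mean and std dev of an exponential(rate=1) population

# Collect the mean of each random sample
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)
standard_error = sigma / sqrt(n)

# Share of sample means within 1.96 standard errors of mu;
# for a normal distribution this would be about 95%
within = sum(abs(m - mu) <= 1.96 * standard_error for m in sample_means)
share_within = within / num_samples

print(f"mean of sample means: {mean_of_means:.3f} (population mean {mu})")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {standard_error:.3f})")
print(f"share within 1.96 SE: {share_within:.3f}")
```

Even though the individual values are far from normal, the sample means cluster around μ with a spread close to σ/√n.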
  • Hypothesis Testing Steps 4 to 6

    1. Step 4: Formulate a decision rule
    2. Step 5: Compute the test statistic
    3. Step 6: Make a Decision
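Steps 4 to 6 can be sketched together for a z-test. This is only an illustration under stated assumptions: the function name `z_test_decision`, the `tail` labels, and the example numbers are hypothetical, and a known population standard deviation is assumed.

```python
from math import sqrt
from statistics import NormalDist

def z_test_decision(sample_mean, mu0, sigma, n, alpha=0.05, tail="two"):
    """Apply steps 4 to 6 of a z-test; return (z, critical value, decision)."""
    std_normal = NormalDist()

    # Step 4: decision rule. Find the critical value for the chosen tail.
    if tail == "two":
        critical = std_normal.inv_cdf(1 - alpha / 2)
    elif tail == "right":
        critical = std_normal.inv_cdf(1 - alpha)
    elif tail == "left":
        critical = std_normal.inv_cdf(alpha)
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")

    # Step 5: compute the test statistic.
    z = (sample_mean - mu0) / (sigma / sqrt(n))

    # Step 6: decide by checking whether z falls in the rejection region.
    if tail == "two":
        reject = abs(z) > critical
    elif tail == "right":
        reject = z > critical
    else:
        reject = z < critical
    decision = "reject H0" if reject else "fail to reject H0"
    return z, critical, decision

print(z_test_decision(52.0, 50.0, 6.0, 36, alpha=0.05, tail="two"))
print(z_test_decision(52.0, 50.0, 6.0, 36, alpha=0.01, tail="two"))
```

With the same data, z = 2.0 leads to rejection at α = 0.05 but not at α = 0.01, which shows how the chosen significance level shapes the decision.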
  • Null hypothesis
    The hypothesis that is assumed to be true until evidence indicates otherwise
  • Alternative hypothesis
    The hypothesis that is accepted if the null hypothesis is rejected
  • Significance level

    • Usual significance level in research or social science research is 5% (α = 0.05)
    • In health and other applied sciences, 1% significance level (α = 0.01) is used
    • If the alternative hypothesis requires a two-tailed test (≠), α is divided by 2 when determining the critical region, so that each tail has area α/2
  • Test statistic
    The value calculated from the sample data that is used to determine whether to reject or not reject the null hypothesis
  • The rejection region (or critical region) is the set of all values of the test statistic that cause us to reject the null hypothesis
  • The non-rejection region (or acceptance region) is the set of all values of the test statistic that cause us to fail to reject the null hypothesis
  • The critical value is a point (boundary) on the test distribution that is compared to the test statistic to determine if the null hypothesis would be rejected
  • Univariate data

    Data that involve one variable
  • Bivariate data

    Data that involve two variables
  • Correlation analysis
    The statistical procedure used to determine and describe the relationship between two variables
  • Scatter plot
    Graph of two variables in a rectangular coordinate plane displaying a relationship between the two variables
  • Input variable (x)

    Independent variable; it is not affected by the other variable
  • Output variable (y)

    Dependent variable, results from the controlled variable, affected by changes in the independent variable
  • Scatter plot
    • Form (shape) determines the shape of the correlation of the variables
    • Trend (direction) determines the direction of the points: whether the variables have positive, negative, or no correlation
    • Variation (strength) determines whether the variables have no, weak, moderate, strong, or perfect correlation
  • Pearson's sample correlation coefficient (r)
    A test statistic that measures the strength of the linear relationship between two variables
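One common way to compute r is to divide the sum of cross-deviations by the square root of the product of the sums of squared deviations. The sketch below implements that formula directly; the function name `pearson_r` and the data (hours studied versus quiz score) are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-deviations and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sxy / sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]    # input variable (x)
scores = [2, 4, 5, 4, 5]   # output variable (y)
print(round(pearson_r(hours, scores), 4))  # 0.7746
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak or no linear relationship.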