from the onl test

    • Type I error (α): This occurs when the null hypothesis is true, but you incorrectly reject it. The probability of making a Type I error is denoted by the significance level α.
    • Type II error (β): This occurs when the null hypothesis is false, but you fail to reject it. The probability of making a Type II error is denoted by β.
    • Power (1 - β): This is the probability of correctly rejecting the null hypothesis when it is false.
    • The probability you reject the null hypothesis when in fact the null hypothesis is true is called _
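The link between the significance level and the Type I error rate can be checked with a small simulation. This is a minimal sketch, not part of the original cards: it assumes a two-sided z-test with a known standard deviation of 1, and the function name, sample size, trial count, and seed are all illustrative choices.

```python
import random
from statistics import NormalDist, mean


def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Estimate how often a two-sided z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    # Critical value for a two-sided test at level alpha (about 1.96 for 0.05).
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The null hypothesis is true here: data really come from mean 0, sd 1.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = mean(sample) / (1 / n ** 0.5)  # standard error uses the known sd = 1
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true null is a Type I error
    return rejections / trials


rate = type_i_error_rate()
print(rate)  # expected to land near alpha
```

Because the null hypothesis is true in every trial, each rejection is a Type I error, so the printed proportion should be close to the chosen α.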
    • Discrete data: This type of data consists of distinct, separate values that can be counted. Discrete data usually takes integer values and typically arises from counting items, such as the number of colors in a box of crayons.
    • Continuous data: This type of data can take any value within a given range and is often associated with measurements, such as height, weight, or temperature, which can have fractional or decimal values.
    • Neither: This option would apply if the data does not fit into the categories of discrete or continuous, but in this case, the data clearly fits as discrete.
  • Nominal
    The most basic level, which involves naming or labeling data without any quantitative value. Examples include gender, race, or types of animals.
  • Ordinal
    This level involves ordering or ranking items, but the intervals between the items are not necessarily equal. Examples include class rankings or levels of satisfaction.
  • Interval
    This level involves ordered data with equal intervals between values, but there is no true zero point. Examples include temperature in Celsius or Fahrenheit.
  • Ratio
    This is the highest level of measurement and involves ordered data with equal intervals and a true zero point, which allows for meaningful comparisons of magnitudes. Examples include height, weight, and monthly amounts of rain.
  • Nominal
    • gender, race, types of animals
  • Ordinal
    • class rankings, levels of satisfaction
  • Interval
    • temperature in Celsius or Fahrenheit
  • Ratio
    • height, weight, monthly amounts of rain
    • Unimodal: Only one mode.
    • Bimodal: Two modes.
    • Multimodal: More than two modes.
    • No mode: No value repeats.
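The four labels above can be sketched as a small classifier. This is an illustrative helper, not from the original cards; `classify_modes` is a hypothetical name, and it treats a dataset as having no mode only when no value repeats.

```python
from collections import Counter


def classify_modes(values):
    """Return (label, modes) for a non-empty list of values."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Every value occurs exactly once, so nothing repeats.
        return "no mode", []
    # The modes are all values that share the highest count.
    modes = sorted(v for v, c in counts.items() if c == top)
    label = {1: "unimodal", 2: "bimodal"}.get(len(modes), "multimodal")
    return label, modes


print(classify_modes([1, 2, 2, 3]))           # ('unimodal', [2])
print(classify_modes([1, 1, 2, 2, 3]))        # ('bimodal', [1, 2])
print(classify_modes([1, 1, 2, 2, 3, 3, 4]))  # ('multimodal', [1, 2, 3])
print(classify_modes([1, 2, 3]))              # ('no mode', [])
```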
  • The z-score of a score is the difference between the score and the mean, divided by the standard deviation. A z-score closer to 0 indicates a value closer to the mean (the specifications).
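As a worked example, the z-score of a value x is (x - mean) / standard deviation. The sketch below assumes the data are a whole population, so it uses the population standard deviation; the sample data are made up for illustration.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return the z-score of every value in the list."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; use statistics.stdev for a sample
    return [(x - mu) / sigma for x in values]


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5, population SD is 2
z = z_scores(scores)
print(z)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

The two values equal to the mean get a z-score of 0, and the value 9 sits two standard deviations above the mean.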
  • Frequency Table
    A frequency table lists the values in a dataset and shows how often each value occurs. It consists of two columns:
    1. Value (or Category): The different values or categories present in the dataset.
    2. Frequency: The number of times each value or category appears in the dataset.
  • Relative Frequency Table
    A relative frequency table also lists the values in a dataset, but it shows the proportion (or percentage) of the total number of observations that each value represents. It consists of two columns:
    1. Value (or Category): The different values or categories present in the dataset.
    2. Relative Frequency: The frequency of each value divided by the total number of observations.
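Both tables can be built in one pass over the data. This is a minimal sketch with made-up data; `frequency_table` is a hypothetical helper that maps each value to its frequency and its relative frequency.

```python
from collections import Counter


def frequency_table(values):
    """Map each value to (frequency, relative frequency)."""
    counts = Counter(values)
    total = len(values)
    return {v: (c, c / total) for v, c in sorted(counts.items())}


colors = ["red", "blue", "red", "green", "blue", "red", "red", "blue"]
table = frequency_table(colors)
for value, (freq, rel) in table.items():
    print(f"{value:<6} {freq:>3} {rel:>6.3f}")
```

The relative frequencies always sum to 1, which is a quick check that the table was built correctly.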
  • marginal distribution = row totals (or column totals) of a two-way table, i.e. the totals that appear in the table's margins
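A short sketch of reading marginal distributions off a two-way table follows. The table and its category names are hypothetical, invented only to show that the margins hold both the row totals and the column totals.

```python
# Hypothetical two-way table: drink choice (rows) by time of day (columns).
two_way = {
    "coffee": {"morning": 30, "evening": 10},
    "tea": {"morning": 15, "evening": 25},
}

# Row totals give the marginal distribution of the row variable.
row_totals = {row: sum(cols.values()) for row, cols in two_way.items()}

# Column totals give the marginal distribution of the column variable.
col_totals = {}
for cols in two_way.values():
    for col, count in cols.items():
        col_totals[col] = col_totals.get(col, 0) + count

grand_total = sum(row_totals.values())
row_marginal = {row: t / grand_total for row, t in row_totals.items()}

print(row_totals)    # {'coffee': 40, 'tea': 40}
print(col_totals)    # {'morning': 45, 'evening': 35}
print(row_marginal)  # {'coffee': 0.5, 'tea': 0.5}
```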
    • Correlation: This refers to the relationship between two variables. A positive correlation means as one variable increases, the other tends to increase as well. Conversely, a negative correlation indicates that as one variable increases, the other tends to decrease.
    • Causation: This implies that one variable directly influences the other. Correlation doesn't necessarily prove causation. Other factors might be affecting both variables.
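The direction of a correlation can be measured with the Pearson correlation coefficient, which the cards do not name explicitly, so treat its use here as an assumption. The sketch computes it by hand on made-up data; a result near +1 indicates a positive correlation and a result near -1 a negative one. It says nothing about causation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


hours = [1, 2, 3, 4]
rising = [2, 4, 6, 8]   # increases as hours increase
falling = [8, 6, 4, 2]  # decreases as hours increase

print(pearson_r(hours, rising))   # close to +1
print(pearson_r(hours, falling))  # close to -1
```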
  • Linear Regression:
    • This is a statistical method to model the relationship between a dependent variable (what you're trying to predict) and one or more independent variables (factors you think influence the dependent variable).
    • In this case, the dependent variable is Sales and the independent variable is Number of Sales People Working.
    • Linear regression fits a straight line (regression line) through the data points to estimate the dependent variable based on the independent variable.
    • Slope (b1): This represents the average change in the dependent variable (Sales) for every one-unit increase in the independent variable (Number of Sales People Working).
    • Intercept (b0): This is the predicted value of the dependent variable (Sales) when the independent variable (Number of Sales People Working) is zero.
    • Regression Equation: This equation expresses the relationship between the dependent and independent variables. It typically follows the form:
    Dependent Variable = b1 * Independent Variable + b0
  • regression equation: Dependent Variable = b1 * Independent Variable + b0
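The slope and intercept can be computed directly with ordinary least squares. This is a minimal sketch that assumes a single independent variable; the sales figures are invented for illustration and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Return (b1, b0) for the least-squares line y = b1 * x + b0."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Slope: covariation of x and y divided by the variation of x.
    b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    # Intercept: forces the line through the point (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b1, b0


sales_people = [1, 2, 3, 4, 5]  # hypothetical Number of Sales People Working
sales = [12, 19, 29, 37, 45]    # hypothetical Sales

b1, b0 = fit_line(sales_people, sales)
predicted = b1 * 6 + b0  # predicted Sales with 6 sales people working
print(round(b1, 2), round(b0, 2), round(predicted, 2))  # 8.4 3.2 53.6
```

On this made-up data, each additional sales person is associated with about 8.4 more units of Sales on average, and the line predicts about 3.2 when no one is working.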
  • sum(x) The sum.
    mean(x) The mean.
    median(x) The median.
    quantile(x) Percentage quantiles.
    rank(x) The rank of each element.
    var(x) The variance.
    sd(x) The standard deviation.
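The function names above look like R's base summary functions, although the cards do not name a language. As a hedged cross-check, the sketch below computes the same quantities with Python's standard `statistics` module. Two assumptions are worth flagging: `rank` here is a hand-written helper that averages tied ranks, and the `inclusive` quantile method is chosen because it interpolates between the minimum and maximum of the data.

```python
from statistics import mean, median, quantiles, stdev, variance


def rank(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(values)
    return [(2 * order.index(v) + order.count(v) + 1) / 2 for v in values]


x = [2, 4, 4, 5, 7, 9]

total = sum(x)                                      # sum(x)
average = mean(x)                                   # mean(x)
middle = median(x)                                  # median(x)
quartiles = quantiles(x, n=4, method="inclusive")   # three quartile cut points
ranks = rank(x)                                     # rank(x)
var_x = variance(x)                                 # sample variance (divides by n - 1)
sd_x = stdev(x)                                     # sample standard deviation

print(total, middle, quartiles, ranks)
# 31 4.5 [4.0, 4.5, 6.5] [1.0, 2.5, 2.5, 4.0, 5.0, 6.0]
```

The standard deviation is the square root of the variance, so `sd_x ** 2` should match `var_x` up to rounding.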