NCMB 315 Week 7

Cards (57)

  • Four purposes of quantitative data analysis:
    • To describe data
    • To estimate population values
    • To test hypotheses
    • To provide evidence regarding measurement properties of quantified variables.
  • Levels of measurement
    • Nominal
    • Ordinal
    • Interval
    • Ratio
  • Nominal
    Lowest level; uses numbers simply to categorize attributes. The numerical value is only a label, not a quantity
  • Ordinal
    Ranks people on attributes, categories imply some sort of ranking
  • Interval
    Ranks people on an attribute and specifies the distance between them; no true zero value (arbitrary zero)
  • Ratio
    Highest level of measurement, has a meaningful zero and provides information about the absolute magnitude of the attribute
  • Descriptive statistics
    Used to synthesize and describe data, provides simple description and summary about the sample and observations
  • Parameter
    Population value
  • Statistic
    Sample value
  • Univariate
    One variable
  • Symmetrical distribution
    When folded over, the two halves of a frequency polygon would be superimposed
  • Normal distribution
    Bell-shaped curve; symmetrical, unimodal, Gaussian
  • Asymmetrical distribution
    • (+) skew - longer tail points to the right
    • (-) skew - longer tail points to the left
  • Central tendency
    Describes the typical value of a distribution; provides an overall summary but does not show how the data are spread
  • Indexes of central tendency
    • Mode
    • Median
    • Mean
  • Mode
    The numerical value that occurs most frequently; the most popular value
  • Median
    Middle value; depends on rank position rather than the size of individual values, so it is insensitive to extremes
  • Mean
    The sum of all values divided by the number of participants; the most stable index of central tendency
  • Variability (dispersion)
    How spread out the values are, e.g. how much they differ from the mean
  • Range
    The highest score minus the lowest score in a distribution
  • Standard deviation
    Captures the degree to which the scores deviate from one another, shows the homogeneity or heterogeneity of the dataset
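The indexes of central tendency and variability defined above can be computed with Python's standard `statistics` module. This is a minimal sketch on a made-up list of scores (the data are hypothetical, not from the course); it also swaps in one extreme value to show why the median is called insensitive to extremes while the mean is not.

```python
import statistics

# Hypothetical sample of nine scores (illustrative data only)
scores = [2, 3, 3, 4, 5, 5, 5, 7, 11]

mode = statistics.mode(scores)           # most frequent value
median = statistics.median(scores)       # middle value of the sorted scores
mean = statistics.mean(scores)           # sum of values / number of values
value_range = max(scores) - min(scores)  # highest score minus lowest score
sd = statistics.stdev(scores)            # sample standard deviation (n - 1 denominator)

print(mode, median, mean, value_range, round(sd, 2))

# Replace the highest score with an extreme value:
# the median stays the same, the mean is pulled upward.
with_outlier = [2, 3, 3, 4, 5, 5, 5, 7, 110]
median_outlier = statistics.median(with_outlier)
mean_outlier = statistics.mean(with_outlier)
print(median_outlier, mean_outlier)
```

Here the mode, median, and mean of the first list are all 5, and the range is 9; with the extreme value the median is still 5 but the mean rises to 16.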
  • Bivariate
    Two variables
  • Crosstabulations
    A two-dimensional frequency distribution in which the frequencies of two variables are crosstabulated
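A crosstabulation can be sketched in a few lines by counting how often each pair of categories occurs. The variables (sex and smoking status) and the six records below are hypothetical, chosen only to illustrate the idea.

```python
from collections import Counter

# Hypothetical records: (sex, smoking status) for six participants
records = [
    ("F", "smoker"), ("F", "nonsmoker"), ("M", "smoker"),
    ("M", "smoker"), ("F", "nonsmoker"), ("M", "nonsmoker"),
]

# Each cell of the crosstab is the frequency of one (row, column) pair
table = Counter(records)

rows = sorted({sex for sex, _ in records})
cols = sorted({status for _, status in records})

# Print the two-dimensional frequency table
print("sex", *cols)
for sex in rows:
    print(sex, *(table[(sex, status)] for status in cols))
```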
  • Correlation
    Used to describe the relationship between two variables - to what extent are the two variables related to each other
  • Pearson's r
    The product-moment correlation coefficient, the most widely used correlation statistic, computed with continuous measures
  • Spearman's Rho
    A correlation index used for ordinal level data or when sample sizes are very small
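The two correlation indexes can be compared on the same data. The sketch below implements Pearson's r from its definition and computes Spearman's rho as Pearson's r applied to ranks (with tied values given their average rank). The function names and the data are illustrative assumptions, not part of the course material.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: y always increases with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r, 3), round(rho, 3))
```

Because the relationship is perfectly monotonic but curved, rho is 1 while r is slightly below 1 (about 0.98).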
  • Inferential statistics
    Based on the laws of probability, provide a means for drawing inferences about a population, given data from a sample
  • Parameter estimation
    Used to estimate a population parameter, e.g. a mean, a proportion, or a difference in means between two groups
  • Point estimation
    Involves calculating a single statistic to estimate the parameter
  • Interval estimation
    Provides a range of values within which the parameter has a specified probability of lying (dependent on confidence level)
  • Confidence interval (CI)
    An interval estimate based on a chosen confidence level
  • Hypothesis testing
    Uses sample data to decide whether to reject the null hypothesis; the decision is subject to Type I and Type II errors
  • Type I error
    False positive; occurs if an investigator rejects a null hypothesis that is actually true (one that should have been retained)
  • Type II error
    False negative; occurs if an investigator fails to reject a null hypothesis that is actually false (one that should have been rejected)
  • Confidence interval
    • Provides a range of values within which the parameter has a specified probability of lying (dependent on confidence level)
    • Used in sampling computation
    • Based on confidence level (most commonly 95%); the same interval approach is used in parameter estimation
  • Parameter estimation
    When the population SD is unknown, the interval estimate can be determined using Student's t-distribution
  • Sample problem
    • The mean age of a sample of 25 students is 18 years, and the standard deviation is 1.3 years. Find the interval estimate of the population mean using a 95% confidence level (critical t = 2.064, df = 24)
  • Degree of freedom
    n - 1
  • Margin of error formula
    E = t × s / √n, where E = margin of error, t = critical t statistic, s = sample SD, n = sample size
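The sample problem from these cards (n = 25, mean = 18 years, SD = 1.3 years, 95% confidence level) can be worked through numerically. The margin of error for a t-based interval is commonly computed as t times s divided by the square root of n; the sketch below assumes that formula and takes the critical value 2.064 from the card rather than looking it up. Variable names are illustrative.

```python
import math

n = 25           # sample size
mean_age = 18.0  # sample mean (years)
s = 1.3          # sample standard deviation (years)
t_crit = 2.064   # critical t for 95% confidence, as given in the card

df = n - 1                                  # degrees of freedom
margin_of_error = t_crit * s / math.sqrt(n)

lower = mean_age - margin_of_error
upper = mean_age + margin_of_error

print(df, round(margin_of_error, 3), round(lower, 2), round(upper, 2))
```

With these inputs the margin of error is about 0.54 years, so the 95% interval estimate of the population mean age runs from roughly 17.46 to 18.54 years.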
  • Type II error
    False negative; occurs if an investigator accepts (fails to reject) a null hypothesis that should be rejected