Stat 3

  • Measures of Central Tendency:
    • Mean: arithmetic average calculated by summing up all values and dividing by the number of observations
    • Median: middle value of a dataset when arranged in ascending or descending order (the average of the two middle values when the number of observations is even)
    • Mode: most frequently occurring value in a dataset
  • Measures of Central Tendency provide a single, representative value that summarizes the center or typical value of a dataset
  • Common measures of central tendency are mean, median, and mode
  • Mean is affected by extreme values (outliers) and is a good measure when the data are roughly symmetric, for example when the data set is normally distributed
  • Median is less sensitive to extreme values and is a good measure when the data set has outliers
  • Mode is not affected by extreme values and can be used for numerical or categorical data
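As a rough illustration of the cards above, the sketch below computes the three measures with Python's standard `statistics` module. The data values and variable names are made up for this example; they are not from the notes.

```python
import statistics

# Hypothetical data (not from the notes): five typical values plus one outlier.
typical = [30, 32, 32, 35, 38]
with_outlier = typical + [200]

mean_typical = statistics.mean(typical)            # 33.4
median_typical = statistics.median(typical)        # 32
mode_typical = statistics.mode(typical)            # 32

mean_outlier = statistics.mean(with_outlier)       # about 61.17, pulled upward
median_outlier = statistics.median(with_outlier)   # 33.5, barely moves
mode_outlier = statistics.mode(with_outlier)       # 32, unchanged

# The mode also works for categorical data.
favorite_color = statistics.mode(["red", "blue", "red", "green"])  # "red"

print(mean_typical, median_typical, mode_typical)
print(round(mean_outlier, 2), median_outlier, mode_outlier)
print(favorite_color)
```

One extreme value nearly doubles the mean here, while the median shifts only slightly and the mode does not change.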
  • Measures of Variation:
    • Variance: average of the squared deviations of values from the mean (a sample variance divides by n − 1 rather than n)
    • Standard Deviation: square root of the variance
  • Measures of Variation provide information on the spread, consistency, variability, or dispersion of the data values
  • Common measures of variation include range, variance, and standard deviation
  • Range is the simplest measure of variation, calculated as the difference between the largest and smallest values
  • Variance and Standard Deviation show variation about the mean; the standard deviation has the same units as the original data, while the variance is in squared units
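A minimal sketch of the three measures of variation, assuming a small made-up data set. The choice between the population functions (divide by n) and the sample functions (divide by n − 1) is an assumption of this example; the notes do not say which one is intended.

```python
import statistics

# Hypothetical measurements (not from the notes); their mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: largest value minus smallest value.
data_range = max(data) - min(data)            # 7

# Population versions: average of squared deviations, then its square root.
pop_variance = statistics.pvariance(data)     # 32 / 8 = 4
pop_std_dev = statistics.pstdev(data)         # 2.0

# Sample versions divide the same sum of squares by n - 1 instead of n.
sample_variance = statistics.variance(data)   # 32 / 7, about 4.57
sample_std_dev = statistics.stdev(data)       # about 2.14

print(data_range, pop_variance, pop_std_dev)
print(round(sample_variance, 2), round(sample_std_dev, 2))
```

If the data were in centimetres, the standard deviation would be in centimetres and the variance in square centimetres.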
  • Measures of Relative Position:
    • Quartiles: split the ranked data into 4 segments with an equal number of values per segment
    • Percentiles: split the ranked data into 100 segments, each holding about 1% of the values
  • Measures of Relative Position describe the location of a particular data point within a dataset relative to other data points
  • Quartiles include Q1, Q2 (median), and Q3, with the Interquartile Range (IQR) measuring the spread in the middle 50% of the data
  • Percentiles divide the data into 100 equal parts, indicating the point below which a certain percentage of data falls
  • A box-and-whisker plot (boxplot) visually displays the distribution of a dataset, highlighting the five-number summary and any potential outliers
  • A boxplot shows the smallest value, Q1, median (Q2), Q3, and largest value, which together indicate the shape of the distribution
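The quartiles, a percentile, and the five-number summary behind a boxplot can be sketched as follows. The data are made up, and `method="inclusive"` is an assumption of this example: textbooks and software use several slightly different quartile conventions, so other tools may report somewhat different values.

```python
import statistics

# Hypothetical data (not from the notes), ranked before locating positions.
data = sorted([9, 1, 17, 5, 13, 3, 11, 7, 15])   # 1, 3, 5, ..., 17

# n=4 returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                                     # spread of the middle 50%

# The five values a boxplot draws.
five_number_summary = (min(data), q1, q2, q3, max(data))

# n=100 returns 99 cut points; index 89 is the 90th percentile.
percentiles = statistics.quantiles(data, n=100, method="inclusive")
p90 = percentiles[89]

print(five_number_summary)   # (1, 5.0, 9.0, 13.0, 17)
print(iqr, p90)
```

Q2 coincides with the median, and about 90% of the values fall below `p90`.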
  • Shape of a Distribution
    • Describes how data are distributed
    • Two useful shape-related statistics are:
      • Skewness: measures the extent to which data values are not symmetrical
      • Kurtosis: measures the peakedness of the curve of the distribution, that is, how sharply the curve rises approaching the center of the distribution
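The notes give no formulas for these two statistics. The sketch below assumes the common moment-based definitions (third and fourth central moments divided by powers of the population variance); statistical packages often apply small-sample corrections, so their results can differ. The function names and data are hypothetical.

```python
import statistics

def central_moment(values, k):
    """Average of (x - mean)**k over all values."""
    mean = statistics.fmean(values)
    return sum((x - mean) ** k for x in values) / len(values)

def skewness(values):
    """Moment-based skewness: 0 for symmetric data, > 0 for a long right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal curve scores about 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 3, 4, 20]
left_skewed = [-20, 2, 3, 4, 5]

print(skewness(symmetric))        # 0.0
print(skewness(right_skewed))     # positive
print(skewness(left_skewed))      # negative
print(excess_kurtosis(symmetric)) # about -1.3, flatter than a normal curve
```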
  • Outlier: An observation that lies far from most of the others in a sample or population
  • Measures of Variation
    • Provide information on the spread, consistency, variability, or dispersion of the data values
  • Why The Range Can Be Misleading
    • Does not account for how the data are distributed
    • Sensitive to outliers
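Both weaknesses of the range can be seen in a small sketch. All data sets below are invented for illustration, and the population standard deviation is used only as a convenient comparison.

```python
import statistics

def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

# Same range, very different distributions.
evenly_spread = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
clustered = [1, 5, 5, 5, 5, 5, 5, 5, 5, 10]

print(value_range(evenly_spread), value_range(clustered))   # 9 9
print(round(statistics.pstdev(evenly_spread), 2))           # 2.87
print(round(statistics.pstdev(clustered), 2))               # 2.02

# A single outlier changes the range drastically.
tight = [5, 5, 5, 5, 6]
tight_with_outlier = tight + [50]
print(value_range(tight), value_range(tight_with_outlier))  # 1 45
```

The range uses only the two extreme values, so it cannot tell the two distributions apart and it jumps as soon as one extreme value appears.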
  • Coefficient of Variation
    • Measures relative variation
    • Always expressed as a percentage (%)
    • Shows variation relative to the mean
    • Can be used to compare the variability of two or more sets of data measured in different units
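A minimal sketch of the coefficient of variation, assuming the usual definition CV = (standard deviation / mean) × 100%. Using the sample standard deviation is an assumption of this example, and the heights and weights are made up.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.fmean(values) * 100

# Hypothetical measurements in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [50, 60, 70, 80, 90]

cv_height = coefficient_of_variation(heights_cm)   # about 4.65 (%)
cv_weight = coefficient_of_variation(weights_kg)   # about 22.59 (%)

# Rescaling the units leaves the CV unchanged (up to rounding).
heights_m = [h / 100 for h in heights_cm]
cv_height_m = coefficient_of_variation(heights_m)

print(round(cv_height, 2), round(cv_weight, 2), round(cv_height_m, 2))
```

Because the CV is unit-free, it is meaningful to say that the weights vary more, relative to their mean, than the heights do.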
  • Quartile Measures:
    • The Interquartile Range (IQR)
      • The IQR measures the spread in the middle 50% of the data
      • The IQR is also called the midspread because it covers the middle 50% of the data
      • The IQR is a measure of variability that is not influenced by outliers or extreme values
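The resistance of the IQR to extreme values can be checked with a short sketch. It assumes IQR = Q3 − Q1 with the `"inclusive"` quartile convention of Python's `statistics` module; the helper name and data are hypothetical.

```python
import statistics

def interquartile_range(values):
    """Q3 minus Q1, using the 'inclusive' quartile convention (an assumption)."""
    q1, _, q3 = statistics.quantiles(sorted(values), n=4, method="inclusive")
    return q3 - q1

# Hypothetical data: the second list replaces the largest value with an outlier.
original = [1, 3, 5, 7, 9, 11, 13, 15, 17]
with_outlier = [1, 3, 5, 7, 9, 11, 13, 15, 170]

print(interquartile_range(original), max(original) - min(original))          # 8.0 16
print(interquartile_range(with_outlier), max(with_outlier) - min(with_outlier))  # 8.0 169
```

The range grows from 16 to 169, while the IQR stays at 8 because Q1 and Q3 depend only on the middle of the ranked data.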
  • Shape of Boxplots
    • If data are symmetric around the median, then the box and central line are centered between the endpoints
    • A Boxplot can be shown in either a vertical or horizontal orientation