Stat 3

    • Measures of Central Tendency:
      • Mean: arithmetic average calculated by summing up all values and dividing by the number of observations
      • Median: middle value of a dataset when arranged in ascending or descending order
      • Mode: most frequently occurring value in a dataset
    • Measures of Central Tendency provide a single, representative value that summarizes the center or typical value of a dataset
    • Common measures of central tendency are mean, median, and mode
    • Mean is sensitive to extreme values (outliers); it is a good measure when the data are roughly symmetric, as in a normal distribution
    • Median is less sensitive to extreme values and is a good measure when the data set has outliers
    • Mode is not affected by extreme values and can be used for numerical or categorical data
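To make the effect of an outlier concrete, here is a minimal sketch using Python's standard `statistics` module. The `scores` and `colors` values are made up for illustration and are not from the deck.

```python
from statistics import mean, median, mode

# Hypothetical sample; the second list adds a single extreme value.
scores = [70, 72, 75, 75, 78]
with_outlier = scores + [200]

# The mean is pulled toward the outlier; the median and mode are not.
print(mean(scores), median(scores), mode(scores))
print(mean(with_outlier), median(with_outlier), mode(with_outlier))

# The mode also works for categorical data.
colors = ["red", "blue", "red", "green"]
print(mode(colors))
```

In this sample the mean moves from 74 to 95 once the value 200 is added, while the median and mode both stay at 75.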
    • Measures of Variation:
      • Variance: average of the squared deviations of the values from the mean (a sample variance divides the sum by n − 1 instead of n)
      • Standard Deviation: square root of the variance
    • Measures of Variation provide information on the spread, consistency, variability, or dispersion of the data values
    • Common measures of variation include range, variance, and standard deviation
    • Range is the simplest measure of variation, calculated as the difference between the largest and smallest values
    • Variance and Standard Deviation both show variation about the mean; the standard deviation has the same units as the original data, while the variance is in squared units
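The three measures can be computed side by side. This is a small sketch with made-up data, using the standard `statistics` module; `pvariance`/`pstdev` divide by n (population), while `variance`/`stdev` divide by n − 1 (sample).

```python
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)   # largest value minus smallest value
pop_var = pvariance(data)            # average squared deviation (divide by n)
pop_sd = pstdev(data)                # square root of the population variance
sample_var = variance(data)          # divide by n - 1 instead of n
sample_sd = stdev(data)              # square root of the sample variance

print(data_range, pop_var, pop_sd)
print(sample_var, sample_sd)
```

Here the squared deviations sum to 32, so the population variance is 32 / 8 = 4 and the population standard deviation is 2.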
    • Measures of Relative Position:
      • Quartiles: split the ranked data into 4 segments with an equal number of values per segment
      • Percentiles: split the ranked data into 100 segments with an equal number of values per segment
    • Measures of Relative Position describe the location of a particular data point within a dataset relative to other data points
    • Quartiles include Q1, Q2 (median), and Q3, with the Interquartile Range (IQR) measuring the spread in the middle 50% of the data
    • The p-th percentile is the value below which p% of the data fall
    • A box-and-whisker plot (boxplot) visually displays the distribution of a dataset, highlighting the five-number summary and any potential outliers
    • The five-number summary is the smallest value, Q1, median (Q2), Q3, and largest value; together these show the shape of the distribution
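As a rough illustration, the five-number summary behind a boxplot can be computed with `statistics.quantiles`. The data are made up, and the `method="inclusive"` setting is an assumption: textbooks and software use several slightly different quartile conventions, so results may differ from a hand calculation.

```python
from statistics import quantiles

# Hypothetical ranked data.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 returns the three quartile cut points Q1, Q2 (median), Q3.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
five_number_summary = (min(data), q1, q2, q3, max(data))
print(five_number_summary)  # (1, 3.0, 5.0, 7.0, 9)

# n=100 returns the 99 percentile cut points; index 89 is the 90th percentile.
percentile_cuts = quantiles(data, n=100, method="inclusive")
p90 = percentile_cuts[89]
print(p90)
```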
    • Shape of a Distribution
      • Describes how the data are distributed
      • Two useful shape-related statistics are:
        • Skewness: measures the extent to which the data values are not symmetrical
        • Kurtosis: measures the peakedness of the distribution curve, that is, how sharply the curve rises approaching the center of the distribution
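The deck does not give formulas for these two statistics. The sketch below assumes the common moment-based definitions (third and fourth central moments scaled by powers of the second); the function names and data are hypothetical, and statistical packages often apply additional sample-size corrections.

```python
from statistics import fmean

def central_moment(values, k):
    """Average of (x - mean)**k over the data."""
    m = fmean(values)
    return fmean([(x - m) ** k for x in values])

def skewness(values):
    # Zero for symmetric data, positive when the right tail is longer.
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    # Subtracting 3 makes a normal distribution score about zero.
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 10]

print(skewness(symmetric), skewness(right_skewed))
print(excess_kurtosis(symmetric))
```

The symmetric list has skewness 0, while the list with one large value on the right has positive skewness.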
    • Outlier: An observation that lies far from most of the others in a sample or population
    • Why The Range Can Be Misleading
      • Does not account for how the data are distributed
      • Sensitive to outliers
    • Coefficient of Variation (CV)
      • Measures relative variation
      • Always expressed as a percentage (%)
      • Shows variation relative to the mean: CV = (standard deviation / mean) × 100%
      • Can be used to compare the variability of two or more sets of data measured in different units
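A short sketch of comparing two data sets measured in different units. The heights and weights are invented, `coefficient_of_variation` is a hypothetical helper, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

# Hypothetical measurements in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [50, 60, 70, 80, 90]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)
print(f"heights: {cv_heights:.2f}%  weights: {cv_weights:.2f}%")
```

Because the CV is unit-free, the two percentages can be compared directly; in this made-up sample the weights vary more relative to their mean than the heights do.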
    • Quartile Measures: The Interquartile Range (IQR)
      • IQR = Q3 − Q1
      • The IQR measures the spread in the middle 50% of the data
      • The IQR is also called the midspread because it covers the middle 50% of the data
      • The IQR is a measure of variability that is not influenced by outliers or extreme values
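To see why the IQR is more robust than the range, the sketch below replaces the largest value of a made-up data set with an extreme one. The `iqr` helper is hypothetical, and the `method="inclusive"` quartile convention is an assumption.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range: Q3 minus Q1."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Hypothetical data; the second list swaps the largest value for an outlier.
base = [1, 2, 3, 4, 5, 6, 7, 8, 9]
stretched = [1, 2, 3, 4, 5, 6, 7, 8, 90]

range_base = max(base) - min(base)
range_stretched = max(stretched) - min(stretched)

print(range_base, range_stretched)   # the range jumps with the outlier
print(iqr(base), iqr(stretched))     # the IQR is unchanged
```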
    • Shape of Boxplots
      • If the data are symmetric around the median, the box and central line are centered between the endpoints
      • A boxplot can be shown in either a vertical or a horizontal orientation