Descriptive Statistics:

Cards (40)

  • What is the focus of descriptive statistics?
    Descriptive statistics focuses on summarizing and describing data.
  • What is a population in statistics?
    A population is the entire group of people we’re interested in.
  • What is a sample in statistics?
    A sample is a subset of our population.
  • How is a sample usually represented in statistics?
    A sample is usually described by its size, which is written with the letter n.
  • What does n represent in statistics?
    n represents our sample size.
  • What types of data are there in statistics?
    There are categorical, discrete, and continuous data.
  • What is categorical data?
    Categorical data falls into two or more categories that have no inherent order.
  • Give an example of categorical data.
    Hair colour is an example of categorical data.
  • What is discrete data?
    Discrete data takes only distinct, separate values (it cannot fall between them), and those values have a logical order.
  • Give an example of discrete data.
    Shoe size is an example of discrete data.
  • What is continuous data?
    Continuous data can take any value within a range, including fractional values.
  • Give an example of continuous data.
    Reaction times are an example of continuous data.
  • How can categorical data be presented?
    Categorical data can be presented as its raw frequency or as a percentage frequency.
  • How can discrete data be presented?
    Discrete data can be presented as a raw frequency, percentage, or cumulative frequency.
  • What should you do if you have many values in your data?
    You should use frequency ranges to present the data instead.
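The three presentations of discrete data (raw, percentage, and cumulative frequency) and the use of frequency ranges can be sketched in Python. This is a minimal illustration only: the function names `frequency_table` and `binned_counts`, the shoe sizes, and the ages are all made up for the example.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, raw count, percentage, cumulative count)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    running = 0  # cumulative frequency so far
    for value in sorted(counts):
        running += counts[value]
        rows.append((value, counts[value], 100 * counts[value] / total, running))
    return rows

def binned_counts(values, width):
    """Count values falling in each range [start, start + width)."""
    bins = Counter((v // width) * width for v in values)
    return {(start, start + width): bins[start] for start in sorted(bins)}

shoe_sizes = [6, 7, 7, 8, 8, 8, 9, 9, 10, 7]  # hypothetical data
for row in frequency_table(shoe_sizes):
    print(row)

ages = [21, 25, 34, 38, 39, 47]  # hypothetical data with many distinct values
print(binned_counts(ages, 10))
```

Grouping into ranges trades detail for readability, so the chosen width (10 here) is a judgment call rather than a fixed rule.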
  • What are the measures of central tendency?
    1. Mode – the score that occurs most often in a dataset
    2. Median – the middle score in a dataset
    3. Mean – the sum of data points divided by the number of data points
  • What is the mode in a dataset?
    The mode is the score that occurs most often in a dataset.
  • For what type of data can the mode be used?
    The mode can be used for nominal data.
  • What are bimodal and multimodal distributions?
    Bimodal distributions have two modes, while multimodal distributions have more than two modes.
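As a small sketch of finding one or more modes, Python's standard `statistics.multimode` (available from Python 3.8) returns every value tied for the highest count. The hair-colour list is invented for the example.

```python
from statistics import multimode

# Hypothetical nominal data: two colours are tied for most frequent.
hair = ["brown", "black", "brown", "blonde", "black", "red"]

modes = multimode(hair)  # all values sharing the highest count
print(modes)

if len(modes) == 1:
    print("unimodal")
elif len(modes) == 2:
    print("bimodal")
else:
    print("multimodal")
```

Because the mode only counts occurrences, it works on labels such as hair colour that cannot be ordered or averaged.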
  • What is the median in a dataset?
    The median is the middle score in a dataset.
  • How is the median calculated for a dataset with an odd number of values?
    Sort the data; the median is the value at position (n+1)/2.
  • How is the median calculated for a dataset with an even number of values?
    Sort the data; the median is the average of the middle two values.
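The odd and even cases can be sketched as one short function. The name `median_of` and the sample lists are illustrative assumptions; Python's standard `statistics.median` does the same job.

```python
def median_of(values):
    """Median of a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median_of() needs at least one value")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2 in 1-based counting is index n // 2.
        return ordered[mid]
    # Even n: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of([3, 1, 7, 5, 9]))  # odd number of values
print(median_of([3, 1, 7, 5]))     # even number of values
```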
  • What is an advantage of using the median?
    The median is insensitive to outliers.
  • Why is the median often meaningful?
    The median often gives a real, meaningful data value.
  • For what types of data is the median useful?
    The median is useful for ordinal data and skewed interval/ratio data.
  • What is a disadvantage of using the median?
    The median ignores a lot of the data.
  • What is a challenge when calculating the median?
    It can be hard to calculate the median of a large dataset without a computer, because the data must first be sorted.
  • For what type of data can't the median be used?
    The median cannot be used with nominal data.
  • How is the mean calculated?
    The mean is calculated as the sum of data points divided by the number of data points.
  • What is an advantage of using the mean?
    The mean uses all the data.
  • For what type of datasets is the mean most effective?
    The mean is most effective for normally distributed datasets.
  • What is a disadvantage of using the mean?
    The mean is sensitive to outliers.
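A quick way to see the difference in sensitivity to outliers is to add one extreme value and compare the two measures. The reaction times below (in milliseconds) are made up for the illustration.

```python
from statistics import mean, median

times = [310, 295, 305, 300, 290]   # hypothetical reaction times in ms
with_outlier = times + [2000]       # one very slow trial added

print(mean(times), median(times))                # both measures agree here
print(mean(with_outlier), median(with_outlier))  # the mean jumps, the median barely moves
```

In this sketch the single slow trial nearly doubles the mean, while the median shifts by only a few milliseconds.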
  • Why might values of the mean not always be meaningful?
    Values of the mean aren’t always meaningful, such as a mean of 6.74/10 when we cannot actually get a score of 6.74.
  • For what types of data is the mean meaningful?
    The mean is only meaningful for ratio and interval data.
  • What are the measures of spread related to central tendency?
    • Mode: no measures of spread
    • Median: distance-based measures (range + interquartile range)
    • Mean: center-based measures of spread (variance + standard deviation)
  • How does the interquartile range compare to the range?
    The interquartile range is similar to the range but ignores the most extreme values.
  • What does the interquartile range represent?
    The interquartile range represents the range of scores within the middle 50% of the scores.
  • How is the lower quartile defined?
    The lower quartile is the median of the lower half of the data.
  • How is the interquartile range calculated?
    The interquartile range is calculated as the upper quartile minus the lower quartile.
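The quartile definitions above can be sketched as follows. The sketch assumes one common convention: when the number of scores is odd, the middle score is left out of both halves. Other conventions exist and can give slightly different quartiles. The function name `quartiles` and the scores are hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (lower quartile, upper quartile) as medians of each half."""
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("quartiles() needs at least two values")
    half = len(ordered) // 2
    lower_half = ordered[:half]                  # scores below the median
    upper_half = ordered[len(ordered) - half:]   # scores above the median
    return median(lower_half), median(upper_half)

scores = [2, 4, 4, 5, 7, 8, 9, 10, 12]  # hypothetical data
q1, q3 = quartiles(scores)
iqr = q3 - q1                            # upper quartile minus lower quartile
data_range = max(scores) - min(scores)
print(q1, q3, iqr, data_range)
```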
  • What are deviance and variance in statistics?
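    • Deviance: The difference between a score and the mean. A score equal to the mean has a deviance of ‘0’, and the deviances of all scores always sum to 0.
    • Variance: The average of the sum of squared errors (SS), where SS is the sum of the squared deviances. For a sample, SS is usually divided by n − 1 rather than n.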
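The chain from deviance to variance can be worked through on a small made-up dataset. The sketch takes deviance as score minus mean and shows both divisors for the variance, n for a whole population and n − 1 for a sample; which one applies depends on the context.

```python
from math import sqrt

scores = [2, 4, 4, 5, 10]                 # hypothetical data
n = len(scores)
mean_score = sum(scores) / n

deviances = [x - mean_score for x in scores]   # score minus mean
ss = sum(d ** 2 for d in deviances)            # sum of squared errors (SS)

population_variance = ss / n        # divide by n for a whole population
sample_variance = ss / (n - 1)      # divide by n - 1 for a sample
standard_deviation = sqrt(sample_variance)

print(deviances, sum(deviances))
print(ss, population_variance, sample_variance, standard_deviation)
```

The deviances cancel out to zero when summed, which is why they are squared before being averaged.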