Descriptive stats

Cards (24)

  • What is a population
    An entire group of people we are interested in
  • What is a sample
    A subset of our population
  • How is sample size normally represented
    Represented with n
  • Describe categorical data
    Data sorted into categories that have no numerical relationship or hierarchy, e.g. eye colour
  • Describe what is meant by discrete data
    Ordinal, interval or ratio data that can take only fixed values in a logical order, e.g. rating happiness levels on a scale from 1 to 10
  • Describe what is meant by continuous data
    Usually ratio or interval data that can take any fractional value, e.g. reaction times
  • How is the median found in a dataset with an odd number of values
    Put the scores in order; the median is the score at position (n+1)/2
  • Pros of using the median
    Insensitive to outliers
    Often gives a real, meaningful data value
    Useful for ordinal data, and for skewed interval/ratio data
  • Cons of using the median
    Ignores a lot of the data
    Difficult to calculate without a computer
    Can't be used with nominal data
  • Pros of using the mean
    uses all the data
    is most effective for normally distributed datasets
  • Cons of using the mean
    sensitive to outliers
    values are not always meaningful
    only meaningful for ratio or interval data
  • Describe the interquartile range
    The range covered by the middle 50% of scores
  • How to calculate the interquartile range
    Upper quartile (Q3) - lower quartile (Q1)
  • Describe what deviance is
    The difference between each score and the mean (score - mean)
    A score equal to the mean has a deviance of 0, and the deviances across a dataset sum to 0
  • What is meant by the sum of squared errors (SS)
    Each deviance is squared and the squared deviances are summed
    More data points = bigger SS
  • What is meant by variance
    The average squared deviance: the sum of squares divided by the number of scores (N), or by n - 1 when estimating from a sample
  • Pros of variance
    uses all the data
    forms the basis of several other tests/statistics
  • cons of variance
    most meaningful for normally distributed data
    sensitive to outliers
    units are squared, so they are hard to interpret
  • What is meant by standard deviation
    A measure of spread expressed in the same units as the dependent variable
  • How is SD calculated
    By taking the square root of the variance
  • When can we use SDs
    To describe the spread of a population, or to estimate the SD of a population based on a sample
  • Why do we use SDs
    Allows us to get an unbiased estimate of the population SD if we only have access to a sample of the data
  • How to calculate the variance
    Find the mean of the sample, subtract the mean from each data point, square each result, and then find the average of the squared values.
  • How to calculate the standard deviation
    Take the square root of the variance
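
Worked example

The calculations described on the cards (median position, deviance, sum of squares, variance, SD and interquartile range) can be sketched in Python as below. This is only an illustration: the reaction-time scores are made up, and the quartiles come from the standard library's statistics.quantiles with its default method, which may differ slightly from the quartile convention used in a given course or package.

```python
import math
import statistics

# Hypothetical reaction times in ms (illustrative data, not from the cards)
scores = [310, 295, 420, 305, 330, 298, 315]
n = len(scores)

# Median for an odd n: the score at position (n + 1) / 2 in the ordered data
ordered = sorted(scores)
median_position = (n + 1) // 2          # 1-based position
median = ordered[median_position - 1]   # 310 for this data

# Mean: uses every score, so the outlier (420) pulls it upwards
mean = sum(scores) / n

# Deviance: each score minus the mean; the deviances sum to (about) 0
deviances = [x - mean for x in scores]

# Sum of squared errors (SS): square each deviance, then add them up
ss = sum(d ** 2 for d in deviances)

# Variance: SS averaged over N (population) or n - 1 (estimate from a sample)
population_variance = ss / n
sample_variance = ss / (n - 1)

# Standard deviation: square root of the variance, back in the original units
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

# Interquartile range: upper quartile (Q3) minus lower quartile (Q1)
q1, _, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1

print(f"median={median}, mean={mean:.1f}")
print(f"SS={ss:.1f}, sample variance={sample_variance:.1f}, sample SD={sample_sd:.1f}")
print(f"IQR={iqr}")
```

Dividing by n - 1 rather than n is the usual choice when the scores are a sample and the aim is to estimate the population value; the statistics module exposes both versions as pvariance/pstdev (population) and variance/stdev (sample).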