data analysis

  • data analysis helps us to summarise and analyse numerical data in order to draw meaningful conclusions
    this can be done by:
    • graphs
    • measures of central tendency
    • measures of dispersion
  • the measures of central tendency are the mean, median, and mode
  • mean - add up all the numbers in a data set and divide by how many numbers there are in the data set
    advantages - most sensitive measure because it makes use of all the values
    disadvantages - can be distorted, as it is affected by extreme values (very high/low)
  • median - the middle number in the data set once the values are put in order (with an even number of values, the mean of the two middle values)
    advantages - not affected by extreme values
    disadvantages - doesn't take into account the precise value of each observation and doesn't use all the data available
  • mode - most common value in the data set
    advantages - useful when the data is collected in categories e.g. food, colour.
    disadvantages - doesn't consider all the values in the data set, there can be more than one mode
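
A quick sketch of the three measures of central tendency using Python's built-in `statistics` module. The scores are made up for illustration (they are not from any study), and 70 is included as a deliberate extreme value to show how the mean gets distorted while the median and mode do not.

```python
from statistics import mean, median, multimode

# Made-up scores for illustration; 70 is a deliberate extreme value.
scores = [2, 3, 3, 5, 7, 8, 70]

mean_score = mean(scores)      # add up all values, divide by how many there are -> 14
median_score = median(scores)  # middle value once the data is in order -> 5
modes = multimode(scores)      # list of the most common value(s) -> [3]

# Same data without the extreme value, to compare.
trimmed = [s for s in scores if s != 70]
trimmed_mean = mean(trimmed)      # 28 / 6, roughly 4.67
trimmed_median = median(trimmed)  # mean of the two middle values (3 and 5) -> 4.0

print(mean_score, median_score, modes)
print(round(trimmed_mean, 2), trimmed_median)
```

Removing the single extreme value pulls the mean from 14 down to about 4.67, while the median only moves from 5 to 4. `multimode` returns a list because there can be more than one mode.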
  • the measures of dispersion are the range and the standard deviation
  • range - the difference between the largest and smallest value
    advantages - it's easy to calculate and provides a quick understanding of the spread of the data set
    disadvantages - only takes into account the 2 most extreme values so can be unrepresentative
  • standard deviation - measures the variability (spread) of scores around the mean. the higher the standard deviation the greater the diversity of scores
  • low standard deviation will have a "thin" curve on a graph, this means all the scores are relatively close to the mean average
  • high standard deviation will have a "fat" curve on a graph, meaning there is more of a spread of scores that are further away from the mean average
  • a high standard deviation can be a problem because it means the scores are inconsistent and widely spread around the mean, so the mean is less representative of the group
    this could be because the participants weren't all affected by the independent variable in the same way or didn't understand the experiment
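
A minimal sketch of the two measures of dispersion, assuming two made-up groups of scores with the same mean. `value_range` is a hypothetical helper name, and `pstdev` (the population standard deviation) is an assumption here; the notes don't say whether the population or sample version is meant, and `statistics.stdev` would give the sample version.

```python
from statistics import mean, pstdev

# Made-up groups for illustration: same mean, very different spread.
group_a = [48, 49, 50, 51, 52]  # scores close to the mean ("thin" curve)
group_b = [10, 30, 50, 70, 90]  # scores far from the mean ("fat" curve)


def value_range(values):
    """Range: difference between the largest and smallest value."""
    return max(values) - min(values)


for name, group in [("A", group_a), ("B", group_b)]:
    print(
        name,
        "mean:", mean(group),
        "range:", value_range(group),
        "standard deviation:", round(pstdev(group), 2),
    )
```

Both groups have a mean of 50, but group B has a much larger range and standard deviation, so its mean says less about any individual score.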
  • normal distribution - an arrangement of data that is symmetrical and forms a bell-shaped curve, with the mean, median and mode all falling at the same central point. 50% of the values are below this point and 50% are above it
  • skewed distribution - in some populations the data is not evenly spread, so scores cluster at one end of the graph
    positive skew - scores cluster at the low end with a long tail to the right, so the mean is higher than the mode
    negative skew - scores cluster at the high end with a long tail to the left, so the mode is higher than the mean
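
The mean-versus-mode rule for skew can be sketched as below. `skew_direction` is a hypothetical helper and the data sets are made up; it only applies the rule of thumb from these notes and is not a formal skewness statistic.

```python
from statistics import mean, mode


def skew_direction(values):
    """Apply the rule of thumb: compare the mean with the mode."""
    m, mo = mean(values), mode(values)
    if m > mo:
        return "positive"  # mean pulled up by a tail of high scores
    if m < mo:
        return "negative"  # mean pulled down by a tail of low scores
    return "none"          # mean and mode match, as in a symmetrical distribution


# Made-up data sets for illustration.
low_cluster = [1, 2, 2, 2, 3, 3, 4, 9, 15]          # most scores low, a few high
high_cluster = [1, 7, 12, 13, 13, 14, 14, 14, 15]   # most scores high, a few low
symmetrical = [1, 2, 3, 3, 3, 4, 5]

print(skew_direction(low_cluster))   # positive
print(skew_direction(high_cluster))  # negative
print(skew_direction(symmetrical))   # none
```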
  • table of raw data - data that has just been collected and has not yet been analysed, so it is just numbers with no visible pattern. every graph is created from a table of raw data
  • summary table - raw scores that have been converted into descriptive statistics. for example:
    • mean, mode, median (central tendency)
    • range (dispersion)
  • summary paragraph - states what the numbers show and potential meanings behind them
  • types of graph
    • scattergraph - shows the relationship between 2 variables
    • bar chart - used for comparing two or more values, represents multiple different categories (discrete data)
    • histogram - used for continuous data, shows how often scores fall into each interval of a single variable
    • line graph - shows a trend over time (e.g. hours/days/years)