Data handling and analysis

  • What is Qualitative Data?
    Data that is expressed in words and is non-numerical
  • What is Quantitative Data?
    Data that is numerical
  • What are qualitative methods of data collection?
    Interviews or unstructured observations
  • Quantitative methods of data collection
    Questionnaires with closed questions, or structured observations
  • What is Primary data?
    Information that has been obtained first hand by a researcher for the purposes of a research project
  • What is Secondary Data?
    Information that has already been collected by someone else
  • Examples of primary data
    Questionnaires, interviews or observations
  • Examples of secondary data
    Journal articles, books or websites
  • AO3 Qualitative Data
    • Qualitative data offers more detail than quantitative data. It is much broader in scope and gives the participant (PP) the opportunity to report their thoughts and opinions fully on a given subject
    • Tends to have greater external validity as it provides the researcher with more meaningful insight
    • Qualitative data is often difficult to analyse as it can't be summarised statistically, so patterns and comparisons within and between data sets may be hard to identify
    • Conclusions often rely on the subjective interpretations of the researcher and these may be subject to bias
  • AO3 Quantitative Data
    • Quantitative data is relatively simple to analyse, so comparisons between groups can be drawn easily
    • Data in numerical form tends to be more objective and less open to bias
    • Quantitative data is much narrower in meaning and detail than qualitative data so may fail to represent ‘real life’.
  • AO3 Primary Data
    • The main strength of primary data is that it is authentic data obtained from the PPs themselves for the purpose of a particular investigation
    • Questionnaires and interviews can be designed in such a way that they specifically target the information that the researcher requires
    • Producing primary data requires time and effort on the part of the researcher
    • Conducting an experiment, for example, requires considerable planning, preparation and resources, whereas secondary data may be accessed within a matter of minutes
  • AO3 Secondary Data
    • Secondary data may be inexpensive and easily accessed, requiring minimal effort
    • When examining secondary data the researcher may find that the desired information already exists, so there is no need to conduct primary data collection
    • There may be substantial variation in the quality and accuracy of secondary data: information may appear to be valuable but turn out to be outdated or incomplete
    • The content of the data may not quite match the researcher’s objectives so this may challenge the validity of any conclusions
  • Meta Analysis
    • The process of combining the findings from a number of studies on a particular topic, with the aim of producing an overall statistical conclusion based on a range of studies.
  • AO3 Meta Analysis
    • A form of research method that uses secondary data is meta-analysis
    • This refers to a process in which a number of studies are identified which have investigated the same aims
    • The results of these studies can be pooled together and a joint conclusion produced
    • Meta-analysis allows us to create a larger, more varied sample, and results can then be generalised across much larger populations, increasing validity
    • Meta-analysis may be prone to publication bias, as researchers may not select all relevant studies, choosing to leave out those with negative or non-significant results
  • Descriptive Statistics
    The use of graphs, tables and summary statistics to identify trends and analyse sets of data
  • Measures of central tendency
    Any measure of the average value in a set of data, e.g. the mean, median and mode
  • Mean
    Adding up all the values and dividing by the number of values in the data set
  • Median
    Ranking all the data values from lowest to highest and finding the middle value (with an even number of values, the mean of the two middle values)
  • Mode
    The most frequently occurring value in a set of data
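The three measures of central tendency defined above can be sketched with Python's standard `statistics` module. The scores below are hypothetical, made up purely for illustration.

```python
from statistics import mean, median, multimode

# Hypothetical test scores, invented for illustration only.
scores = [4, 7, 7, 8, 9, 10, 11]

mean_score = mean(scores)      # sum of the values divided by how many values there are
median_score = median(scores)  # middle value once the scores are ranked
modes = multimode(scores)      # list of the most frequently occurring value(s)

print("mean:", mean_score)
print("median:", median_score)
print("mode(s):", modes)
```

`multimode` returns a list, so a data set with several modes shows up as a list with more than one entry.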
  • AO3: The Mean
    • The mean is the most sensitive of the measures of central tendency as it includes all of the values in the data set within the calculation
    • This means it is more representative of the data as a whole
    • However, the mean is easily distorted by extreme values, in which case it no longer represents the data as a whole
  • AO3: The Median
    • The strength of the median is that extreme scores do not affect it
    • It is also easy to calculate (once you have arranged the numbers in order)
    • It is less sensitive than the mean as the actual values of lower and higher numbers are ignored and extreme values may be important
  • AO3: The Mode
    • Although the mode is easy to calculate it is not that helpful
    • When a data set has several modes, the mode is not a very useful piece of information
    • For some data, such as data in categories, the mode is the only measure you can use
    • For example, if you asked your class to list their favourite dessert, the only way to identify the most ‘typical’ or average value would be to select the modal group
  • Study Tip to know which measure of central tendency to use
    • If you have to decide which measure of central tendency should be used with a particular set of data, consider whether there are any extreme scores, i.e. scores that are significantly lower or higher than the others
    • If there are no extreme scores then the mean is the best option as it is the most sensitive measure of the three
    • However, if there is an extreme score, the median is most suitable as the mean would become distorted
    • The mode is never the best option, except if the data is in categories
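The effect of an extreme score on the mean and the median can be seen in a small sketch. The reaction times are hypothetical, and the single value of 2000 is added deliberately as an extreme score.

```python
from statistics import mean, median

# Hypothetical reaction times in milliseconds, invented for illustration.
times = [310, 320, 330, 340, 350]
times_with_outlier = times + [2000]  # one extreme score added

# Without the extreme score the mean and median agree.
print("no outlier:   mean =", mean(times), "median =", median(times))

# With the extreme score the mean is dragged upwards; the median barely moves.
print("with outlier: mean =", round(mean(times_with_outlier), 1),
      "median =", median(times_with_outlier))
```

In this sketch the median shifts by only a few milliseconds while the mean nearly doubles, which is why the median is the safer choice when an extreme score is present.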
  • Measures of dispersion
    Term for any measure of spread or variation in a set of scores
  • Range
    A simple calculation of the dispersion in a set of scores which is worked out by subtracting the lowest score from the highest score and usually adding 1
  • Standard Deviation
    A sophisticated measure of dispersion in a set of scores which tells us how much, on average, scores deviate from the mean
  • AO3: The Range
    • The advantage of the range is that it is easy to calculate
    • It only takes into account the two most extreme values, and this may be unrepresentative of the data set as a whole
    • The range also does not indicate whether most numbers are closely grouped around the mean or spread out – whereas the standard deviation does show this aspect of dispersion
  • AO3: The Standard Deviation
    • The standard deviation is a single value that tells us how far scores deviate from the mean, so the larger the standard deviation, the greater the spread within a set of data
    • A large standard deviation shows that the data is widely spread, which may reflect a few anomalous results; a low standard deviation shows that the data is tightly clustered around the mean, which might imply that PPs responded in a similar way
    • The standard deviation is a much more precise measure of dispersion than the range as it includes all values within the calculation
    • But it can be distorted by extreme values
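The range and the standard deviation can be compared on two small data sets that share a mean but differ in spread. This is a minimal sketch: the scores are hypothetical, `score_range` is a helper written for this example, and `pstdev` (the population formula) is an assumption, since the cards do not say whether the population or sample formula is intended (`statistics.stdev` gives the sample version).

```python
from statistics import mean, pstdev

def score_range(scores, add_one=False):
    """Highest score minus lowest score, optionally adding 1 as the Range card describes."""
    spread = max(scores) - min(scores)
    return spread + 1 if add_one else spread

# Two hypothetical data sets with the same mean (10) but different spread.
clustered = [9, 10, 10, 10, 11]
spread_out = [2, 6, 10, 14, 18]

for name, data in [("clustered", clustered), ("spread out", spread_out)]:
    print(name,
          "| mean:", mean(data),
          "| range:", score_range(data),
          "| standard deviation:", round(pstdev(data), 2))
```

The range uses only the two most extreme scores, whereas the standard deviation uses every score, so it separates the tightly clustered set from the widely spread one more informatively.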
  • Scattergram
    A type of graph that represents the strength and direction of the relationship between co-variables in a correlational analysis
  • Bar Chart
    A type of graph in which the frequency of each category is represented by the height of the bars
  • How to draw a bar chart?
    • Bar charts are used when data is divided into categories
    • The categories occupy the horizontal x-axis
    • The frequency or amount of each category is plotted on the vertical y-axis
    • Bars are separated on a bar chart to denote that we are dealing with separate conditions
  • How to draw a histogram?
    • The bars touch each other which shows that x-axis data is continuous
    • The x-axis is made up of equal-sized intervals of a single category
    • The y-axis represents the frequency within each interval. If an interval has a frequency of zero, the interval remains on the axis but has no bar
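The histogram rules above (equal-sized intervals, frequency per interval, empty intervals kept) can be sketched as a simple text histogram. The scores and the helper name `bin_frequencies` are hypothetical, and values outside the chosen intervals are simply ignored in this sketch.

```python
from collections import Counter

def bin_frequencies(values, start, width, n_bins):
    """Count how many values fall in each equal-sized interval [low, low + width)."""
    counts = Counter((v - start) // width for v in values)
    return [(start + i * width, start + (i + 1) * width, counts.get(i, 0))
            for i in range(n_bins)]

# Hypothetical scores, invented for illustration; none fall between 20 and 29.
scores = [3, 7, 12, 14, 15, 18, 31, 36]
bins = bin_frequencies(scores, start=0, width=10, n_bins=4)

# Each row is one interval; an empty interval is still listed, just without a bar.
for low, high, freq in bins:
    print(f"{low:>2} to <{high:<2} | {'#' * freq}")
```

Keeping the empty interval in the output preserves the continuous scale on the x-axis, which is the point of the zero-frequency rule.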
  • When drawing a graph what are some main things you need?
    • Labelled x and y axes
    • A title
    • Correct, evenly spaced intervals on each axis
  • Normal Distribution
    • Normal distribution is a symmetrical, bell-shaped curve
    • Most people are located in the middle area of the curve with very few people at the extreme ends
    • The mean, median and mode all occupy the same midpoint of the curve
  • Positive Skew
    • Where most of the distribution is concentrated towards the left of the graph, resulting in a long tail on the right
    • The mode remains at the highest point of the peak
    • The median comes next
    • Last is the mean, which is dragged towards the tail on the right by the extreme high scores
  • Negative Skew
    • The long tail is on the left-hand side, with most of the data distributed on the right
    • The mode has the highest value and sits at the peak
    • Then comes the median
    • The mean has the lowest value, as it is dragged towards the tail on the left
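The usual textbook ordering of the three averages in skewed data (mode at the peak, then the median, with the mean pulled furthest towards the tail) can be checked on small made-up data sets. Both lists below are hypothetical; the second is a mirror image of the first.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed scores: most are low, with a long tail of high scores.
positive_skew = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

# Mirror image of the same scores, giving a long tail of low scores instead.
negative_skew = [15 - x for x in positive_skew]

for name, data in [("positive skew", positive_skew), ("negative skew", negative_skew)]:
    print(name,
          "| mode:", mode(data),
          "| median:", median(data),
          "| mean:", mean(data))
```

In the positively skewed set the order from lowest to highest is mode, median, mean; in the negatively skewed set it reverses to mean, median, mode.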
  • Correlation
    A technique for investigating an association between two co-variables
  • What are correlations plotted on?
    Scatter graphs
  • What values can a correlation be?
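    A correlation coefficient between -1 (perfect negative correlation) and +1 (perfect positive correlation), with 0 indicating no correlation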
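A correlation coefficient in this range can be computed by hand. The sketch below uses Pearson's r, which is an assumption since the cards do not name a specific coefficient; the function name `pearson_r` and the data are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists of co-variable scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two co-variables vary together, and how much each varies on its own.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if either list has no variation at all.
    return co_variation / sqrt(variation_x * variation_y)

# Hypothetical co-variables, invented for illustration.
hours_revised = [1, 2, 3, 4, 5]
test_score = [2, 4, 6, 8, 10]      # rises in step with revision
errors_made = [10, 8, 6, 4, 2]     # falls in step with revision

print("positive:", pearson_r(hours_revised, test_score))
print("negative:", pearson_r(hours_revised, errors_made))
```

Real data rarely fall on a perfect straight line, so coefficients usually land somewhere between the two extremes shown here.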