Reliability

Cards (6)

  • Reliability is a measure of consistency: if a particular measurement is repeated and the same result is obtained, that measurement is described as reliable.
    Reliability = consistency.
  • Assessing reliability
    Test-retest: the same test or questionnaire is given to the same person on two or more different occasions. If the test is reliable, the results should be the same, or very similar, each time it is administered.
    Inter-observer: compares the observations of two or more observers.
    Observers should watch the same event, or sequence of events, but record their data independently, and then compare their records. A pilot study (a small-scale trial run of the observation) can be conducted first to check that observers are applying behavioural categories in the same way.
  • Reliability is measured using a correlation.
    In test-retest and inter-observer reliability, the two sets of scores are correlated.
    The correlation coefficient should exceed +0.80 for the measure to be considered reliable.
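    As a rough illustration of the correlation check above, the sketch below computes a Pearson correlation coefficient for two sets of scores and compares it with the +0.80 threshold. The scores, the function names (pearson_r, is_reliable) and the choice of Pearson's r are assumptions made for this example; the card does not specify which correlation statistic to use.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one set of scores is constant")
    return cov / sqrt(var_x * var_y)


def is_reliable(xs, ys, threshold=0.80):
    """True if the correlation exceeds the threshold (+0.80 by default)."""
    return pearson_r(xs, ys) > threshold


# Hypothetical test-retest data: the same five people, tested on two occasions
first_session = [12, 18, 9, 22, 15]
second_session = [13, 17, 10, 21, 16]

r = pearson_r(first_session, second_session)
print(f"r = {r:+.2f}, reliable: {is_reliable(first_session, second_session)}")
```

    The same check could be applied to inter-observer data by passing each observer's tallies per behavioural category as the two lists.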
  • Improving reliability.
    Questionnaires - a questionnaire that produces low test-retest reliability may need some items to be deselected or rewritten.
    The researcher may replace some open questions (which can be misinterpreted) with closed, fixed-choice alternatives, which may be less ambiguous.
    Interviews - improved training.
    The best way of ensuring reliability in an interview is to use the same interviewer each time.
    If this is not possible, all interviewers must be trained (e.g. so they avoid questions that are leading or ambiguous).
  • Experiments and reliability.
    Lab experiments are often described as being reliable because of strict control over many aspects of the procedure, such as the instructions that the participants receive and the conditions within which they are tested.
  • Observations
    Operationalisation of behavioural categories.
    Behavioural categories should be measurable (e.g. ‘pushing’ is less open to interpretation than ‘aggression’).
    Categories should not overlap (e.g. ‘hugging’ and ‘cuddling’) and all possible behaviours should be included.
    If categories overlap or are missing, different observers have to use their own judgement in deciding what to record and where, and may end up with inconsistent records.