Reliability

Cards (15)

  • Reliability is a measure of consistency. Generally, if a particular measurement is made twice and produces the same result then that measurement is described as being reliable.
  • Psychologists have developed ways of assessing whether their measurement tools are reliable. The most straightforward way of checking reliability is the test-retest method.
  • Test-retest reliability = a method of assessing the reliability of a questionnaire or psychological test by assessing the same person on two separate occasions. This shows to what extent the test produces the same answers.
  • There must be sufficient time between test and retest to ensure, say, that the participant/respondent cannot recall their answers to the questions on a survey, but not so long that their attitudes, opinions or abilities may have changed.
  • In the case of a questionnaire or test, the two sets of scores would be correlated to make sure they are similar. If the correlation turns out to be significant then the reliability of the measuring instrument is assumed to be good.
  • Inter-observer reliability = the extent to which there is agreement between two or more observers involved in observations of behaviour. This is measured by correlating the observations of two or more observers.
  • One observer’s interpretation of events may differ widely from someone else’s - introducing subjectivity, bias and unreliability into the data collection process.
  • The recommendation is that would-be observers should not ‘go it alone’ but instead conduct their observations in teams of at least two. However, inter-observer reliability must be established. This may involve a small-scale trial run of the observation in order to check that observers are applying behavioural categories in the same way, or a comparison may be reported at the end of the study.
  • Observers obviously need to watch the same event, or sequence of events, but record their data independently. As with the test-retest method, the data collected by the two observers should be correlated to assess its reliability.
  • Similar methods apply to other forms of observation, such as content analysis (though this would be referred to as inter-rater reliability) as well as interviews if they are conducted by different people (known as inter-interviewer reliability).
  • Reliability is measured using a correlational analysis. In test-retest and inter-observer reliability, the two sets of scores are correlated. The correlation coefficient should exceed +.80 for reliability.
  • The reliability of questionnaires over time should be measured using the test-retest method. Comparing two sets of data should produce a correlation that exceeds +.80. A questionnaire that produces low test-retest reliability may require some of the items to be ‘deselected’ or rewritten. For example, if questions are complex or ambiguous, they may be interpreted differently by the same person on different occasions. One solution might be to replace some of the open questions (where there may be more room for misinterpretation) with closed, fixed-choice alternatives which may be less ambiguous.
  • For interviews, probably the best way of ensuring reliability is to use the same interviewer each time. If this is not possible or practical, all interviewers must be properly trained so, for example, one particular interviewer is not asking questions that are too leading or ambiguous. This is more easily avoided in structured interviews where the interviewer’s behaviour is more controlled by the fixed questions. Interviews that are unstructured and more ‘free-flowing’ are less likely to be reliable.
  • Reliability of observations can be improved by making sure behavioural categories have been properly operationalised, and are measurable and self-evident. Categories shouldn’t overlap and all possible behaviours should be covered in the checklist. If categories are not operationalised well, or are overlapping or absent, different observers have to make their own judgements of what to record where and may end up with differing and inconsistent records. If reliability is low, then observers may need further training in using the behavioural categories and may wish to discuss their decisions.
  • In an experiment it is the procedures that are the focus of reliability. In order to compare the performance of different participants (as well as comparing the results from different studies) the procedures must be the same (consistent) every time. Therefore in terms of reliability an experimenter is concerned about standardised procedures.
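The correlational checks described in the cards above (test-retest and inter-observer reliability) can be sketched in code. The sketch below computes a Pearson correlation coefficient between two sets of paired scores and compares it against the +.80 cut-off; the participant scores are purely illustrative:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two paired lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the mean (covariance numerator)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical test and retest scores for five participants
test = [12, 18, 9, 15, 20]
retest = [13, 17, 10, 14, 19]

r = pearson_r(test, retest)
print(f"r = {r:.2f}, exceeds +.80: {r > 0.80}")
```

The same function applies to inter-observer reliability: the two lists would instead hold the tallies recorded independently by two observers for each behavioural category.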