reliability

Cards (22)

  • reliability is the extent to which a study produces consistent findings every time it is done
    in psychology, for a study to be reliable we would expect the test to have the same result if carried out on another day by a different person
  • the issue
    • psychologists do not tend to measure concrete things, like grams or litres but are more interested in abstract concepts such as attitudes, aggression and memory
    • researchers need to have the same confidence in their psychological tests, observations and questionnaires
    • test-retest reliability is used to assess reliability of self-report measures, IQ tests, personality tests and questionnaires
    • the test is repeated with the same participants after a short interval (e.g 2 weeks) and the scores from each test are compared
    • test-retest reliability is calculated as a correlation coefficient, a measure of the consistency of scores for each test
    • if the correlation coefficient is +0.8 or higher, it shows a strong positive relationship between the scores from the two tests. this means participants who scored high on the first test also tended to score high on the second, so the test is considered reliable
    • inter-interviewer reliability assesses the degree to which different interviewers provide consistent results when interviewing the same individuals
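a minimal sketch of the test-retest check described on this card, assuming the correlation coefficient is Pearson's r (the notes do not say which coefficient is used) and using made-up scores for five participants. the function name `pearson_r` and the data are illustrative only.

```python
from math import sqrt

def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # how far each pair of scores moves together around the two means
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_a * var_b)

# hypothetical scores: same five participants, tested twice two weeks apart
test_1 = [12, 15, 9, 20, 17]
test_2 = [13, 14, 10, 19, 18]

r = pearson_r(test_1, test_2)
reliable = r >= 0.8  # the +0.8 threshold from the card
print(round(r, 2), reliable)
```

the same function could be applied to two interviewers' scores for the same interviewees to check inter-interviewer reliability.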
  • inter-interviewer reliability
    • a high correlation coefficient (e.g +0.80 or above) would indicate good inter-interviewer reliability
    • a low correlation coefficient would suggest the need for adjustments in the interview process
    • inter-observer reliability (or inter-rater) is a way of assessing how reliable an observation is
    • you always need to have two or more observers carrying out an observation to avoid subjectivity bias and ensure reliability
    • observers need to watch the same event, or sequence of events, but record their data independently
    • if there is a correlation of +0.8 or more then there is high inter-observer reliability i.e the observers are recording and categorising behaviour consistently in at least 80% of cases
  • inter-observer reliability practical
    total no. of agreements / total no. of observations ≥ 0.80 = high inter-observer reliability
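the agreement calculation on this card can be sketched as follows, treating 0.80 as the cut-off for high reliability. the behaviour categories and both observers' records are invented for illustration, and `agreement_ratio` is a hypothetical helper name.

```python
def agreement_ratio(observer_a, observer_b):
    """Total number of agreements divided by total number of observations."""
    if len(observer_a) != len(observer_b) or not observer_a:
        raise ValueError("both observers must record the same, non-zero number of observations")
    agreements = sum(a == b for a, b in zip(observer_a, observer_b))
    return agreements / len(observer_a)

# hypothetical records: the category each observer ticked in ten time intervals
obs_a = ["push", "hit", "push", "none", "hit", "none", "push", "none", "hit", "push"]
obs_b = ["push", "hit", "none", "none", "hit", "none", "push", "none", "hit", "hit"]

ratio = agreement_ratio(obs_a, obs_b)
high_reliability = ratio >= 0.80
print(ratio, high_reliability)
```

the two observers must record independently for this comparison to mean anything.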
  • improving inter-observer reliability
    behavioural categories should be fully operationalised and measurable
    use specific categories (e.g pushing) to reduce ambiguity compared to vague terms (e.g aggression)
  • improving inter-observer reliability
    avoid overlapping categories (e.g hugging and cuddling) to maintain clarity
    ensure all possible behaviours are covered in the checklist to avoid gaps
  • improving inter-observer reliability
    if the categories are not operationalised well or if they overlap, different observers have to make their own judgements of what to record
  • improving test-retest reliability
    • a questionnaire that produces low test-retest reliability may need to be adjusted or rewritten
    • questions that are complex or ambiguous may be interpreted differently by the same person on different occasions e.g "how satisfied are you with your current job in terms of salary, work environment and career growth opportunities?"
  • improving test-retest reliability
    one solution might be to replace some of the open questions with fixed choice alternatives which may be less ambiguous e.g "on a scale of 1-10 how satisfied are you with your current salary?"
    by breaking down the question, participants are more likely to provide consistent answers over time, thereby improving the test-retest reliability of the questionnaire
  • improving inter-interviewer reliability
    • all interviewers should receive the same comprehensive training on how to conduct the interviews
    • interviews should be structured in a standardised manner which can be followed by all to ensure consistency
  • improving inter-interviewer reliability
    • leading questions should be avoided at all times e.g "have your childhood experiences influenced your ability to trust others?"
    • this is more easily avoided in a structured interview where the interviewer's behaviour is more controlled by the fixed questions
  • improving inter-interviewer reliability
    • leading questions are more easily avoided in a structured interview where the interviewer's behaviour is more controlled by the fixed questions e.g "what adjective would you use to describe your childhood experiences and why?"
    • interviews that are unstructured and 'free flowing' are less likely to be reliable
  • briefly outline one problem of using a single trained observer to rate the participants' driving skills in the practical task. briefly discuss how this data collection method could be modified to improve the reliability of the data collected (6)
    involve multiple trained observers to assess driving tasks
    this reduces impact of individual bias
    data would be less subject to any single observer's judgement
    observations could then be compared to ensure inter-rater reliability
  • briefly outline one problem of using a single trained observer to rate the participants' driving skills in the practical task. briefly discuss how this data collection method could be modified to improve the reliability of the data collected (6)
    use video recordings of the driving tasks
    ensures data is preserved and can be reviewed multiple times by different observers
    allows for more accurate and consistent ratings
    as observers can revisit recordings to clarify uncertainties
    ensures evaluations
  • the researchers decided to analyse the data using a spearman's rho test. explain why this is a suitable choice of test for this investigation (3)
    it determines strength of relationship between 2 variables
    aligning with the researchers' initial aim to explore a possible correlation between map reading skills and practical driving performance
    data are in related pairs, as each motorist is participating in both tasks
    both variables under test are ratings measured at ordinal level
    so it is appropriate for assessing the rank-order correlation between those 2 sets of data
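a rough sketch of how a rank-order correlation like this can be computed: rank each set of ratings, then correlate the ranks. the ratings for nine motorists are invented (the original data are not in these notes), and the function names are hypothetical. tied ratings share the mean of their ranks, which is a common convention assumed here.

```python
from math import sqrt

def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j across a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need related pairs: two equal-length lists")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy)

# hypothetical ordinal ratings for nine motorists (one related pair per motorist)
map_reading = [3, 7, 5, 8, 2, 9, 4, 6, 1]
driving = [4, 6, 5, 9, 1, 8, 2, 7, 3]

rho = spearman_rho(map_reading, driving)
print(round(rho, 3))
```

working on ranks rather than raw scores is what makes the test suitable for ordinal ratings.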
  • calculated rho must equal or exceed the critical value for significance at the level shown (spearman's rho test)
  • after analysis of the data the researchers obtained a calculated value of rs = 0.808. using a critical rho value of 0.700, what conclusion can the researchers draw about the relationship between the map reading and driving skills of the motorists?
    (calculated rho) 0.808 > 0.700 (critical rho) where n=9
    results significant at 5% level for a two-tailed test so null hypothesis rejected, alternative hypothesis accepted
    means significant positive relationship between map reading ability and driving ability among the participants
    i.e drivers skilled at map reading are also skilled at driving
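the decision rule used in this answer can be written out as a short check. the values 0.808 and 0.700 come from the question; comparing the absolute value is an assumption made here so that the same rule would also work for a negative correlation in a two-tailed test. `is_significant` is a hypothetical helper name.

```python
def is_significant(calculated_rho, critical_value):
    """Significant when the size of calculated rho equals or exceeds the critical value."""
    return abs(calculated_rho) >= critical_value

calculated = 0.808  # rs from the question
critical = 0.700    # critical rho given in the question (n = 9)

reject_null = is_significant(calculated, critical)
direction = "positive" if calculated > 0 else "negative"
print(reject_null, direction)
```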