Reliability and Validity

Cards (32)

  • Internal Validity- high internal validity is when the research is a true/accurate measure of what it set out to measure.
  • Face validity- does the measure appear to measure what it sets out to? (eg someone looks happy so we might assume that happiness is their valid/actual state.)
  • An example of low face validity is asking participants to complete a word search to measure short-term memory.
  • Construct Validity- Does the study measure the full range of components that make up a behaviour? (eg we would need to look at all the things that make up happiness, such as body language, contentment or optimistic outlook. We would not just look at whether they are smiling.)
  • An example of low construct validity is asking participants to complete some algebraic equations to measure maths ability. This does not assess their ability in other elements of maths such as graphs, statistics or geometry.
  • Criterion Validity- Does the measurement of a variable in one way relate to its measurement in another way?
  • Concurrent criterion validity- does a new measurement or test measure the same thing as an old one that has already been validated- do the results concur? (eg someone who passed the old driving test should also pass on the new one.)
  • An example of low concurrent criterion validity is if someone completed the Big 5 personality test and came out as highly extroverted and then tried a new shorter personality test and came out as highly introverted.
  • Predictive criterion validity- does the measurement of a behaviour predict performance on another related measurement? (eg someone who performed very well in their GCSEs should also perform very well in their A levels.)
  • An example of low predictive criterion validity is somebody who performs well in their GCSEs not performing well in their A levels.
  • Types of internal validity- face validity, construct validity, criterion validity (concurrent and predictive).
  • External Validity- high external validity is when the findings can be generalised beyond the sample/ initial study.
  • Population Validity- the extent to which the findings can be generalised to other people (the target population).
  • An example of low population validity is conducting a study into the effect of caffeine on sleep and using a sample of 50 undergraduate students. This is not representative of most adults so findings cannot be generalised to the wider population.
  • Ecological validity- The extent to which the findings can be generalised to real life situations.
  • An example of low ecological validity is a lab experiment such as taking part in a driving simulator. This environment is artificial and does not have the same surroundings as driving normally in a car. Therefore, we cannot generalise findings to driving in a physical car.
  • Temporal validity- The extent to which the findings can be generalised to different time periods.
  • An example of low temporal validity is that a study investigating attitudes towards TV advertising in the 70s and 80s may have very different findings if it were conducted today, because media adverts have become more common. The findings from the first study therefore lack temporal validity.
  • Correlations
    • If there are any problems with the measurement of either variable, validity is reduced- eg when using rating scales or self-report, participants may give socially desirable answers.
    • Quantitative data may not give enough insight into behaviour in order for it to be a true measure.
  • Experiments
    • Low internal validity- uncontrolled extraneous variables
    • Low internal validity- if participants are aware they are being studied, they may respond to demand characteristics (usually in lab experiments).
    • High internal validity- if participants are unaware they are being studied, they are more likely to show true behaviour (usually in field experiments).
    • High internal validity- if extraneous variables are controlled and therefore reduced, a cause-and-effect relationship can be established between the IV and DV.
    • Ecological validity- the extent to which the setting reflects real life.
  • Self-reports
    • Low internal validity- if there is a chance of social desirability, participants may be dishonest
    • Low internal validity- closed questions restrict how participants can answer
    • Low internal validity- any doubt or a lack of understanding of questions can lead to participants not giving their true answers
    • High internal validity- open questions mean participants are free to respond how they like
    • High internal validity- filler questions distract participants from working out the aim and reduce demand characteristics.
  • Observations
    • Internal validity is influenced by the researcher's ability to observe behaviour clearly, which can be affected by whether the observation is participant or non-participant.
    • Ecological validity depends on whether the environment is natural or artificial.
  • Observations
    • Low internal validity- unclear or overlapping categories make it difficult to distinguish between behaviours and assign them to specific categories
    • Low internal validity- occurs with overt observations, as participants may respond to demand characteristics
    • High internal validity- covert observations ensure participants are unaware of being observed, reducing demand characteristics and allowing true behaviour to be displayed
    • High internal validity- achieved when clear and fully operationalised categories result in valid measures of behaviour
  • Internal Reliability- how consistent the procedure/research is. It considers whether all participants experience the same thing, whether they are treated in a consistent way, and the consistency within a test/measure.
  • Types of external validity- population validity, ecological validity, temporal validity.
  • How do we assess internal reliability?
    The split-half method
    • split the test in half
    • compare each participant's performance on one half with their performance on the other half
    • a strong positive correlation between the two halves shows high internal reliability
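The split-half steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the questionnaire data are made up, the test is split into odd- and even-numbered items (one common way of halving a test, not something these notes specify), and the helper names `pearson_r` and `split_half_r` are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def split_half_r(responses):
    """Correlate each participant's total on one half with their total on the other.

    responses: one list of item scores per participant.
    Assumption: the halves are the odd- and even-numbered items.
    """
    first_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return pearson_r(first_half, second_half)

# Made-up data: 5 participants x 6 items, each item scored 1-5
responses = [
    [5, 4, 5, 5, 4, 5],
    [2, 1, 2, 2, 1, 1],
    [3, 3, 4, 3, 3, 3],
    [4, 4, 4, 5, 4, 4],
    [1, 2, 1, 1, 2, 2],
]

r = split_half_r(responses)
print(f"split-half correlation: {r:.2f}")
```

A value close to +1 would suggest the two halves are measuring consistently; a weak or negative value would suggest low internal reliability.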
  • External Reliability- If the procedure has been repeated, are the results consistent? This means that if the test is repeated, we would expect the same results. If the test is high in internal reliability, we could repeat it to test for external reliability. You cannot comment on external reliability if the study has not been repeated.
  • How do we assess external reliability?
    The test-retest method
    • complete the test on more than one occasion
    • compare performance on the tests
    • A strong positive correlation indicates external reliability
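The test-retest steps above can be illustrated in the same way. This sketch assumes the same participants complete the test on both occasions (so their scores can be paired) and uses made-up scores; `pearson_r` is a hypothetical helper written out here so the example is self-contained.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up scores for six participants on two occasions.
# Position i in each list is the same participant.
occasion_1 = [12, 18, 25, 30, 22, 15]
occasion_2 = [14, 17, 27, 29, 21, 16]

retest_r = pearson_r(occasion_1, occasion_2)
print(f"test-retest correlation: {retest_r:.2f}")
```

Here the scores on the two occasions rise and fall together, so the correlation is strongly positive, which is the pattern that would indicate external reliability.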
  • Correlations
    • If the variables are measured in a standardised way and they are clearly operationalised, someone else could re-use the measure. This means high external reliability.
    • Have all participants been given the same controlled measure within the study?- internal reliability
  • Experiments
    • High internal reliability- having a standardised procedure means all participants have a consistent experience in the same experiment.
    • High internal reliability- standardised instructions so all participants get the same treatment.
    • High internal reliability- details on the procedure ensure all participants receive the same treatment.
    • Testing for external reliability- high levels of control mean the procedure can be replicated with different participants.
  • Self-reports
    • If the questions are clear and standardised, all participants receive a consistent experience- high internal reliability
    • Repeat questions can be used to test for internal reliability
    • The split-half method can be used by taking scores from different halves of a questionnaire and comparing them- checking for internal reliability
    • Test-retest can be used by repeating the same questionnaire with the same or different participants to see if the same trends/patterns can be identified- testing for external reliability.
  • Observations
    • If behavioural categories are clear and standardised then other observers can note down behaviours in a consistent way during the procedure.
    • Inter-rater reliability- checking the consistency of the coding of behaviour between observers during the observation
    • Pilot studies- these check for the consistency in the rating of behaviours and understanding of categories.
    • Time and event sampling make the procedure easier to repeat in future, which would allow external reliability to be tested.
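The inter-rater reliability check listed under Observations can also be sketched in code. These notes do not say how agreement should be calculated, so the two measures below are assumptions: simple percentage agreement, and Cohen's kappa, a commonly used statistic that corrects agreement for chance. The behavioural categories and codings are made up, and the function names are hypothetical.

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Proportion of observations that both observers coded identically."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    categories = set(coder_a) | set(coder_b)
    # Chance agreement: probability both observers pick the same category
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)

# Made-up codings of the same 10 behaviours by two observers
observer_1 = ["play", "play", "aggression", "solitary", "play",
              "aggression", "solitary", "play", "play", "aggression"]
observer_2 = ["play", "play", "aggression", "solitary", "play",
              "play", "solitary", "play", "aggression", "aggression"]

agreement = percent_agreement(observer_1, observer_2)
kappa = cohens_kappa(observer_1, observer_2)
print(f"agreement: {agreement:.0%}, kappa: {kappa:.2f}")
```

The two observers agree on 8 of the 10 behaviours. Kappa comes out lower than the raw agreement because some of those matches would be expected by chance.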