Reliability and validity

Created by

eloise allen

Cards (35)

What is reliability?
The consistency of a measure
What is external reliability?
Test-retest method: measure = consistent across time
Inter-rater reliability: measure = consistent across different individuals using it
What happens to the conclusions of a study if the study lacked reliability?
They can't be trusted -> lack of consistency = lack of objectivity = lack of validity
How can external reliability be assessed?
Test-retest method
Inter-rater reliability
What is the test-retest method?
Study conducted using measure -> researchers wait appropriate amount of time
Study repeated with same ppts + same measure -> researchers now have 2 sets of results
Sets of results run through statistical test measuring correlation -> must have strong positive correlation of 0.8+
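The correlation step of the test-retest method can be sketched in Python. This is a minimal illustration, not a prescribed procedure: it assumes Pearson's r is the statistical test used, and the scores and the names `pearson_r`, `first_test` and `retest` are made up for the example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores: same 6 ppts, same measure, tested twice
first_test = [12, 18, 25, 31, 40, 44]
retest = [14, 17, 27, 30, 38, 46]

r = pearson_r(first_test, retest)
is_reliable = r >= 0.8  # the 0.8+ threshold from the card above
print(f"r = {r:.2f}, meets 0.8+ threshold: {is_reliable}")
```

Each ppt's first score is paired with their own retest score, so the two lists must be in the same ppt order.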
What is inter-rater reliability?
Researchers get more than 1 person to independently observe + classify same behaviours of same ppts at same time
The 2 sets of results are run through a statistical test measuring correlation
Degree of inter-rater reliability found by comparing calculated correlation value with critical value to determine likelihood of correlation having occurred by chance
Must have a strong positive correlation of 0.8+
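The two checks described above (beyond chance, and 0.8+) might be sketched as follows. This assumes Pearson's r as the correlation test. The tallies, the function name `inter_rater_check` and its parameters are hypothetical, and the critical value is passed in by hand because it has to be looked up in a table for your own number of pairs and significance level.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def inter_rater_check(observer_a, observer_b, critical_value, threshold=0.8):
    """Compare two observers' tallies for the same ppts (hypothetical helper)."""
    if len(observer_a) != len(observer_b):
        raise ValueError("both observers must rate the same ppts")
    r = pearson_r(observer_a, observer_b)
    return {
        "r": r,
        # Calculated value must equal or exceed the critical value
        "beyond_chance": r >= critical_value,
        # Strength requirement from the card above
        "strong": r >= threshold,
    }

# Hypothetical tallies of one behavioural category for the same 10 ppts
observer_a = [3, 5, 0, 8, 2, 6, 1, 4, 7, 2]
observer_b = [3, 4, 1, 8, 2, 7, 1, 4, 6, 3]

# 0.632 is the tabled critical value of Pearson's r for 10 pairs (df = 8),
# two-tailed, p = 0.05; look up the value that fits your own study
result = inter_rater_check(observer_a, observer_b, critical_value=0.632)
print(result)
```

The two checks are kept separate because a correlation can be significant (unlikely to be due to chance) without being strong enough to count as reliable.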
How can external reliability be improved?
Operationalisation of variables
Training to ensure inter-rater reliability
Careful transposition of data
Use of standardised instructions
What is operationalisation of variables?
Making the method of measuring co-variables/DV objective
Includes clear behavioural categories when conducting structured observations
Why does operationalisation of variables improve the reliability of a measure?
More objective measure = more likely to be interpreted in the same way by researchers + ppts
Same items/behaviours counted no matter who uses the measure
= consistent data
What is training to ensure inter-rater reliability?
Providing education to people who will be counting behavioural categories in structured observations
Observers taught exactly what is meant by each category + given examples so they can recognise them
Every observer should have same training to ensure they all classify behaviours in same way
How does training to ensure inter-rater reliability improve reliability?
Each observer making judgements on same basis = results more likely to be objective
More objective judgements = similar classifications when faced with same stimulus
What is careful transposition of data?
Being meticulous when copying over raw results from data collection sheets for analysis = no numbers changed during process
How does careful transposition of data improve reliability?
Ensures that data being compared is actually the data that was collected
If it isn't, then the data's consistency can't be accurately assessed
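One way to check a transposition is to compare the copied values against the raw sheet entry by entry. The sketch below is only an illustration: the ppt IDs, scores and the function name `find_transposition_errors` are all hypothetical.

```python
def find_transposition_errors(raw_sheet, transcribed):
    """Return (ppt_id, raw_value, copied_value) for every mismatch.

    Both arguments map a ppt ID to that ppt's score. A ppt missing
    from one of them is reported with None on that side.
    """
    errors = []
    for ppt_id in sorted(set(raw_sheet) | set(transcribed)):
        raw = raw_sheet.get(ppt_id)
        copied = transcribed.get(ppt_id)
        if raw != copied:
            errors.append((ppt_id, raw, copied))
    return errors

# Hypothetical data: P02's score of 34 was copied over as 43
raw_sheet = {"P01": 27, "P02": 34, "P03": 19}
transcribed = {"P01": 27, "P02": 43, "P03": 19}

errors = find_transposition_errors(raw_sheet, transcribed)
print(errors)
```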
What are standardised instructions?
Concise + unchanging set of instructions for ppts to follow during study
Standardised = no variation between ppts -> all ppts given exact same instructions
How do standardised instructions improve reliability?
Varying instructions between ppts = they could respond differently to measure used in study = behaviour may vary
Results would not remain consistent across ppts/same ppt at different times
What is internal reliability?
Measure is consistent within itself (all items/questions/etc measure same concept in same way)
What happens to the conclusions of a study if the study lacked internal reliability?
They can't be trusted -> lack of consistency = lack of objectivity = lack of validity
What is validity?
Accurately measuring what it is intended to measure
What are the different types of internal validity?
Face validity
Concurrent validity
What is face validity?
Measure subjectively appears to accurately measure what it is meant to -> no obvious flaws
What is concurrent validity?
Correspondence to a previously established/accepted measure of same thing
What happens to the conclusions of a study if the study lacks internal validity?
They can't be trusted -> inaccurate measure = can't tell us anything about variable(s) being studied
How can internal validity be assessed?
Face validity
Concurrent validity
How is face validity assessed?
Researcher examines study closely -> no obvious problems = researcher can accept study's validity
Very weak test of validity!
How is concurrent validity assessed?
Researcher gets ppts to complete the researcher's own measure of behaviour -> same ppts then complete a previously established measure of same behaviour
2 sets of results then put through statistical test measuring correlation -> calculated value compared with critical value; strong positive correlation of 0.8+ = validity of measure is high
How can internal validity be improved? 
Generally -> tightly controlling study
Single blind technique -> ensure ppts unaware of study's aim
Double blind technique -> ensure both ppts AND researchers unaware of study's aim
What are the different types of external validity?
Ecological validity
Temporal validity
Population validity
What is ecological validity?
The extent to which findings from a measure/study can be generalised to real-life settings
What is temporal validity? 
The extent to which findings from a measure/study can be generalised to other time periods/eras
What is population validity?
The extent to which findings from a measure/study can be generalised to different individuals beyond the original sample
What happens to the conclusions of a study if the study lacks external validity?
Can't generalise findings outside of specific circumstances in which they were found
Results + conclusions only apply to that specific lab/time in history/group = conclusions drawn are very limited in explaining behaviour
How can ecological validity be assessed?
Researchers should replicate study in a different setting + put both sets of results through statistical test measuring correlation -> if calculated value = strong positive 0.8+, validity of measure is high
How can temporal validity be assessed?
Researchers should replicate study at a different point in time + put both sets of results through statistical test measuring correlation -> if calculated value = strong positive 0.8+, validity of measure is high
How can population validity be assessed?
Researchers should replicate study with a different sample + put both sets of results through statistical test measuring correlation -> if calculated value = strong positive 0.8+, validity of measure is high
How can external validity be improved?
Ecological validity -> researchers should measure behaviour in a way that is analogous to real-life experience + support lab studies with field studies
Temporal validity -> researchers should measure behaviour in a way that is not bound to a particular time/context
Population validity -> researcher should make sample representative of target population using stratified sampling (sub-groups should be represented proportionately)
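Proportionate stratified sampling can be sketched in Python. This is a minimal illustration under assumptions: the age-group strata, the population and the function name `stratified_sample` are hypothetical, and quotas are simply rounded, so the total may be slightly off the requested size when the proportions do not divide evenly.

```python
import random

def stratified_sample(population, stratum_of, sample_size, seed=None):
    """Randomly sample each sub-group in proportion to its share of the population."""
    rng = random.Random(seed)

    # Group the target population into strata (sub-groups)
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)

    sample = []
    for members in strata.values():
        # Quota = sub-group's proportion of the population x sample size
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical target population of 100: 60% aged 16-18, 30% 19-21, 10% 22+
population = (
    [("16-18", i) for i in range(60)]
    + [("19-21", i) for i in range(30)]
    + [("22+", i) for i in range(10)]
)

sample = stratified_sample(population, stratum_of=lambda p: p[0],
                           sample_size=20, seed=1)
print(len(sample))
```

With a sample of 20, the quotas here work out to 12, 6 and 2, mirroring the 60/30/10 split in the target population.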