Reliability refers to how consistent a measure is. If a test, observation, or study is reliable, it will produce similar results under consistent conditions. The types of reliability covered here are test-retest reliability, inter-observer reliability and internal reliability.
Test-Retest Reliability involves testing the same person with the same test on two different occasions to see if the results are consistent over time (same results or very similar).
How it works: If the scores strongly correlate (e.g. r ≥ +0.80), the test is reliable.
Used for: Assessing the reliability of questionnaires, psychological tests, etc.
Inter-Observer Reliability (or Inter-Rater Reliability) is the degree to which two or more observers agree when watching the same behaviour.
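The test-retest check can be sketched in a few lines of Python. This is only an illustration: the scores are made-up data, the helper name pearson_r is hypothetical, and the 0.80 cut-off is the rule of thumb quoted above rather than a fixed standard.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from each mean
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each list
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores: the same six people, same test, two occasions
first_occasion = [12, 18, 25, 9, 30, 21]
second_occasion = [14, 17, 27, 10, 29, 20]

r = pearson_r(first_occasion, second_occasion)
is_reliable = r >= 0.80  # threshold taken from the notes above
print(f"r = {r:+.2f}, reliable: {is_reliable}")
```

Each person keeps roughly the same rank on both occasions, so the correlation comes out high and the test would count as reliable by this rule.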
Why it matters: Observations are often subjective, so you need agreement to ensure consistency.
How it works: Observers independently code the behaviour, then compare their results.
A correlation coefficient of +0.80 or above = strong agreement = high inter-observer reliability
How to improve inter-observer reliability:
Train observers with a detailed behavioural checklist (operationalised categories)
Do a pilot study to refine coding
Internal Reliability: The extent to which all items in a test measure the same thing.
It is tested using the Split-half method:
Split the test in two (e.g., odd vs even questions)
Correlate the two halves
High correlation = consistent internal structure
To improve reliability in questionnaires:
Standardise questions: replace open questions with closed, fixed-choice questions
Remove ambiguous or complex questions: leaves less room for (mis)interpretation
To improve reliability in interviews:
Use structured interviews: same questions asked, avoiding leading or ambiguous questions
Use the same interviewer each time, or properly trained interviewers
To improve reliability in experiments:
Strict control of conditions: more achievable in a lab experiment than in a field experiment
Use standardised procedures: all participants tested under the same conditions
To improve reliability in observations:
Operationalise behavioural categories properly: categories should be clear, measurable and non-overlapping, and should cover all possible behaviours
Train observers well: they should be able to spot the target behaviours and record them accurately, so that results are consistent