the ability of a measure to produce the same or similar results on repeated administrations
internal reliability and external reliability
internal reliability
the extent to which a measure is consistent within itself
split-half reliability:
compares results of one half of a test with the other half
split items into two groups and correlate them
ppt scoring high on one half should score high on the other half
result depends on how the test was split, e.g. the good questions could all have been at the start
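A minimal sketch of the split-half idea, to show that the result changes with the split. The scores are invented for illustration and the function names (pearson_r, split_half_r) are hypothetical, not from any particular package.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_r(scores, half_a, half_b):
    """Correlate each participant's total on one set of items with their total on the other."""
    totals_a = [sum(row[i] for i in half_a) for row in scores]
    totals_b = [sum(row[i] for i in half_b) for row in scores]
    return pearson_r(totals_a, totals_b)

# Invented data: 5 participants (rows) x 4 items (columns), rated 1-5
scores = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]

# Same data, two different ways of splitting the items
first_vs_second = split_half_r(scores, [0, 1], [2, 3])  # items 1-2 vs items 3-4
odd_vs_even = split_half_r(scores, [0, 2], [1, 3])      # items 1,3 vs items 2,4

print(round(first_vs_second, 2), round(odd_vs_even, 2))
```

The two splits give different coefficients on identical data, which is the weakness noted above.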
Cronbach’s alpha
conceptually, splits the items into equal halves in every possible way
then averages the split-half correlations across all of those splits
much more robust measure of internal reliability
alpha should be high, conventionally > .70
R gives info abt how each item correlates w all other items
how alpha would change if an item was removed
the higher the alpha, the more reliable the questionnaire
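A rough sketch of how alpha and "alpha if item removed" can be computed. Assumptions: it uses the standard variance-based formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), rather than literally forming every split; the data are invented; the function names are hypothetical. Python is used here only for illustration, it is not the R output mentioned above.

```python
def variance(values):
    """Sample variance (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def cronbach_alpha(scores):
    """Alpha from item variances and the variance of participants' total scores."""
    k = len(scores[0])  # number of items
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def alpha_if_deleted(scores):
    """Recompute alpha with each item left out in turn."""
    k = len(scores[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(row) if j != i] for row in scores])
        for i in range(k)
    ]

# Invented data: 5 participants (rows) x 4 items (columns), rated 1-5
scores = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]

alpha = cronbach_alpha(scores)
alpha_without_each_item = alpha_if_deleted(scores)

print(round(alpha, 2))
print([round(a, 2) for a in alpha_without_each_item])
```

If alpha goes up when an item is left out, that item is a candidate for removal or rewording.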
external reliability
the extent to which a measure varies from one use to another e.g. IQ test, want high test-retest reliability
test-retest reliability:
the stability of a test over time
a good test is consistently reliable
Administer test now, then give it again later to same ppt
A good test will have high correlation
Ppt that score high on T1 should score high on T2
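A small sketch of the test-retest check: correlate the same participants' scores at T1 and T2. The IQ-style scores are invented and the helper name pearson_r is hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Invented scores for the same 5 participants, in the same order
t1_scores = [100, 115, 92, 130, 108]  # first administration
t2_scores = [103, 112, 95, 127, 110]  # second administration, later

test_retest_r = pearson_r(t1_scores, t2_scores)
print(round(test_retest_r, 2))
```

A coefficient close to 1 means participants kept roughly the same rank order across the two administrations.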
inter-rater reliability:
usually used in observational studies
raters assign behaviour to different classes
degree to which different raters give consistent estimates of the same behaviour
correlation to check reliability (or Cohen’s Kappa)
improve -> clear categories/definitions, training
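A minimal sketch of Cohen's kappa for two raters, assuming the standard definition kappa = (observed agreement - chance agreement) / (1 - chance agreement). The behaviour categories and codings are invented and the function name is hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for agreement expected by chance."""
    n = len(rater_a)
    # Proportion of observations where both raters chose the same category
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's category proportions, summed over categories
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_chance = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

# Invented codings of the same 10 observations by two raters
rater_a = ["play", "play", "aggression", "play", "idle",
           "aggression", "idle", "play", "idle", "play"]
rater_b = ["play", "play", "aggression", "idle", "idle",
           "aggression", "idle", "play", "play", "play"]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2))
```

Here the raters agree on 8 of 10 observations, but kappa is lower than .80 because some of that agreement would be expected by chance alone.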
Improving reliability in general
improve quality of items, clear/unambiguous
Increase/decrease item no.
Increase sample size, control ind dif.
Choose appropriate sample, target population
Control conditions
sensitivity (power)
want exp to detect even a small effect of the IV on DV.
large samples
varied effects - choose DV carefully
control unwanted variability
Task
Not too hard or too easy, want a wide range of scores
properties of sample
want it to be representative of target population. sample should be appropriate to the research question and should be large. reduce unwanted variation: control conditions, keep things constant across ppt.
choose tasks to ensure a range of scores. right level of difficulty to maximise score variation, right no of questions
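A rough sketch of why large samples and less unwanted variability increase sensitivity. Assumptions (none of these are from the notes): two independent groups of equal size, a two-tailed test at alpha = .05, a normal approximation to the test statistic, and effect size expressed as mean difference divided by the standard deviation. All numbers are invented and approx_power is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power for comparing two equal-sized groups (normal approximation)."""
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    shift = effect_size * sqrt(n_per_group / 2)  # expected size of the test statistic
    # Probability that the statistic lands beyond either critical value
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

# Same small effect, different sample sizes
power_small_n = approx_power(0.2, 20)
power_large_n = approx_power(0.2, 400)

# Same raw mean difference (5 points) and sample size, different amounts of noise
power_noisy = approx_power(5 / 25, 50)       # SD = 25, lots of unwanted variability
power_controlled = approx_power(5 / 10, 50)  # SD = 10, conditions kept constant

print(round(power_small_n, 2), round(power_large_n, 2))
print(round(power_noisy, 2), round(power_controlled, 2))
```

Under these assumptions, the same small effect is far more likely to be detected with the larger sample, and reducing the spread of scores has a similar effect without adding participants.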
validity
Whether test actually measures what it claims to be measuring.
face validity
Content validity
Construct validity
Criterion validity
face validity
whether test appears to measure what it claims to be measuring
content validity
does it cover the full range of symptoms of a construct
construct validity
the degree to which a test measures the construct/psychological concept at which it is aimed
criterion validity
whether a test reflects a certain set of abilities i.e. the degree to which a measurement can accurately predict specific criterion variables
validity of a study/experiment
external
internal
ecological
external
the extent to which the results can be generalised to different populations, settings and conditions
how to ensure high external:
extend to new ppl/situations
have high construct validity
use a representative sample
replicate w new groups
internal
When we can be confident that manipulating the IV affects the DV - there is a causal relationship