Progressive rise in intelligence score that is expected to occur on a normed intelligence test from the date when the test was first normed
Flynn effect
Gradual increase in measured general intelligence across successive generations
Frog Pond Effect - theory that individuals evaluate themselves as worse when in a group of high-performing individuals
Culture-Free
Attempt to eliminate culture so nature can be isolated
It is impossible to develop a truly culture-free test because culture influences an individual from birth, and the interaction between nature and nurture is cumulative, not relative
Culture-Fair
Minimize the influence of culture with regard to various aspects of the evaluation procedures
Culture-Fair tests can be fair to all, fair to some, or fair only to one culture
Culture Loading
The extent to which a test incorporates the vocabulary, concepts, traditions, knowledge, etc. associated with a particular culture
Classical Test Theory (True Score Theory)
The score on ability tests is presumed to reflect not only the test-taker's true score on the ability being measured but also the error
Error
The component of the observed test score that does not have to do with the test-taker's ability
Errors of measurement are random
The greater the number of items, the higher the reliability
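The link between test length and reliability is usually quantified with the Spearman-Brown formula, which the notes above do not name. A minimal sketch, assuming the added items are comparable in quality to the existing ones; the function name and the 0.70 starting reliability are illustrative only:

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical test with reliability 0.70, halved, unchanged, doubled, tripled
for factor in (0.5, 1, 2, 3):
    print(factor, round(spearman_brown(0.70, factor), 3))
```

A factor above 1 lengthens the test and raises the predicted reliability; a factor below 1 shortens it and lowers the prediction.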
Factors that contribute to inconsistency
Characteristics of an individual, test, or situation, which have nothing to do with the attribute being measured, but still affect the scores
Error variance
Variance from irrelevant, random sources
Measurement error
All of the factors associated with the process of measuring some variable, other than the variable being measured
Difference between observed score and true score
Positive: can increase one's score
Negative: can decrease one's score
Sources of error variance
Item sampling / Content sampling
Test administration
Test scoring and Interpretation
Random error
Source of error in measuring a targeted variable caused by unpredictable fluctuations and inconsistencies of other variables in the measurement process (e.g., noise, temperature, weather)
Systematic error
Source of error in measuring a variable that is typically constant or proportionate to what is presumed to be the true values of the variable being measured
Has a consistent effect on the observed score, so it does not reduce score consistency
SD does not change, the mean does
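A minimal sketch of the constant case, using made-up scores and a hypothetical 5-point bias: adding the same amount to every score shifts the mean but leaves the standard deviation untouched.

```python
import statistics

true_scores = [85, 90, 100, 110, 115]  # made-up values for illustration
constant_error = 5                     # hypothetical bias: every score reads 5 points high

observed = [t + constant_error for t in true_scores]

mean_shift = statistics.mean(observed) - statistics.mean(true_scores)
sd_true = statistics.pstdev(true_scores)
sd_observed = statistics.pstdev(observed)

print("mean shift:", mean_shift)
print("SD before:", sd_true, "SD after:", sd_observed)
```

This holds only for constant error. A proportionate error (multiplying every score by the same factor) would scale the standard deviation as well.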
Error variance may raise or lower a test score by varying amounts, which affects the consistency of test scores and, in turn, reliability
Test-retest reliability
Error: time sampling
The more time that passes between administrations, the more likely the reliability coefficient will be low
Carryover effects - occur when the test-retest interval is short and the second administration is influenced by the first because test-takers remember or have practiced the earlier items; this inflates the correlation and overestimates reliability
Practice effect - scores on the second session are higher due to their experience of the first session of testing
Test-retest with a longer interval may be affected by other extraneous factors, resulting in a low correlation
Target time for next administration: at least 2 weeks
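Test-retest reliability is typically reported as the correlation between scores from the two administrations. A minimal sketch with a hand-written Pearson correlation; the two score lists are invented for illustration:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same six people, tested two weeks apart
first_session = [98, 105, 110, 92, 120, 101]
second_session = [100, 107, 108, 95, 118, 104]

r = pearson_r(first_session, second_session)
print("test-retest r:", round(r, 3))
```

The same calculation applies to parallel forms, with the two lists holding scores on Form A and Form B.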
Parallel forms / Alternate forms reliability
Error: Item sampling (immediate); item sampling and changes over time (delayed)
Counterbalancing: technique to avoid carryover effects for parallel forms by administering the forms in a different sequence to different groups
Most rigorous and burdensome, since test developers must create two forms of the test
Main problem: difference between the two tests
Test scores may be affected by motivation, fatigue, or intervening tests
Create a large set of questions that address the same construct then randomly divide the questions into 2 sets
Internal consistency (inter-item reliability)
Error: Item sampling; homogeneity of items
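One widely used index of internal consistency is Cronbach's alpha, which these notes do not mention by name. A minimal sketch, assuming every test-taker answered every item; the response matrix is invented:

```python
import statistics


def cronbach_alpha(item_scores):
    """item_scores: list of rows, one row per test-taker, one column per item."""
    k = len(item_scores[0])                 # number of items
    columns = list(zip(*item_scores))       # transpose to one tuple per item
    item_variances = [statistics.variance(col) for col in columns]
    total_variance = statistics.variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 test-takers, 4 items on a 1-5 scale
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]

alpha = cronbach_alpha(responses)
print("alpha:", round(alpha, 3))
```

Alpha rises as items covary more strongly, which is why homogeneous item sets tend to show higher internal consistency.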
Split-half reliability
Error: Item sampling; nature of the split
Inter-scorer reliability
Error: Scorer differences
Standard error of measurement
Provides a measure of the precision of an observed test score
Standard deviation of errors as the basic measure of error
Index of the amount of inconsistency, or the amount of expected error, in an individual's score
Allows one to quantify the extent to which a test provides accurate scores
Provides an estimate of the amount of error inherent in an observed score or measurement
Higher reliability, lower SEM
Used to estimate or infer the extent to which an observed score deviates from a true score
Standard error of score
Confidence interval: a range or band of test scores that is likely to contain true scores
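The standard error of measurement is commonly computed as SD × sqrt(1 − reliability), and a confidence interval is built around the observed score from it. A minimal sketch; the SD of 15, reliability of 0.91, and observed score of 110 are illustrative values, and z = 1.96 assumes a 95% band:

```python
from math import sqrt


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)


def confidence_interval(observed: float, sem: float, z: float = 1.96):
    """Band around the observed score; z = 1.96 gives roughly 95% confidence."""
    return observed - z * sem, observed + z * sem


sem = standard_error_of_measurement(sd=15, reliability=0.91)
low, high = confidence_interval(observed=110, sem=sem)
print("SEM:", round(sem, 2))
print("95% CI:", round(low, 2), "to", round(high, 2))
```

Raising the reliability shrinks the SEM and narrows the band, which matches the inverse relationship noted above.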
Standard error of the difference
Can aid a test user in determining how large a difference should be before it is considered statistically significant
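A common textbook form of the standard error of the difference, assuming both scores are expressed on the same scale with the same SD, is SD × sqrt(2 − r1 − r2), where r1 and r2 are the reliabilities of the two tests. A minimal sketch with invented values:

```python
from math import sqrt


def standard_error_of_difference(sd: float, r1: float, r2: float) -> float:
    """SED for two scores on the same scale with reliabilities r1 and r2."""
    return sd * sqrt(2 - r1 - r2)


# Hypothetical case: two tests on an SD-15 scale, scores of 112 and 100
sed = standard_error_of_difference(sd=15, r1=0.90, r2=0.85)
difference = abs(112 - 100)

# Assumed decision rule: the difference must exceed 1.96 SEDs (95% level)
is_significant = difference > 1.96 * sed
print("SED:", round(sed, 2), "significant:", is_significant)
```

With these numbers the 12-point gap falls short of the threshold, so it would not be treated as a reliable difference.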
Standard error of estimate
Refers to the standard error of the difference between the predicted and observed values
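One common textbook expression for the standard error of estimate, assuming a simple linear prediction of a criterion Y from a test score X (the symbols are introduced here and do not appear in the notes):

```latex
\sigma_{\text{est}} = \sigma_Y \sqrt{1 - r_{XY}^{2}}
```

Here \sigma_Y is the standard deviation of the criterion scores and r_{XY} is the correlation between test and criterion. As the correlation approaches 1, the prediction error shrinks toward 0.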
Four possible hit and miss outcomes
True positives (Sensitivity) - predicts success that does occur
True negatives (Specificity) - predicts failure that does occur
False positive (Type 1 error) - predicts success that does not occur
False negative (Type 2 error) - predicts failure, but success occurs
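The four outcomes can be tallied from paired predictions and actual results. A minimal sketch, where True stands for success and the two lists are invented:

```python
def hit_miss_counts(predicted, actual):
    """Count the four outcomes; True means success, False means failure."""
    pairs = list(zip(predicted, actual))
    return {
        "true_positive": sum(p and a for p, a in pairs),
        "false_positive": sum(p and not a for p, a in pairs),
        "false_negative": sum(not p and a for p, a in pairs),
        "true_negative": sum(not p and not a for p, a in pairs),
    }


# Hypothetical selection decisions for eight applicants
predicted = [True, True, False, False, True, False, True, False]
actual = [True, False, False, True, True, False, True, False]

counts = hit_miss_counts(predicted, actual)

# Sensitivity: share of actual successes the test predicted correctly
sensitivity = counts["true_positive"] / (counts["true_positive"] + counts["false_negative"])
# Specificity: share of actual failures the test predicted correctly
specificity = counts["true_negative"] / (counts["true_negative"] + counts["false_positive"])

print(counts)
print("sensitivity:", sensitivity, "specificity:", specificity)
```

Sensitivity and specificity are rates derived from the counts, so a test can have many true positives and still show low sensitivity if it also misses many actual successes.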
Reactivity
Behavior changes (typically increases) when a person knows they are being observed or evaluated
Hawthorne effect
Drift
Raters gradually move away from what they learned in training toward idiosyncratic definitions of behavior
Raters should be retrained periodically
Contrast effect - Cognitive bias that distorts our perception of something when we compare it to something else, by enhancing the differences between them
Expectancies
Tendency for results to be influenced by what test administrators expect to find
Rosenthal / Pygmalion effect - the test administrator's expected results influence the results of the test
Leniency error - rater is lenient in scoring (Generosity error)
Severity error - rater is strict in scoring
Central Tendency error - rater's ratings tend to cluster in the middle of the rating scale
Halo effect - tendency to give a high score due to failure to discriminate among conceptually distinct and potentially independent aspects of a ratee's behavior
Horn effect - opposite of halo effect
One way to overcome rating errors is to use rankings
Fundamental attribution error
Tendency to explain someone's behavior in terms of internal factors such as personality or disposition, while underestimating the influence of external, situational factors on that behavior
Barnum effect - people tend to accept vague personality descriptions as accurate descriptions of themselves (Aunt Fanny effect)
Bias
Factor inherent in a test that systematically prevents accurate, impartial measurement
Prejudice, preferential treatment
Prevention during test development through a procedure called Estimated True Score Transformation