Chapter 6: Validity


  • Validity - Is a term used in conjunction with the meaningfulness of a test score—what the test score truly means.
  • Validity - Is a judgment or estimate of how well a test measures what it purports to measure in a particular context.
  • Inference - A logical result or deduction in a reasoning process; in testing, it refers to the conclusions that can reasonably be drawn from test scores (and, more broadly, to whether the conclusions of a study can be trusted). Characterizations of the validity of tests and test scores are frequently phrased in terms such as “acceptable” or “weak,” reflecting judgments about how well the scores support the intended inferences.
  • Validation - The process of gathering and evaluating evidence about validity. It is the test developer’s responsibility to supply validity evidence in the test manual.
  • Face validity - Concept of validity that relates more to what a test appears to measure to the person being tested than to what the test actually measures.
  • Content validity - Concept of validity that evaluates how well an instrument (like a test) covers all relevant parts of the construct it aims to measure.
  • Test blueprint - A plan regarding the types of information to be covered by the items, the number of items tapping each area of coverage, the organization of the items in the test, and so forth.
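  • As a concrete illustration (hypothetical content areas and counts, not from the chapter), a test blueprint can be sketched in code as a simple mapping from each content area to the number and format of items planned for it:

        # A hypothetical blueprint for a 50-item classroom math test (illustrative only).
        test_blueprint = {
            "Fractions":       {"items": 15, "format": "multiple choice"},
            "Decimals":        {"items": 10, "format": "multiple choice"},
            "Word problems":   {"items": 15, "format": "short answer"},
            "Geometry basics": {"items": 10, "format": "multiple choice"},
        }

        # Totalling the planned items makes it easy to check coverage against the blueprint.
        total_items = sum(area["items"] for area in test_blueprint.values())
        print(total_items)  # 50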
  • Criterion-related validity - Is a judgment of how adequately a test score can be used to infer an individual’s most probable standing on some measure of interest—the measure of interest being the criterion.
  • Criterion - The standard against which a test or a test score is evaluated; a standard on which a judgment or decision may be based.
  • Concurrent validity - A form of criterion-related validity that is an index of the degree to which a test score is related to some criterion measure obtained at the same time.
  • Predictive validity - A form of criterion-related validity that is an index of the degree to which a test score predicts some criterion measure.
  • Base rate - The extent to which a particular trait, behavior, characteristic, or attribute exists in the population (expressed as a proportion).
  • Hit rate - May be defined as the proportion of people a test accurately identifies as possessing or exhibiting a particular trait, behavior, characteristic, or attribute.
  • Miss rate - May be defined as the proportion of people the test fails to identify as having, or not having, a particular characteristic or attribute.
  • False positive - Is a miss wherein the test predicted that the test taker did possess the particular characteristic or attribute being measured when in fact the test taker did not.
  • False negative - Is a miss wherein the test predicted that the test taker did not possess the particular characteristic or attribute being measured when the test taker actually did.
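  • A minimal worked sketch in Python of how these five ideas fit together; every count and variable name below is made up for illustration:

        # Hypothetical screening outcomes for 100 people (illustrative only).
        # "Positive" means the person truly has the attribute; "flagged" means the
        # test predicted that they have it.
        true_positives  = 30   # flagged and truly positive (hits)
        true_negatives  = 55   # not flagged and truly negative (hits)
        false_positives = 10   # flagged but truly negative (misses)
        false_negatives = 5    # not flagged but truly positive (misses)

        total = true_positives + true_negatives + false_positives + false_negatives

        # Base rate: proportion of the population that truly has the attribute.
        base_rate = (true_positives + false_negatives) / total   # 35/100 = 0.35

        # Hit rate: proportion of people the test classifies accurately.
        hit_rate = (true_positives + true_negatives) / total     # 85/100 = 0.85

        # Miss rate: proportion of people the test classifies inaccurately.
        miss_rate = (false_positives + false_negatives) / total  # 15/100 = 0.15

        print(base_rate, hit_rate, miss_rate)  # 0.35 0.85 0.15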
  • Judgments of criterion-related validity, whether concurrent or predictive, are based on two types of statistical evidence:
    1. Validity coefficient
    2. Incremental validity
  • Validity coefficient - Is a correlation coefficient that provides a measure of the relationship between test scores and scores on the criterion measure.
  • Incremental validity - The degree to which an additional predictor explains something about the criterion measure that is not explained by predictors already in use.
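  • A small sketch in Python (numpy) of both kinds of evidence, using made-up scores: the validity coefficient is the correlation between the new test and the criterion, and incremental validity is gauged here as the gain in explained criterion variance (R²) when the new test is added to a predictor already in use. All names and numbers are hypothetical, and this is one common way to examine incremental validity rather than the only one.

        import numpy as np

        # Hypothetical scores for 8 examinees (illustrative only).
        existing_predictor = np.array([10, 12, 9, 15, 11, 14, 8, 13], dtype=float)
        new_test           = np.array([55, 60, 50, 72, 58, 70, 48, 66], dtype=float)
        criterion          = np.array([3.1, 3.4, 2.8, 3.9, 3.2, 3.8, 2.6, 3.6])

        # Validity coefficient: correlation between the new test and the criterion.
        validity_coefficient = np.corrcoef(new_test, criterion)[0, 1]

        def r_squared(predictors, y):
            """Proportion of criterion variance explained by a least-squares fit."""
            X = np.column_stack([np.ones(len(y))] + list(predictors))
            coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
            residuals = y - X @ coeffs
            return 1 - residuals.var() / y.var()

        # Incremental validity: how much more of the criterion is explained once the
        # new test is added to the predictor(s) already in use.
        r2_old = r_squared([existing_predictor], criterion)
        r2_new = r_squared([existing_predictor, new_test], criterion)
        incremental_validity = r2_new - r2_old

        print(round(validity_coefficient, 2), round(incremental_validity, 3))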
  • Construct validity - Is a judgment about the appropriateness of inferences drawn from test scores regarding individual standings on a variable called a construct.
  • Construct - An informed, scientific idea developed or hypothesized to describe or explain behavior.
  • Test bias - A factor inherent in a test that systematically prevents accurate, impartial measurement.
  • Types of measurement bias:
    1. Intercept bias
    2. Slope bias
  • Intercept bias - Occurs when the use of a predictor results in consistent underprediction or overprediction of a specific group’s performance or outcomes.
  • Slope bias - Occurs when a predictor has a weaker correlation with an outcome for specific groups.
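  • One common way to examine these two forms of bias (sketched here with made-up data, not as a definitive procedure) is to fit the test-to-criterion regression separately for each group and compare the fitted intercepts and slopes:

        import numpy as np

        def fit_line(test_scores, criterion):
            """Least-squares slope and intercept for predicting the criterion from the test."""
            slope, intercept = np.polyfit(test_scores, criterion, deg=1)
            return slope, intercept

        # Hypothetical test and criterion scores for two groups (illustrative only).
        group_a_test = np.array([40, 50, 60, 70, 80], dtype=float)
        group_a_crit = np.array([2.0, 2.5, 3.0, 3.5, 4.0])
        group_b_test = np.array([40, 50, 60, 70, 80], dtype=float)
        group_b_crit = np.array([2.4, 2.9, 3.4, 3.9, 4.4])  # same slope, shifted upward

        slope_a, intercept_a = fit_line(group_a_test, group_a_crit)
        slope_b, intercept_b = fit_line(group_b_test, group_b_crit)

        # Similar slopes with different intercepts suggest intercept bias: a single common
        # regression line would consistently under- or over-predict one group's criterion.
        # Markedly different slopes would instead suggest slope bias: the test relates to
        # the criterion more weakly for one group than for the other.
        print(f"Group A: slope={slope_a:.3f}, intercept={intercept_a:.2f}")
        print(f"Group B: slope={slope_b:.3f}, intercept={intercept_b:.2f}")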
  • Rating error - Is a judgment resulting from the intentional or unintentional misuse of a rating scale.
  • Types of rating error:
    1. Leniency error - An error in rating that arises from the tendency on the part of the rater to be lenient in scoring, marking, and/or grading.
    2. Severity error - The opposite extreme: an error in rating that arises from the tendency on the part of the rater to be overly severe or harsh in scoring, marking, and/or grading.
    3. Central tendency error - An error in rating in which the rater, for whatever reason, exhibits a general and systematic reluctance to give ratings at either the positive or the negative extreme, so ratings cluster near the middle of the scale.
  • One way to overcome what might be termed restriction-of-range rating errors (central tendency, leniency, severity errors) is to use rankings, a procedure that requires the rater to measure individuals against one another instead of against an absolute scale.
  • Halo effect - A tendency to give a particular ratee a higher rating than the ratee objectively deserves because of the rater’s failure to discriminate among conceptually distinct and potentially independent aspects of a ratee’s behavior.
  • Test fairness - The extent to which a test is used in an impartial, just, and equitable way.