A judgment or estimate of how well a test measures what it purports to measure in a particular context
Validity
A judgment based on evidence about the appropriateness of inferences drawn from test scores
Characterizations of the validity of tests and test scores are frequently phrased in terms such as "acceptable" or "weak"
Validity
A judgment of how useful the instrument is for a particular purpose with a particular population of people
No test or measurement technique is "universally valid" for all time, for all uses, with all types of test-taker populations
Tests may be shown to be valid within what we would characterize as reasonable boundaries of a contemplated usage
The validity of a test may have to be re-established as the culture or the times change, with the same as well as other test-taker populations
Validation
The process of gathering and evaluating evidence about validity
It is the test developer's responsibility to supply validity evidence in the test manual
It may sometimes be appropriate for test users to conduct their own validation studies with their own groups of test-takers
Local validation studies are absolutely necessary when the test user plans to alter in some way the format, instructions, language, or content of the test
This may yield insights regarding a particular population of test-takers as compared to the norming sample described in a test manual
Three categories of validity
Content validity
Criterion-related validity
Construct validity
Content validity
A measure of validity based on an evaluation of the subjects, topics, or content covered by the items in the test
Criterion-related validity
A measure of validity obtained by evaluating the relationship of scores obtained on the test to scores on other tests or measures
Construct validity
A measure of validity that is arrived at by executing a comprehensive analysis of how scores on the test relate to other test scores and measures, and how scores on the test can be understood within some theoretical framework for understanding the construct that the test was designed to measure
Construct validity is referred to as the "umbrella validity" because every other variety of validity falls under it
Trinitarian approaches to validity assessment are not mutually exclusive. All three types of validity evidence contribute to a unified picture of a test's validity
Depending on the intended use of a test, a test user may not need evidence of all three types of validity
Ecological validity
A judgment regarding how well a test measures what it purports to measure at the time and place that the variable being measured (typically a behavior, cognition, or emotion) is actually emitted
Face validity
Relates more to what a test appears to measure to the person being tested than to what the test actually measures
Face validity
A judgment concerning how relevant the test items appear to be
Judgments about face validity are frequently thought of from the perspective of the test-taker, not the test user
A test that lacks face validity may still be relevant and useful
If the test is not perceived as relevant and useful by test-takers, parents, legislators, and others, then negative consequences may result
Face validity may be more a matter of public relations than psychometric soundness
Content validity
A judgment of how adequately a test samples behavior representative of the universe of behavior that the test was designed to sample
The clarity of test developers' vision of the construct being measured can be reflected in the content validity of the test
In the interest of ensuring content validity, test developers strive to include key components of the construct targeted for measurement, and exclude content irrelevant to the construct targeted for measurement
Assertiveness
You can give an opinion or say how you feel
You can ask for what you want or need
You can disagree respectfully
You can offer your ideas and suggestions
You can say no without feeling guilty
You can speak up for someone else
Making eye contact
Taking accountability for your own mistakes
Making sure everyone is on board with a decision
Taking pride in yourself and your team
From the pooled information, together with the test developer's judgment, a test blueprint emerges for the "structure" of the evaluation
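As a minimal sketch of what a test blueprint might look like in practice, the snippet below allocates items to content areas by planned weight. The content areas and weights here are invented for illustration and are not from any actual test manual:

```python
# Hypothetical test blueprint: the planned share of items devoted to each
# component of the construct (here, an imagined assertiveness measure).
# Areas and weights are illustrative assumptions, not a published blueprint.
blueprint = {
    "giving opinions":          0.20,
    "making requests":          0.20,
    "disagreeing respectfully": 0.20,
    "offering ideas":           0.15,
    "refusing without guilt":   0.15,
    "speaking up for others":   0.10,
}

total_items = 40

# Convert each area's weight into a concrete item count.
items_per_area = {area: round(share * total_items)
                  for area, share in blueprint.items()}

# The weights must cover the whole construct (sum to 100%).
assert abs(sum(blueprint.values()) - 1.0) < 1e-9
print(items_per_area)
```

A blueprint like this makes content validity inspectable: anyone can check whether the item counts match the intended coverage of the construct, and whether any content area irrelevant to the construct has crept in.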
Tests are often thought of as either valid or not valid, but validity is relative to culture and context
Criterion-related validity
A judgment of how adequately a test score can be used to infer an individual's most probable standing on some measure of interest—the measure of interest being the criterion
Concurrent validity
An index of the degree to which a test score is related to some criterion measure obtained at the same time (concurrently)
Predictive validity
An index of the degree to which a test score predicts some criterion measure
Criterion
The standard against which a test or a test score is evaluated
There are no hard-and-fast rules for what constitutes a criterion
Characteristics of an adequate criterion
Relevant
Valid for the purpose
Uncontaminated
Criterion contamination is the term applied to a criterion measure that has been based, at least in part, on predictor measures
Concurrent validity
If test scores are obtained at about the same time as the criterion measures are obtained, measures of the relationship between the test scores and the criterion provide evidence of concurrent validity