An umbrella term for all that goes into the process of creating a test.
Test development
The process of developing a test occurs in five stages:
Test conceptualization
Test construction
Test tryout
Item analysis
Test revision
It is the process of setting rules for assigning numbers in measurement.
Scaling
He is credited with being at the forefront of efforts to develop methodologically sound scaling methods.
L. L. Thurstone
Entails judgments of a stimulus in comparison with every other stimulus on the scale
Comparative scaling
Stimuli are placed into one of two or more alternative categories that differ quantitatively with respect to some continuum.
Categorical scaling
Grouping of words, statements, or symbols on which judgments of the strength of a particular trait, attitude, or emotion are indicated by the test taker.
Rating scale
The final score is obtained by summing the ratings across all the items.
Summative scale
Each item presents the test taker with five alternative responses, usually on an agree-disagree or approve-disapprove continuum.
Likert scale
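A minimal sketch of summative scoring on a five-point Likert scale. The function name `summative_score` and the `reverse_keyed` parameter are hypothetical, and reverse keying is an added assumption that the cards above do not mention.

```python
def summative_score(ratings, reverse_keyed=(), scale_min=1, scale_max=5):
    """Sum Likert ratings across all items.

    reverse_keyed holds the zero-based positions of items whose wording
    runs in the opposite direction (an assumption for illustration).
    """
    total = 0
    for index, rating in enumerate(ratings):
        if not scale_min <= rating <= scale_max:
            raise ValueError(f"rating {rating} is outside the scale")
        if index in reverse_keyed:
            # Flip the rating so that 1 becomes 5, 2 becomes 4, and so on.
            rating = scale_min + scale_max - rating
        total += rating
    return total


responses = [4, 2, 5, 1]
print(summative_score(responses, reverse_keyed={1, 3}))  # 4 + 4 + 5 + 5 = 18
```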
Test takers are presented with pairs of stimuli and asked to compare them.
Method of paired comparisons
Judging of a stimulus in comparison with every other stimulus on the scale.
Comparative scaling
Test taker places stimuli into a category; those categories differ quantitatively on a spectrum.
Categorical scaling
Items range from sequentially weaker to stronger expressions of attitude, belief, or feeling.
Guttman scale (Scalogram analysis)
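One way to picture the ordering property: if items are arranged from the weakest to the strongest expression, a respondent who endorses a stronger item would be expected to endorse every weaker one as well. The sketch below checks a single response pattern against that expectation; the function name `is_guttman_consistent` is hypothetical, and this is an illustration rather than a full scalogram analysis.

```python
def is_guttman_consistent(endorsements):
    """Return True if no endorsed item follows a rejected one.

    endorsements is a sequence of booleans (or 0/1 values) ordered from
    the weakest to the strongest item.
    """
    seen_rejection = False
    for endorsed in endorsements:
        if not endorsed:
            seen_rejection = True
        elif seen_rejection:
            # A stronger item was endorsed after a weaker one was rejected.
            return False
    return True


print(is_guttman_consistent([1, 1, 1, 0, 0]))  # True
print(is_guttman_consistent([1, 0, 1, 0, 0]))  # False
```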
Reservoir from which items will or will not be drawn for the final version of the test.
Item pool
Variables such as the form, plan, structure, arrangement and layout of individual test items.
Item format
Type of format where test taker selects a response from a set of alternative responses.
Selected-response format
Type of format where test taker supplies or creates the correct answer.
Constructed-response format
Relatively large and easily accessible collection of test questions.
Item bank
Interactive, computer-administered test-taking process wherein items presented to the test taker are based in part on the test taker's performance on previous items.
Computerized Adaptive Testing (CAT)
The diminished utility of an assessment tool for distinguishing test takers at the low end of the ability, trait, or other attribute being measured.
Floor effect
The diminished utility of an assessment tool for distinguishing test takers at the high end of the ability, trait, or other attribute being measured.
Ceiling effect
Ability of computer to tailor the content and order of presentation of test items on the basis of responses to previous items.
Item branching
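A toy sketch of item branching, assuming a deliberately simple rule: move one difficulty level up after a correct response and one level down after an incorrect one. The names `next_level` and `administer`, the five difficulty levels, and the rule itself are all hypothetical; operational adaptive tests typically use more elaborate selection rules.

```python
def next_level(current_level, answered_correctly, lowest=0, highest=4):
    """Pick the difficulty level of the next item from the last response."""
    step = 1 if answered_correctly else -1
    # Clamp so the level never leaves the available range.
    return min(highest, max(lowest, current_level + step))


def administer(answers, start_level=2):
    """Return the sequence of difficulty levels presented to one test taker."""
    level = start_level
    path = [level]
    for answered_correctly in answers:
        level = next_level(level, answered_correctly)
        path.append(level)
    return path


print(administer([True, True, False, True]))  # [2, 3, 4, 3, 4]
```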
Test takers earn cumulative credit with regard to a particular construct.
Cumulative scoring
Test taker responses earn credit toward placement in a particular class or category with other test takers whose pattern of responses is presumably similar in some way.
Class/category scoring
Comparing a test taker's score on one scale within a test to another scale within that same test.
Ipsative scoring
Obtained by calculating the proportion of the total number of test takers who answered the item correctly, denoted "p".
Item difficulty index
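The proportion described above can be computed directly. In this sketch the function name `item_difficulty` and the sample responses are made up for illustration.

```python
def item_difficulty(item_responses):
    """Return p, the proportion of test takers who answered correctly.

    item_responses holds one entry per test taker: 1 (or True) for a
    correct answer, 0 (or False) otherwise.
    """
    if not item_responses:
        raise ValueError("at least one response is required")
    correct = sum(1 for response in item_responses if response)
    return correct / len(item_responses)


# Ten hypothetical test takers, seven of whom answered correctly.
print(item_difficulty([1, 1, 0, 1, 0, 1, 1, 0, 1, 1]))  # 0.7
```

A higher p means more test takers answered correctly, so the item is easier.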
Indication of the internal consistency of a test.
Item reliability index
Statistic designed to provide an indication of the degree to which a test is measuring what it purports to measure.
Item validity index
Measures how adequately an item separates or discriminates between high scorers and low scorers.
Item discrimination index
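A sketch of one common way to compute this index, assuming the extreme-groups form d = (U - L) / n, where U and L are the numbers of upper-group and lower-group test takers who answered the item correctly and n is the size of each group. The formula, the 27% cutoff, and the name `item_discrimination` are assumptions for illustration; the cards above do not specify them.

```python
def item_discrimination(total_scores, item_correct, fraction=0.27):
    """Return d = (U - L) / n using upper and lower scoring groups.

    total_scores: total test score for each test taker.
    item_correct: 1 or 0 for each test taker on the item being analyzed.
    fraction: share of test takers placed in each extreme group (assumed).
    """
    if len(total_scores) != len(item_correct):
        raise ValueError("both lists need one entry per test taker")
    n_takers = len(total_scores)
    group_size = max(1, round(n_takers * fraction))
    # Rank test takers from lowest to highest total score.
    order = sorted(range(n_takers), key=lambda i: total_scores[i])
    lower_group = order[:group_size]
    upper_group = order[-group_size:]
    upper_correct = sum(item_correct[i] for i in upper_group)
    lower_correct = sum(item_correct[i] for i in lower_group)
    return (upper_correct - lower_correct) / group_size


scores = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
item = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
d = item_discrimination(scores, item)
print(round(d, 3))  # 0.667
```

Under this form, d ranges from -1 to +1; a negative value means low scorers outperformed high scorers on the item.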
Other Considerations in Item Analysis
Guessing
Item fairness
Speed tests
Techniques of data generation and analysis that rely primarily on verbal rather than mathematical or statistical procedures.
Qualitative method
Various nonstatistical procedures designed to explore how individual test items work.
Qualitative item analysis
Approach to cognitive assessment that entails respondents vocalizing thoughts as they occur.
Think aloud test administration
Revalidation of a test on a sample of test takers other than those on whom test performance was originally found to be a valid predictor of some criterion.
Cross validation
Decrease in item validities that inevitably occurs after cross-validation of findings.
Validity shrinkage
Test validation process conducted on two or more tests using the same sample of test takers.
Co-validation
When co-validation is used in conjunction with the creation of norms or the revision of existing norms.
Co-norming
Phenomenon wherein an item functions differently in one group of test takers as compared to another group of test takers known to have the same level of the underlying trait.
Differential item functioning