The process of setting rules for assigning numbers in measurement; process by which a measuring device is designed and calibrated and by which numbers (or other indices) – scale values – are assigned to different amounts of the trait, attribute or characteristic being measured
Indication of the degree to which a test is measuring what it purports to measure; the higher the index, the greater the test's criterion-related validity
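A minimal sketch of how this index might be computed. The notes do not give a formula, so this assumes the common textbook formulation: the item-score standard deviation multiplied by the correlation between item scores and criterion scores. The function name and the sample data are hypothetical.

```python
import math

def item_validity_index(item_scores, criterion_scores):
    """Assumed formulation: item SD times the item-criterion correlation."""
    if len(item_scores) != len(criterion_scores) or not item_scores:
        raise ValueError("need two equal-length, non-empty score lists")
    n = len(item_scores)
    mean_i = sum(item_scores) / n
    mean_c = sum(criterion_scores) / n
    # Population variances and covariance (divide by n)
    var_i = sum((x - mean_i) ** 2 for x in item_scores) / n
    var_c = sum((y - mean_c) ** 2 for y in criterion_scores) / n
    if var_i == 0 or var_c == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    cov = sum((x - mean_i) * (y - mean_c)
              for x, y in zip(item_scores, criterion_scores)) / n
    s_item = math.sqrt(var_i)
    r_item_criterion = cov / (s_item * math.sqrt(var_c))
    return s_item * r_item_criterion

# Hypothetical data: 1 = item passed, 0 = item failed; criterion is any external score
index = item_validity_index([1, 1, 0, 0], [4, 3, 2, 1])
```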
Indication of how adequately an item separates or discriminates between high scorers and low scorers on an entire test; the higher the value, the more adequately the item discriminates
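One way to sketch this index, assuming the usual d = (U - L) / n form: U and L are the numbers of testtakers in the upper and lower scoring groups who answered the item correctly, and n is the size of each group. The 27% cutoff is a common convention, not something stated in these notes; the function name and data are hypothetical.

```python
def item_discrimination_index(total_scores, item_correct, fraction=0.27):
    """d = (U - L) / n for one item, using upper and lower scoring groups.

    total_scores: each testtaker's score on the entire test
    item_correct: 1 if that testtaker answered this item correctly, else 0
    fraction:     share of testtakers in each extreme group (assumed 27%)
    """
    if len(total_scores) != len(item_correct) or not total_scores:
        raise ValueError("need two equal-length, non-empty lists")
    n_people = len(total_scores)
    group_size = max(1, round(n_people * fraction))
    # Indices of testtakers ordered from lowest to highest total score
    order = sorted(range(n_people), key=lambda i: total_scores[i])
    lower = order[:group_size]
    upper = order[-group_size:]
    u = sum(item_correct[i] for i in upper)
    l = sum(item_correct[i] for i in lower)
    return (u - l) / group_size

# Hypothetical data for ten testtakers
d = item_discrimination_index(
    [10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
    [1, 1, 1, 1, 0, 1, 0, 0, 0, 0],
)
```

A value near +1 means the item is passed mostly by high scorers; a value near 0 or below suggests the item does not separate the groups.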
Preliminary questions in TEST CONCEPTUALIZATION: (p2)
What is the ideal format of the test?
Should more than one form of the test be developed?
What special training will be required of the test users for administering or interpreting the test?
What types of responses will be required of testtakers?
Who benefits from an administration of this test?
Is there any potential for harm as the result of an administration of this test?
How will meaning be attributed to scores on this test?
Preliminary questions in TEST CONCEPTUALIZATION:
What is the test designed to measure?
What is the objective of the test?
Is there a need for this test?
Who will use this test?
Who will take the test?
What content will the test cover?
How will the test be administered?
Scaling methods
• Assignment of numbers to responses so that a test score can be calculated
Rating scale
– a grouping of words, statements, or symbols on which judgments of the strength of a particular trait, attitude, or emotion are indicated by the testtaker.
Summative scale
– summing ratings across all the items to obtain the final test score
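A minimal sketch of summative scoring. The 1-to-5 rating range and the reverse-keying of negatively worded items are assumptions added for illustration; the notes only say that ratings are summed across items.

```python
def summative_score(ratings, reverse_keyed=(), scale_min=1, scale_max=5):
    """Sum item ratings into a final test score.

    reverse_keyed: positions of items whose ratings are flipped before
                   summing (an assumption; not every scale has such items)
    """
    total = 0
    for position, rating in enumerate(ratings):
        if not scale_min <= rating <= scale_max:
            raise ValueError(f"rating {rating} is outside the scale")
        if position in reverse_keyed:
            # Flip the rating: on a 1-5 scale, 1 becomes 5 and 2 becomes 4
            rating = scale_min + scale_max - rating
        total += rating
    return total

# Hypothetical four-item scale; the items at positions 1 and 3 are reverse-keyed
score = summative_score([4, 2, 5, 1], reverse_keyed={1, 3})
```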
Method of paired comparisons
– testtakers are presented with pairs of stimuli, which they are asked to compare; one stimulus of each pair is selected according to some rule
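A sketch of one simple way to tally paired comparisons: every pair of stimuli is presented once, and each stimulus is credited whenever it is the one selected. The `choose` callback stands in for the testtaker's selection rule and is hypothetical; other scoring conventions (for example, crediting agreement with judges' preferences) are possible.

```python
from collections import Counter
from itertools import combinations

def paired_comparison_scores(stimuli, choose):
    """Present every pair once and count how often each stimulus is selected.

    choose(a, b) must return either a or b.
    """
    wins = Counter({stimulus: 0 for stimulus in stimuli})
    for a, b in combinations(stimuli, 2):
        selected = choose(a, b)
        if selected not in (a, b):
            raise ValueError("choose() must return one member of the pair")
        wins[selected] += 1
    return dict(wins)

# Hypothetical rule: always select the alphabetically earlier stimulus
scores = paired_comparison_scores(["A", "B", "C"], lambda a, b: min(a, b))
```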
Comparative scaling
– entails judgments of a stimulus in comparison with every other stimulus on the scale
Categorical scaling
– stimuli are placed into one of two or more alternative categories that differ quantitatively with respect to some continuum
Guttman scale/Scalogram analysis
– items range sequentially from weaker to stronger expressions of the attitude, belief, or feeling being measured
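A small sketch of the logic behind a Guttman scale: with items ordered from the weakest to the strongest expression, a testtaker who endorses a stronger item is expected to have endorsed every weaker one. The function below only checks a single response pattern for that property; it is an illustration, not a full scalogram analysis.

```python
def is_guttman_consistent(responses):
    """Return True if the pattern fits a perfect Guttman ordering.

    responses: endorsements (truthy = agree) ordered from the weakest
               item to the strongest item
    """
    seen_rejection = False
    for endorsed in responses:
        if not endorsed:
            seen_rejection = True
        elif seen_rejection:
            # A stronger item was endorsed after a weaker one was rejected
            return False
    return True

# Hypothetical patterns for a five-item scale
consistent = is_guttman_consistent([1, 1, 1, 0, 0])
inconsistent = is_guttman_consistent([1, 0, 1, 0, 0])
```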
Matching item
• Testtaker is presented with two columns: premises on the left and responses on the right
• The task is to determine which response is best associated with which premise
Binary-choice item (true-false item)
• Takes the form of a sentence that requires the testtaker to indicate whether the statement is or is not a fact
*TYPES OF CONSTRUCTED-RESPONSE ITEMS
Completion item
– requires the examinee to provide a word or phrase that completes a sentence
Short-answer item
– requires a succinct response
Essay
– requires the testtaker to respond to a question by writing a composition, typically one that demonstrates recall of facts, understanding, analysis, and/or interpretation
*Scoring items
Cumulative model
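– credit accumulates across items with regard to a particular construct; the higher the total score, the higher the testtaker is presumed to be on the characteristic being measured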
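A minimal sketch of cumulative scoring, assuming items are scored against an answer key with one point of credit per keyed response. The key, the responses, and the function name are hypothetical.

```python
def cumulative_score(responses, answer_key):
    """Award one point for each response that matches the keyed answer."""
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must have the same length")
    return sum(1 for given, keyed in zip(responses, answer_key) if given == keyed)

# Hypothetical four-item true-false test
total = cumulative_score(["T", "F", "T", "T"], ["T", "T", "T", "F"])
```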