psych 4 test dev

Cards (70)

  • Test Development is an umbrella term for all that goes into the process of creating a test.
  • Test Conceptualization involves brainstorming ideas about what kind of test a developer wants to publish.
  • Questions to consider when conceptualizing a new test include: What is the test designed to measure? What is the objective? Is there a need for this kind of test? Who will use the test? Who will take the test? What content will the test cover? How will the test be administered? What is the ideal format of the test? Should more than one form of the test be developed? What special training will be required of test users for administering or interpreting the test? What types of responses will be required of test takers? Who benefits from an administration of this test? Is there potential harm?
  • Cumulative Scoring assumes that the higher the score on the test, the higher the test taker is on the ability that the test purports to measure.
  • Item branching is the ability of the computer to tailor the content and order of presentation of items on the basis of responses to previous items.
  • Item Difficulty is defined by the number of people who get a particular item correct.
  • Computerized Adaptive Testing refers to an interactive, computer-administered test-taking process wherein items presented to the test taker are based in part on the test taker’s performance on previous items.
  • Computerized adaptive testing, which draws items from item banks (relatively large and easily accessible collections of test questions), can reduce the number of test items that need to be administered by as much as 50% while simultaneously reducing measurement error by 50%.
  • In the Test Tryout, the test should be tried out on people who are similar in critical respects to the people for whom the test was designed.
  • Empirical Criterion Keying involves administering a large pool of test items to a sample of individuals who are known to differ on the construct being measured.
  • The larger the item-difficulty index, the easier the item.
  • Item Endorsement Index is used for personality testing.
  • For achievement testing, the optimal average item difficulty is approximately 50%.
  • Semantic Differential Rating Technique measures an individual's unique, perceived meaning of an object, a word, or an individual; ratings are usually made on scales anchored by bipolar adjectives (for example, good-bad).
  • A good test item is one that is answered correctly by high scorers on the test as a whole.
  • An informal rule of thumb for the test tryout is that there should be no fewer than 5 and preferably as many as 10 subjects for each item on the test.
  • Ipsative Scoring compares a test taker’s score on one scale within a test to another scale within that same test.
  • Pseudobulbar Affect is a neurological disorder characterized by frequent involuntary outbursts of laughing or crying that may or may not be appropriate to the situation.
  • Essay items allow for creative integration and expression of the material, but focus on a more limited area than can be covered in the same amount of time when using a series of selected-response items or completion items.
  • In computerized adaptive testing, the test administered may be different for each test taker, depending on that test taker's performance on the items presented.
  • The optimal average item difficulty for personality testing is approximately 70%.
  • Item Difficulty Index is the proportion of the total number of test takers who answered the item correctly.
  • Class Scoring or Category Scoring is when test taker responses earn credit toward placement in a particular class or category with other test takers whose pattern of responses is presumably similar in some way.
  • Comparative and Categorical Scaling involves making comparisons or placing items into categories.
  • Selected-Response Format requires testtakers to select a response from a set of alternative responses.
  • Method of Paired Comparisons produces ordinal data by presenting testtakers with pairs of stimuli, which they are asked to compare.
  • Comparative Scaling entails judgments of a stimulus in comparison with every other stimulus on the scale.
  • Guttman Scale yields ordinal-level measures.
  • Likert Scale is a scale for attitudes, usually reliable.
  • Age-based Scaling is used when the testtaker's test performance as a function of age is of critical interest.
  • Categorical Scaling involves placing stimuli into one of two or more alternative categories that differ quantitatively with respect to some continuum.
  • Item Format is the form, plan, structure, arrangement, and layout of individual test items.
  • Pilot Work/Pilot Study/Pilot Research refers to preliminary research surrounding the creation of a prototype of the test.
  • Completion Item requires the examinee to provide a word or phrase that completes a sentence.
  • Multidimensional Scaling presumes more than one dimension.
  • Item Pool is a reservoir or well from which the items will or will not be drawn for the final version of the test.
  • Constructed-Response Format requires testtakers to supply or to create the correct answer, not merely select it.
  • Multiple-Choice Format has three elements: stem (question), a correct option, and several incorrect alternatives (distractors or foils).
  • Scaling is the process of setting rules for assigning numbers in measurement.
  • Rating Scale is a grouping of words, statements, or symbols on which judgments of the strength of a particular trait are indicated by the testtaker.
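
The Item Difficulty Index cards above can be illustrated with a short calculation. The following is a minimal sketch in Python, assuming responses have already been scored as 1 (correct) or 0 (incorrect); the function name `item_difficulty` and the sample data are hypothetical and are not taken from the cards.

```python
def item_difficulty(responses):
    """Return, for each item, the proportion of test takers who answered it correctly."""
    n_takers = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_takers for i in range(n_items)]


# Hypothetical scored data: each row is one test taker, each column is one item.
# 1 = answered correctly, 0 = answered incorrectly.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
]

p_values = item_difficulty(responses)      # one index per item
average_p = sum(p_values) / len(p_values)  # average item difficulty for the test

for number, p in enumerate(p_values, start=1):
    print(f"Item {number}: p = {p:.2f}")
print(f"Average item difficulty: {average_p:.3f}")
```

In this made-up sample, item 1 has the largest index because every test taker answered it correctly, so it is the easiest item. Item 3 has the smallest index and is the hardest.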