PA #10

Cards (21)

  • Standards for Educational and Psychological Testing (SEPT)

    Definitive technical, professional and operational standards for all forms of assessments that are professionally developed and used in a variety of settings
  • Professional organizations that promulgated the SEPT standards
    • American Educational Research Association (AERA)
    • American Psychological Association (APA)
    • National Council on Measurement in Education (NCME)
  • The previous edition of the SEPT standards was published in 1999; the latest edition is from 2014
  • Parts of the SEPT standards

    • Part I: Foundations
    • Part II: Operations
    • Part III: Testing Applications
  • Test Construction

    1. Identify major objectives
    2. Identify population
    3. Indicate possible conditions and uses
    4. Planning of the test
    5. Item writing
    6. Preliminary tryout
    7. Proper tryout
    8. Final tryout
    9. Reliability
    10. Validity
    11. Norms
    12. Revising the test
    13. Publishing the test
  • Item
    A single question or task that is usually not broken down into smaller units
  • Characteristics of a good item

    • Clarity - no ambiguity
    • Moderately difficult
    • Discriminating power
    • To the point - measures only significant aspects
    • Does not encourage guesswork
    • Easy to read
    • Independent in its meaning
  • Preliminary tryout
    1. Find out inadequacies, weaknesses, omissions, ambiguities
    2. Difficulty level of each item
    3. Time limit of test
    4. Length of test
    5. Standard directions
  • Proper tryout

    1. Difficulty index
    2. Discrimination index
    3. Effectiveness of distractors
  • Final tryout
    Provides a final check on administration and time limit
  • After final tryout, expert opinion should be considered again
  • Revising the test

    1. Adding, deleting, modifying items
    2. Additional trial administrations
    3. Standardizing length, sequencing, administration, scoring
  • Standardized tests are revised over time due to obsolescence of norms, performance criteria, and test content
  • Publishing the test

    1. Writing instructions for test administrators
    2. Writing instructions for test takers
  • For a test to be published commercially, the process may take years and be repeated several times
  • The results must be psychometrically defensible in terms of item characteristics, validity, and reliability
  • Test Development

    1. Item development
    2. Pilot test
    3. Revision
    4. Standardization
    5. Finalization
  • Item Difficulty
    The proportion or percentage of examinees who answer the item correctly
  • Range of item difficulty

    • 0.00-0.09 very difficult
    • 0.10-0.24 difficult
    • 0.25-0.75 average
    • 0.76-0.90 easy
    • 0.91-1.00 very easy
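The difficulty index and the bands above can be sketched in a few lines of Python. This is a minimal illustration, not a standard implementation: the function names are hypothetical, responses are assumed to be scored 1 (correct) or 0 (incorrect), and values that fall between two listed bands (for example 0.095) are assigned using the lower edge of the next band as the cutoff, which is an assumption.

```python
def item_difficulty(responses):
    """Proportion of examinees who answered the item correctly.

    `responses` is a list of 1 (correct) and 0 (incorrect) scores.
    """
    if not responses:
        raise ValueError("at least one response is required")
    return sum(responses) / len(responses)


def classify_difficulty(p):
    """Map a difficulty index to the bands listed on the card.

    Assumption: gaps between bands (e.g. 0.09 to 0.10) are closed by
    treating the lower edge of the next band as the cutoff.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("difficulty index must lie between 0 and 1")
    if p < 0.10:
        return "very difficult"
    if p < 0.25:
        return "difficult"
    if p <= 0.75:
        return "average"
    if p <= 0.90:
        return "easy"
    return "very easy"


# Example: 3 of 4 examinees answered correctly.
p = item_difficulty([1, 1, 0, 1])
print(p, classify_difficulty(p))
```

Note that a higher index means an easier item, since it is the proportion answering correctly.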
  • Item Discrimination
    A measure of how well an item distinguishes between high and low performers
  • Ranges of item discrimination

    • 0.40-0.60 good items
    • 0.30-0.39 reasonably good items
    • 0.20-0.29 marginal items
    • Less than 0.20 poor items
    • 0.00 no discrimination - high and low performers do equally well
    • Negative discrimination - low performers do better than high performers
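One common way to obtain a discrimination index is the upper-lower group method: rank examinees by total test score, take the top and bottom groups, and subtract the proportion correct in the lower group from that in the upper group. The card does not say which method is intended, so the sketch below is an assumption, as are the function names, the 27% group size, and the treatment of values above 0.60 as "good".

```python
def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Upper-lower discrimination index D = p_upper - p_lower.

    item_scores:  1/0 scores on the item, one per examinee.
    total_scores: total test scores for the same examinees.
    fraction:     share of examinees in each extreme group (assumed 27%).
    """
    if len(item_scores) != len(total_scores):
        raise ValueError("item_scores and total_scores must align")
    n = len(total_scores)
    k = max(1, round(n * fraction))  # size of each extreme group
    if 2 * k > n:
        raise ValueError("not enough examinees for two separate groups")
    # Examinee indices ordered from highest to lowest total score.
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper, lower = order[:k], order[-k:]
    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper - p_lower


def classify_discrimination(d):
    """Map a discrimination index to the bands listed on the card."""
    d = round(d, 2)  # the bands are stated to two decimals
    if d < 0:
        return "negative discrimination"
    if d == 0:
        return "no discrimination"
    if d < 0.20:
        return "poor"
    if d < 0.30:
        return "marginal"
    if d < 0.40:
        return "reasonably good"
    return "good"  # assumption: values above 0.60 are also treated as good


totals = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
item = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
d = discrimination_index(item, totals)
print(round(d, 2), classify_discrimination(d))
```

With ten examinees and a 27% fraction, each extreme group holds three people, so small samples give coarse estimates.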