Psychological Assessment 2

    • Test
      A tool to measure a particular construct
    • Test development
      1. Test Conceptualization
      2. Test Construction
      3. Test Tryout
      4. Item Analysis
      5. Test Revision
    • Test Conceptualization
      • Test developers' idea of developing a tool to measure a particular construct
      • The stimulus for developing a test can be anything (e.g. emergence of a social phenomenon)
    • Pilot work
      • Preliminary research surrounding the creation of a prototype of the test
      • Involves the creation, revision, and deletion of test items
    • Test Construction
      1. Scaling
      2. Writing Items
      3. Item Formats
      4. Scoring Items
    • Scaling
      The process of setting rules for assigning numbers in measurement
    • Types of scales
      • Age scale
      • Grade scale
      • Stanine scale
    • Likert scale
      Used to scale attitudes; presents the test taker with five alternative responses on an agree/disagree or approve/disapprove continuum
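    • A minimal Python sketch (the item labels and responses are invented) of how Likert-type responses are commonly converted to numbers and summed into an attitude score:
        # Hypothetical mapping of the five response alternatives onto 1-5.
        scale = {
            "Strongly disagree": 1,
            "Disagree": 2,
            "Neither agree nor disagree": 3,
            "Agree": 4,
            "Strongly agree": 5,
        }

        # One test taker's responses to four attitude items (made-up data).
        responses = ["Agree", "Strongly agree", "Neither agree nor disagree", "Agree"]
        attitude_score = sum(scale[r] for r in responses)
        print(attitude_score)  # 4 + 5 + 3 + 4 = 16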
    • Item writing
      Considerations: range of content to cover, item formats to employ, number of items to write
    • For a standardized test, the first draft usually contains approximately twice the number of items that the final version will contain
    • Item formats
      • Selected response (multiple choice, matching, true/false)
      • Constructed response (completion, short answer, essay)
    • Scoring models
      • Cumulative model (higher score = higher ability)
      • Class model (placement in a particular class/category)
      • Ipsative scoring (comparison of a test taker's scores on different scales)
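    • A small Python sketch (all scores and scale names invented) contrasting cumulative scoring with ipsative scoring:
        # Cumulative model: sum the keyed responses; a higher total means a
        # higher standing on the ability or trait being measured.
        item_scores = [1, 0, 1, 1, 1]         # 1 = keyed/correct response (made up)
        cumulative_score = sum(item_scores)   # 4

        # Ipsative scoring: compare one test taker's scores on different scales
        # with each other rather than with other people's scores.
        scale_scores = {"dominance": 14, "affiliation": 9, "autonomy": 11}  # made up
        strongest_scale = max(scale_scores, key=scale_scores.get)           # "dominance"

        print(cumulative_score, strongest_scale)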
    • Test Tryout
      1. Test is tried out on the sample for which it is constructed
      2. Conditions should be as similar as possible to standardized test administration
    • Characteristics of a good test item
      • Valid and reliable
      • Discriminates among test takers (high scorers tend to get it right; low scorers tend to get it wrong)
    • Item Analysis
      1. Employs statistical procedures to select the best items from a pool of tryout items
      2. Considers the item-difficulty index, item-validity index, item-reliability index, and item-discrimination index
    • Item difficulty index
      Proportion of total test takers who answered the item correctly
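    • A minimal Python sketch (invented responses) of the item-difficulty index and one common item-discrimination index; the upper/lower-group method used here is a standard approach assumed for illustration, since the cards do not specify a formula:
        # 1 = answered the item correctly, 0 = answered incorrectly.
        # Test takers are ordered by total test score, highest first (made-up data).
        responses = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

        # Item-difficulty index: proportion of all test takers answering correctly.
        p = sum(responses) / len(responses)                      # 5/10 = 0.5

        # Discrimination index: proportion correct in the upper-scoring half
        # minus proportion correct in the lower-scoring half.
        upper, lower = responses[:5], responses[5:]
        d = sum(upper) / len(upper) - sum(lower) / len(lower)    # 0.8 - 0.2 = 0.6

        print(f"difficulty p = {p:.2f}, discrimination d = {d:.2f}")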
    • Item-reliability index
      Indication of the internal consistency of a test
    • Item-validity index
      Indication of the degree to which a test is measuring what it purports to measure
    • The process of developing a test occurs in five stages: Test Conceptualization, Test Construction, Test Tryout, Item Analysis, and Test Revision
    • Test Revision
      • Uses the information gathered at the item-analysis stage
      • Some items are eliminated; others are rewritten
      • Each item's strengths and weaknesses are characterized, and strengths and weaknesses are balanced across items
      • The revised test is administered under standardized conditions
      • Based on the item analysis, the test is then considered in its finished form
    • If many otherwise good items tend to be somewhat easy, the test developer may purposefully include some more difficult items
    • Having balanced all of these concerns, the test developer comes out of the revision stage with a test of improved quality
    • Forms of Reliability
      • Test-Retest Reliability
      • Parallel Forms Reliability
      • Inter-rater Reliability
      • Split-Half Reliability
    • Reliability
      Consistency of scores obtained by the same person when re-examined with the same test on different occasions, or with different sets of equivalent items, or under other variable examining conditions
    • Test-Retest Reliability
      • Comparing scores obtained from two successive measurements of the same individuals and calculating the correlation between the two sets of scores
      • Measures error associated with administering a test at two different times
      • Only applicable to stable traits
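    • A minimal Python sketch (invented scores; statistics.correlation requires Python 3.10+) of test-retest reliability as the correlation between two administrations of the same test:
        from statistics import correlation  # Pearson's r

        # Scores for the same six people on the same test, given on two occasions (made up).
        first_administration  = [12, 18, 25, 30, 22, 15]
        second_administration = [14, 17, 27, 29, 20, 16]

        # Test-retest reliability: correlation between the two sets of scores.
        r = correlation(first_administration, second_administration)
        print(f"test-retest r = {r:.2f}")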
    • Parallel Forms Reliability
      • At least two different versions of the test yield almost the same scores
      • Compares two equivalent forms of a test that measure the same attribute
    • Inter-rater Reliability
      Degree of agreement between two observers who simultaneously record measurements of the same behavior
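    • A minimal Python sketch (invented observations) of one simple way to express inter-rater reliability, the proportion of observations on which two observers agree (coefficients such as Cohen's kappa are also used, but are not shown here):
        # Two observers code the same five observation intervals (made-up data).
        rater_a = ["on-task", "off-task", "on-task", "on-task", "off-task"]
        rater_b = ["on-task", "off-task", "on-task", "off-task", "off-task"]

        # Percent agreement: proportion of intervals coded identically by both raters.
        agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
        print(f"percent agreement = {agreement:.0%}")  # 4 of 5 intervals agree = 80%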
    • Split-Half Reliability
      Obtained by splitting the items on a questionnaire or test in half, computing a separate score for each half, and then calculating the degree of consistency between the two scores for a group of participants
    • The test can be divided according to the odd- and even-numbered items (odd-even system)
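    • A minimal Python sketch (invented item scores; statistics.correlation requires Python 3.10+) of split-half reliability using the odd-even system; the final Spearman-Brown step, which estimates full-length reliability from the half-test correlation, is a standard addition not stated on the cards:
        from statistics import correlation  # Pearson's r

        # Item scores (1 = correct) for five test takers on a six-item test (made up).
        items = [
            [1, 1, 0, 1, 1, 0],
            [1, 0, 0, 1, 0, 0],
            [1, 1, 1, 1, 1, 1],
            [0, 1, 0, 0, 1, 0],
            [1, 1, 1, 0, 1, 1],
        ]

        # Odd-even split: score the odd-numbered and even-numbered items separately.
        odd_half  = [sum(row[0::2]) for row in items]   # items 1, 3, 5
        even_half = [sum(row[1::2]) for row in items]   # items 2, 4, 6

        r_half = correlation(odd_half, even_half)

        # Spearman-Brown correction: estimated reliability of the full-length test.
        r_full = (2 * r_half) / (1 + r_half)
        print(f"half-test r = {r_half:.2f}, full-test estimate = {r_full:.2f}")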
    • Validity
      Degree to which the measurement procedure measures the variable that it claims to measure (strength and usefulness)
    • Forms of Validity
      • Face Validity
      • Content Validity
      • Criterion Validity
      • Construct Validity
    • Face Validity
      Simplest and least scientific form of validity; demonstrated when a measurement appears, at face value (superficial appearance), to measure what it is supposed to measure
    • Content Validity
      • Concerned with the extent to which the test is representative of a defined body of content consisting of topics and processes
      • Evaluated not by statistical analysis but by the inspection of items by a panel of experts
    • Criterion Validity
      Involves the relationship or correlation between test scores and scores on some other measure that represents the criterion
    • Predictive Validity
      Demonstrated when scores obtained from a measure accurately predict behavior (the criterion) according to a theory
    • Concurrent Validity
      Established when the scores of a measure (predictor) are correlated with the scores of a different measure (criterion) taken at the same time
    • Construct Validity
      • Requires that the scores obtained from a measurement procedure behave exactly the same as the variable/construct itself
      • Based on many research studies that use the same measurement procedure and grows gradually as each new study contributes more evidence
    • Convergent Validity
      Involves comparing two different methods of measuring the same construct; demonstrated by a strong relationship between the scores obtained from the two methods
    • Divergent Validity
      • Refers to the demonstration of the uniqueness of the test
      • Effectively demonstrated when the test has a low correlation with measures of unrelated constructs
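    • A minimal Python sketch (all scores and measure names invented; statistics.correlation requires Python 3.10+) of how convergent and divergent validity evidence might be examined for a hypothetical new anxiety scale:
        from statistics import correlation  # Pearson's r

        new_anxiety_scale   = [10, 14, 8, 20, 16, 12]   # the test being validated
        other_anxiety_scale = [11, 15, 9, 19, 17, 13]   # different measure, same construct
        shoe_size           = [10, 8, 8, 9, 8, 11]      # measure of an unrelated construct

        # Convergent validity: expect a strong correlation with the same-construct measure.
        print(f"convergent r = {correlation(new_anxiety_scale, other_anxiety_scale):.2f}")

        # Divergent validity: expect a weak correlation with the unrelated measure.
        print(f"divergent  r = {correlation(new_anxiety_scale, shoe_size):.2f}")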