Cards (45)

  • Test
    Scientific routine method for recording one or more psychological features. Aim: a quantitative statement about the degree of individual feature expression
  • Test Theory
    Formal and statistical requirements that a test must meet so that the actual expression of the tested feature can be inferred from the test results
  • Content Area
    • Intelligence
    • Personality
    • Attitude
    • School Performance
    • Leadership Behavior
  • Type of Collection
    • Questionnaire (Intelligence Test, Team Climate Inventory)
    • Behavioral Observation (In-basket exercise in Assessment Center)
    • Projective Tests (Thematic Apperception Test, TAT)
    • Objective Test (Skin Conductance)
  • Speed Tests
    Limited time for task response (e.g. concentration tests)
  • Power Tests
    Tasks gradually become more difficult (e.g. intelligence)
  • Unidimensional Tests
    Recording one feature
  • Multidimensional Tests
    Recording several features (e.g. multiple scales in a personality test)
  • Method of Collection
    Paper & Pencil vs. Computer-based
  • Example item from Raven's Matrices
    • Multiple choice
  • D2 Test
    • Detecting attention and concentration deficits
    • Assessing executive function in clinical populations (e.g., those with ADHD, traumatic brain injury, stroke, dementia)
    • Research studies investigating attention and processing speed
    • Educational and occupational settings where sustained attention and focus are critical
  • D2 Test
    1. The test consists of the letters "d" and "p" arranged in 14 rows, each containing 57 (previously 47) characters, and marked with 1 to 4 lines above and/or below
    2. Your job is to cross out as many "d" letters marked with 2 lines as possible in each row within 20 seconds, without making any omission or substitution errors
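    The row-by-row performance described above is typically summarized as speed corrected for accuracy. A minimal sketch; the score names and formulas here are illustrative assumptions, not the official d2 manual's scoring:

    ```python
    def d2_scores(processed, omissions, commissions):
        """Illustrative scoring sketch (names/formulas are assumptions):
        total characters processed, corrected for the two error types
        (missed targets and wrongly crossed-out distractors)."""
        errors = omissions + commissions
        corrected_speed = processed - errors   # speed adjusted for errors
        error_rate = errors / processed        # share of errors among processed
        return corrected_speed, error_rate

    d2_scores(400, 12, 3)  # (385, 0.0375)
    ```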
  • Projective Tests
    • LM Grid for recording Achievement Motivation. Example items:
      • He feels comfortable with it.
      • He thinks, "If this is difficult, I'd rather continue another time."
      • He believes that he will manage it.
      • He thinks, "I am proud of myself because I can do this."
  • Item Analysis
    Looks at the properties of the measurements on a single item: Difficulty, Homogeneity (Variance), Discrimination index
  • Item Difficulty
    Expresses how difficult a task was for individuals in a sample in terms of the construct captured. The difficulty index is equal to the percentage share of the "correct" answers for this task in an analysis sample.
  • Item Difficulty Example
    • For responses to the item "I don't like myself" on a five-level scale, the mean value in a sample of 399 people is 0.79. The low mean (few people endorse the statement) makes this a rather difficult item.
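    The difficulty index can be computed directly from the response vector. A small sketch; the rating-scale variant (mean relative to the maximum score) is one common convention, not the only one:

    ```python
    def difficulty_dichotomous(responses):
        """p = share of correct answers (responses coded 0/1)."""
        return sum(responses) / len(responses)

    def difficulty_likert(responses, max_score):
        """One common convention for rating scales: mean response
        relative to the maximum attainable score."""
        return sum(responses) / (len(responses) * max_score)

    difficulty_dichotomous([1, 1, 0, 1, 0, 1, 0, 1, 1, 0])  # 0.6: 6 of 10 solved it
    ```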
  • Item Discrimination
    An item is discriminatory if it differentiates individuals with a higher characteristic expression (with a higher overall test score) from those with a lower expression of the characteristic. The item discrimination is described by the relationship (correlation) between the item value and the overall test score.
  • Example of Item Discrimination
    • Social Competence Scale, Item 7: "I am capable of adapting my behavior and becoming the person that the situation requires." Corrected discrimination r_i,t-i = .45 (uncorrected r_it = .62)
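    The corrected discrimination removes the item itself from the total score before correlating, so the item does not correlate with itself. A sketch with a hand-rolled Pearson correlation:

    ```python
    from statistics import mean

    def pearson_r(x, y):
        """Plain Pearson correlation, written out for transparency."""
        mx, my = mean(x), mean(y)
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sx = sum((a - mx) ** 2 for a in x) ** 0.5
        sy = sum((b - my) ** 2 for b in y) ** 0.5
        return cov / (sx * sy)

    def corrected_discrimination(item, totals):
        """r_i,t-i: correlate the item with (total score minus the item)."""
        rest = [t - i for i, t in zip(item, totals)]
        return pearson_r(item, rest)
    ```

    With real data the corrected value is typically lower than the uncorrected item-total correlation, as in the .45 vs. .62 example above.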
  • Item Difficulty and Discrimination
    Extreme item difficulties restrict the item variance, which limits the maximum possible correlation with the total score. The discriminatory power of very easy or very difficult items is therefore often lower.
  • Item Homogeneity (Variance)
    Item variance indicates how much the responses to an item vary (average squared deviation from the mean). Variance of 0: all people answer the item the same → no differentiation. The lower the variance, the lower the maximum/minimum possible correlation (between item value and overall test score).
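    Item variance is the average squared deviation of the responses from their mean; for dichotomous items it equals p(1 - p), which is maximal at p = .50, linking variance to the difficulty restriction above. A small sketch:

    ```python
    def item_variance(responses):
        """Average squared deviation from the mean (population variance)."""
        m = sum(responses) / len(responses)
        return sum((r - m) ** 2 for r in responses) / len(responses)

    item_variance([3, 3, 3, 3])   # 0.0 -> everyone answers alike, no differentiation
    item_variance([1, 2, 4, 5])   # 2.5 -> responses spread out
    ```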
  • Item Selection Strategies
    • Based on Difficulty: Range of difficulties: .20 < p < .80, Average difficulty p ~ .50
    • Based on Discriminative Power: Only expected (positive) direction, Should be >= .30
    • Based on Content Considerations: Removal despite positive statistics due to distant construct content, Retention despite poor statistics due to close construct content/representation
  • Item Selection Example
    • Do you agree that group behavior impacts individual decision-making? (Difficulty: .61, Discrimination: .65)
    • True or False: Society can't function without conformity to social norms. (Difficulty: .58, Discrimination: -.10)
    • Rate your agreement: Social psychology theories are more applicable to daily life than other psychology theories. (Difficulty: .25, Discrimination: .40)
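    The statistical selection criteria can be applied as a simple filter; content considerations (the third strategy) would then override the result item by item. A sketch using the three example items, with made-up item labels:

    ```python
    def select_items(stats, p_min=0.20, p_max=0.80, r_min=0.30):
        """Keep items whose difficulty p lies strictly between p_min and p_max
        and whose discrimination r is at least r_min (positive direction)."""
        return [name for name, (p, r) in stats.items()
                if p_min < p < p_max and r >= r_min]

    items = {
        "group_behavior": (0.61, 0.65),   # kept
        "conformity":     (0.58, -0.10),  # dropped: negative discrimination
        "applicability":  (0.25, 0.40),   # kept
    }
    select_items(items)  # ['group_behavior', 'applicability']
    ```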
  • Iterative process of test construction
    1. Formulation of items (more than planned)
    2. Data collection/pre-testing on a sample
    3. Calculation and analysis of item characteristics
    4. Selection of items
    5. New compilation of the test from the remaining items
    6. Data collection on a norm sample
    7. Confirmation of the test structure - Factor Analysis
  • Test Analysis: Assessment of Test Quality
    • Secondary Quality Criteria: Utility, Standardization, Practicability / Economy, Social validity
    • Primary Quality Criteria: Objectivity, Reliability, Validity
  • Discrimination index
    A measure of how well an item differentiates between high and low performers
  • The item's discrimination index of .40 is decent, suggesting that this item also differentiates effectively between high and low performers
  • Interim conclusion
    The item characteristics assist in the construction of a test
  • Secondary quality criteria
    • Utility
    • Standardization
    • Practicability / Economy
    • Social validity
  • Primary quality criteria
    • Objectivity
    • Reliability
    • Validity
  • Utility
    A test should improve prediction beyond existing knowledge
  • Standardization
    Creation of a reference system for interpreting test scores
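    Given the norm sample's mean and standard deviation, a raw score can be placed in the reference system, for example as a z-score or a T-score (T = 50 + 10z is the standard convention):

    ```python
    def z_score(raw, norm_mean, norm_sd):
        """Position of a raw score within the norm sample, in SD units."""
        return (raw - norm_mean) / norm_sd

    def t_score(raw, norm_mean, norm_sd):
        """T metric: mean 50, SD 10."""
        return 50 + 10 * z_score(raw, norm_mean, norm_sd)

    t_score(115, 100, 15)  # 60.0: one SD above the norm mean
    ```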
  • Objectivity
    The extent to which the test result is independent of the test user
  • Reliability
    The measurement accuracy of an instrument, i.e. its freedom from measurement error
  • Reliability estimation methods
    • Retest-reliability
    • Parallel test reliability
    • Split-half reliability
    • Cronbach's Alpha
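    Split-half reliability can be sketched as an odd-even split whose half-test correlation is stepped up with the Spearman-Brown formula (the halves are shorter than the full test, so the raw correlation underestimates the full test's reliability):

    ```python
    from statistics import correlation  # Python 3.10+

    def split_half_reliability(rows):
        """rows: one list of item scores per person.
        Odd-even split, correlate the two half-test scores,
        then apply Spearman-Brown: r = 2*r_hh / (1 + r_hh)."""
        odd = [sum(r[0::2]) for r in rows]
        even = [sum(r[1::2]) for r in rows]
        r_hh = correlation(odd, even)
        return 2 * r_hh / (1 + r_hh)
    ```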
  • Cronbach's Alpha
    Measures the internal consistency of a test or survey
  • Generally, a Cronbach's Alpha of 0.7 or above is considered acceptable in most research situations, indicating a reasonable level of internal consistency
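    Cronbach's Alpha follows directly from the item variances and the variance of the total score. A minimal sketch:

    ```python
    def variance(xs):
        """Average squared deviation from the mean."""
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    def cronbach_alpha(rows):
        """rows: one list of k item scores per person.
        alpha = k/(k-1) * (1 - sum of item variances / variance of total score)"""
        k = len(rows[0])
        item_vars = sum(variance(col) for col in zip(*rows))
        total_var = variance([sum(row) for row in rows])
        return k / (k - 1) * (1 - item_vars / total_var)
    ```

    When all items move in lockstep, the total-score variance dominates the summed item variances and alpha approaches 1.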
  • Reliability estimates for the Bochumer Inventory for Job-related Personality Description (BIP)
  • Validity
    The degree of accuracy with which the test measures what it is supposed to measure
  • Traditional facets of validity
    • Content validity
    • Criterion-related validity
    • Construct validity