chapter 4

Cards (27)

  • what is classical test score theory?
    it assumes that each person has a true score that would be obtained if there were no errors in measurement; because measurement is never perfect, the score observed for each person almost always differs from the person's true ability or characteristic (observed score = true score + error).
  • Low standard error of estimate means?
    high accuracy
  • what is domain sampling?
    considers the problems created by using a limited number of items to represent a larger and more complicated construct. (e.g., to evaluate spelling ability, a sample of words is used instead of every word; the proportion correct on the sample is only an estimate of your true score, which is the score you would get on the entire domain.)
  • greater number of items means?
    higher reliability; reliability can be estimated from the correlation of the observed test score with the true score. (you can NEVER see someone's true score!)
  • what is item response theory?
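    focuses on the difficulty of individual items instead of only the total score; it is the basis for computer adaptive testing (getting an item correct so the computer gives a slightly more difficult item, getting an item wrong so the computer gives a slightly easier item); because items are matched to the person's ability level, a reliable estimate can be reached with fewer items.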
  • what are different sources of error on tests?
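    situational factors; loud noises, room is hot or cold / time sampling; scores change from one administration to the next (evaluated with test retest). / item sampling; the particular items chosen for the test (evaluated with parallel forms; compare scores across different forms of the test). / internal consistency; examine how people perform on similar subsets of items from the same form of the measure.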
  • what is test retest?
    take a test; take the same test again at a later time; if the test is reliable the two sets of scores should be highly correlated. (only appropriate when measuring traits or characteristics that don't change over time.)
  • what are carry over effects?
    taking a test once; the knowledge from taking it the first time carries over to the second time.
  • carry over effects do what?
    inflate reliability; "can make a test look more reliable than it actually is." / can be reduced by increasing time intervals between tests so it decreases possibility of information carry over.
  • what are practice effects?
    a type of carry over effect; some skills improve with practice, so when the test is given the second time the score is better because the subject has 'practiced' by taking it the first time.
  • tests that measure constantly changing characteristics are?
    not appropriate for test retest method
  • what is the split half method?
    FOR ABILITY TESTS; splitting a test in half and comparing the results with each other (if they are measuring the same thing the two halves should correlate with each other).
  • in split half method; what does the spearman brown formula allow?
    allows you to estimate what the correlation between the two halves would have been if each half had been the length of the whole test. (increases estimate of reliability)
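
A minimal sketch of the Spearman-Brown correction described in the card above, assuming the standard form of the formula: corrected r = 2r / (1 + r), where r is the correlation between the two halves. The function name `spearman_brown` and the example value 0.70 are illustrative, not from the chapter. The optional `length_factor` argument uses the general version of the formula for a test made n times longer, which reduces to the split-half case when n = 2.

```python
def spearman_brown(r: float, length_factor: float = 2.0) -> float:
    """Estimate the reliability of a test that is `length_factor` times
    as long as the one that produced the correlation `r`.

    With length_factor = 2 this is the split-half correction:
    corrected = 2r / (1 + r).
    """
    return (length_factor * r) / (1.0 + (length_factor - 1.0) * r)


# Hypothetical example: the two halves of a test correlate at 0.70.
half_correlation = 0.70
corrected = spearman_brown(half_correlation)
print(round(corrected, 3))  # 0.824
```

The corrected value is higher than the half-test correlation because each half has only half as many items as the full test, and fewer items means lower reliability.
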
  • what is coefficient alpha?
    FOR PERSONALITY TESTS; no right or wrong answer (continuous items --> strongly disagree to strongly agree, 1 to 5 scales.) e.g., personality or attitude scales (attitude questionnaire) / the most general method of finding estimates of reliability through internal consistency.
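
A minimal sketch of how coefficient alpha can be computed, assuming the usual formula: alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `coefficient_alpha` and the ratings below are made up for illustration (five respondents answering four items on a 1 to 5 scale).

```python
from statistics import variance


def coefficient_alpha(scores):
    """Coefficient alpha for a list of respondents, where each
    respondent is a list of item ratings of equal length."""
    k = len(scores[0])                        # number of items
    item_columns = list(zip(*scores))         # one tuple of ratings per item
    item_variances = [variance(col) for col in item_columns]
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 1-to-5 ratings: rows are respondents, columns are items.
ratings = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
alpha = coefficient_alpha(ratings)
print(round(alpha, 3))  # 0.961
```

Alpha is high here because people who rate one item high tend to rate the other items high too, which is what internal consistency means.
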
  • what is internal consistency?
    the degree to which items on a test or measure are correlated with each other.
  • what do interrater, inter scorer, inter observer, or inter judge reliability all mean?
    consistency among different judges who are evaluating the same behavior.
  • in behavioral observation; observations are?
    observations are loaded with error --> some behaviors can be missed, not everyone classifies one behavior the same way.
  • what is inter observer agreement?
    how closely multiple judges agree when they record the same behavior; using multiple judges --> makes observing behavior more reliable.
  • what is percent agreement?
    the percentage of observations on which the observers agree about what the behavior is; it does not correct for agreement that would happen by chance.
  • what is the kappa statistic?
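    best method for assessing the level of agreement among several observers; it corrects for chance agreement (1 = perfect agreement, 0 = no better than chance, below 0 = worse than chance).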
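
A minimal sketch comparing percent agreement with kappa, assuming the two-observer version (Cohen's kappa): kappa = (observed agreement - chance agreement) / (1 - chance agreement). Designs with more than two observers need a different variant. The function names and the "on"/"off" task codings below are made up for illustration.

```python
from collections import Counter


def percent_agreement(a, b):
    """Proportion of observations that two observers coded the same way."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Two-observer kappa: agreement corrected for chance.
    Undefined (division by zero) if chance agreement equals 1."""
    n = len(a)
    observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: for each category, the product of the two
    # observers' proportions of using it, summed over categories.
    chance = sum(counts_a[c] * counts_b[c] for c in set(a) | set(b)) / (n * n)
    return (observed - chance) / (1 - chance)


# Hypothetical codings of 10 observation intervals by two observers.
observer_a = ["on", "on", "on", "on", "on", "on", "off", "off", "off", "off"]
observer_b = ["on", "on", "on", "on", "on", "off", "off", "off", "off", "on"]

print(round(percent_agreement(observer_a, observer_b), 2))  # 0.8
print(round(cohen_kappa(observer_a, observer_b), 2))        # 0.58
```

The observers agree on 80% of the intervals, but kappa is lower because about half of that agreement would be expected by chance alone.
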
  • what is standard error of measurement?
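    an estimate of how much an observed score is likely to differ from the true score; to calculate it we need to use the standard deviation and the reliability coefficient: SEM = standard deviation x square root of (1 - reliability). the larger the SEM, the less certain we can be about the accuracy of the score.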
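
A minimal sketch of the standard error of measurement, assuming the usual formula SEM = SD * sqrt(1 - r), where SD is the standard deviation of the test scores and r is the reliability coefficient. The function name and the numbers (SD of 15, reliability of 0.91, observed score of 100) are made up for illustration. The band uses 1.96 standard errors, the conventional multiplier for an approximate 95% interval.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical test: standard deviation 15, reliability 0.91.
sem = standard_error_of_measurement(15, 0.91)
print(round(sem, 2))  # 4.5

# Approximate 95% band around a hypothetical observed score of 100.
observed = 100
low = observed - 1.96 * sem
high = observed + 1.96 * sem
print(round(low, 2), round(high, 2))  # 91.18 108.82
```

A more reliable test gives a smaller SEM, so the band around the observed score gets narrower.
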
  • what is discriminability analysis?
    examine the correlation between each item and the total score for the test; form of item analysis.
  • what do you know about low reliability?
    it can be fixed; increase reliability by adding more items; make sure the test measures only one thing!; use factor analysis (data reduction, makes data manageable) and item-total correlations to find items that don't belong with the rest.
  • what is factor analysis?
    method of finding the minimum number of dimensions (characteristics, attributes), called factors (helps account for a large number of variables).
  • how do you find someones observed score?
    everyone has a true score on any test! observed score = true score +/- error
  • a test CAN'T be valid if it is?
    NOT RELIABLE
  • you CAN have reliability without validity but you?
    CAN'T have validity without reliability