It is an index of reliability, a proportion that indicates the ratio between the true score variance on a test and the total variance.
Reliability Coefficient
A statistic useful in describing sources of test score variability.
Variance
It is variance that arises from true differences.
True variance
It is variance that arises from irrelevant, random sources.
Error variance
Refers to the proportion of the total variance attributed to true variance.
Reliability
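The variance cards above can be tied together with a small numeric sketch. It assumes the classical model in which total variance is simply true variance plus error variance; the function name and the example values are illustrative, not taken from any particular test.

```python
def reliability_coefficient(true_variance, error_variance):
    """Proportion of total variance that is attributable to true variance.

    Assumes total variance = true variance + error variance.
    """
    total_variance = true_variance + error_variance
    if total_variance <= 0:
        raise ValueError("total variance must be positive")
    return true_variance / total_variance


# Hypothetical values: 8 units of true variance, 2 units of error variance.
r_xx = reliability_coefficient(true_variance=8.0, error_variance=2.0)
print(round(r_xx, 2))
```

The closer the ratio is to 1, the smaller the share of score variability that comes from error.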
Refers collectively to all of the factors associated with the process of measuring some variable, other than the variable being measured.
Measurement error
It is a source of error in measuring a targeted variable caused by unpredictable fluctuations and inconsistencies of other variables in the measurement process.
Random error
It refers to a source of error in measuring a variable that is typically constant or proportionate to what is presumed to be the true value of the variable being measured.
Systematic error
Terms that refer to variation among items within a test as well as to variation among items between tests.
Item sampling or content sampling
Sources of Error Variance:
Test Construction
Test Administration
Test Scoring and Interpretation
Other sources of error (surveys and polls are two tools of assessment commonly used by researchers who study public opinion)
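An estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test; an instrument with high reliability of this kind may be said to be stable over time.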
Test-Retest Reliability
When the interval between testings is greater than six months, the estimate of test-retest reliability is often referred to as the _____________.
Coefficient of stability
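A minimal sketch of a test-retest estimate, assuming it is computed as the Pearson correlation between scores from two administrations of the same test to the same people. The helper `pearson_r` and the score lists are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary on both administrations")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five test takers on two administrations.
time1 = [10, 12, 15, 18, 20]
time2 = [11, 13, 14, 19, 21]
test_retest_r = pearson_r(time1, time2)
print(round(test_retest_r, 2))
```

The same function could be applied to scores from two forms of a test to sketch a coefficient of equivalence.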
The degree of the relationship between various forms of a test can be evaluated by means of an alternate forms or parallel forms coefficient of reliability, which is often termed the _______________.
Coefficient of equivalence
Refers to an estimate of the extent to which item sampling and other errors have affected test scores on versions of the same test.
Parallel forms reliability
Refers to an estimate of the extent to which these different forms of the same test have been affected by item sampling error or other error.
Alternate forms reliability
This is obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once.
Split-half reliability
A formula used to estimate the reliability of a test that has been lengthened or shortened; it can also be used to determine the number of items needed to attain a desired level of reliability.
Spearman-Brown formula
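The formula can be sketched as follows, using its general form r_SB = n * r / (1 + (n - 1) * r), where r is the known reliability and n is the factor by which the test length changes. The function names and the example values are illustrative assumptions.

```python
def spearman_brown(r, n):
    """Estimated reliability after changing test length by a factor of n.

    r: reliability of the existing test (or half-test correlation)
    n: new length divided by old length (n = 2 for split-half correction)
    """
    return (n * r) / (1 + (n - 1) * r)


def length_factor(r_current, r_desired):
    """Factor by which the test must be lengthened to reach r_desired."""
    if not (0 < r_current < 1 and 0 < r_desired < 1):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    return (r_desired * (1 - r_current)) / (r_current * (1 - r_desired))


# Split-half use: the two halves correlate .60, so n = 2 for the whole test.
whole_test_r = spearman_brown(0.60, 2)

# Test-length use: how much longer must a test with r = .60 be to reach .90?
factor = length_factor(0.60, 0.90)
print(round(whole_test_r, 2), round(factor, 2))
```

Multiplying the current number of items by the factor gives the approximate number of comparable items needed.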
Refers to the degree of correlation among all the items on a scale.
Inter-item consistency
It is the degree to which a test measures a single trait.
Homogeneity
Describes the degree to which a test measures different factors.
Heterogeneity
The statistic of choice for determining the inter-item consistency of dichotomous items.
Kuder-Richardson formula 20 (KR-20)
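A minimal sketch of KR-20, assuming the usual form KR-20 = (k / (k - 1)) * (1 - sum(p * q) / variance of total scores), with the population variance (dividing by the number of test takers) so that it matches the p * q item variances. The data layout (one row per test taker, one 0/1 score per item) and the sample data are hypothetical.

```python
from statistics import pvariance


def kr20(scores):
    """KR-20 for dichotomous (0/1) items.

    scores: list of rows, one per test taker, each a list of 0/1 item scores.
    """
    n_people = len(scores)
    n_items = len(scores[0])
    if n_people < 2 or n_items < 2:
        raise ValueError("need at least 2 test takers and 2 items")
    totals = [sum(row) for row in scores]
    total_var = pvariance(totals)  # population variance of total scores
    if total_var == 0:
        raise ValueError("total scores must vary")
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in scores) / n_people  # proportion passing item j
        sum_pq += p * (1 - p)                          # q = 1 - p
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_var)


# Hypothetical data: five test takers, four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
kr20_value = kr20(responses)
print(round(kr20_value, 2))
```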
Preferred statistic for obtaining an estimate of internal consistency reliability.
Coefficient alpha
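Coefficient alpha can be sketched in the same way, assuming the common form alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores). The sample variance is used for both the items and the totals; the result does not depend on that choice as long as it is applied consistently. The rating data are hypothetical.

```python
from statistics import variance


def coefficient_alpha(scores):
    """Coefficient alpha for items scored on any numeric scale.

    scores: list of rows, one per test taker, each a list of item scores.
    """
    n_items = len(scores[0])
    if len(scores) < 2 or n_items < 2:
        raise ValueError("need at least 2 test takers and 2 items")
    item_columns = list(zip(*scores))  # regroup scores item by item
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores must vary")
    return (n_items / (n_items - 1)) * (1 - item_var_sum / total_var)


# Hypothetical data: five test takers, three items rated 1 to 5.
ratings = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 1, 2],
    [1, 1, 2],
]
alpha = coefficient_alpha(ratings)
print(round(alpha, 3))
```

Unlike KR-20, this calculation does not require the items to be dichotomous.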
It is a measure that focuses on the degree of difference that exists between item scores.
Average proportional distance method
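A sketch of the calculation for a single test taker, assuming the commonly described steps: take the absolute difference between scores for every pair of items, average those differences, then divide by the number of response options minus one. The function name and the example responses are hypothetical.

```python
from itertools import combinations


def apd_for_person(item_scores, n_options):
    """Average proportional distance for one test taker's item scores.

    item_scores: scores on items meant to measure the same thing
    n_options: number of response options on the rating scale
    """
    if len(item_scores) < 2 or n_options < 2:
        raise ValueError("need at least 2 items and 2 response options")
    pairs = list(combinations(item_scores, 2))
    mean_difference = sum(abs(a - b) for a, b in pairs) / len(pairs)
    return mean_difference / (n_options - 1)


# Hypothetical responses of 4, 5, and 6 on a 7-point scale.
apd = apd_for_person([4, 5, 6], n_options=7)
print(round(apd, 2))
```

Smaller values indicate item scores that lie closer together, which suggests higher internal consistency.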
It is the degree of agreement or consistency between two or more scorers with regard to a particular measure.
Inter-scorer reliability
A test in which the time limit is long enough to allow test takers to attempt all items, but some items are so difficult that no test taker is able to obtain a perfect score.
Power test
Generally contains items of uniform level of difficulty so that when given generous time limits, all test takers should be able to complete all the test items correctly.
Speed test
It is designed to provide an indication of where a test taker stands with respect to some variable or criterion.
Criterion-referenced test
This seeks to estimate the extent to which specific sources of variation under defined conditions are contributing to the test score.
Domain sampling theory
It is based on the idea that a person's test scores vary from testing to testing because of variables in the testing situation.
Generalizability theory
In the context of item response theory, this signifies the degree to which an item differentiates among people with higher or lower levels of the trait, ability, or whatever is being measured.
Discrimination
Test items that can be answered with only one of two alternative responses.
Dichotomous test items
Test items with three or more alternative responses, where only one is scored correct or scored as being consistent with a targeted trait or other construct.
Polytomous test items
It provides a measure of the precision of an observed test score.