To provide evidence regarding measurement properties of quantified variables.
Levels of measurement
Nominal
Ordinal
Interval
Ratio
Nominal
Lowest level; involves using numbers simply to categorize attributes. The numerical value is simply a placeholder
Ordinal
Ranks people on attributes; categories imply some sort of ranking
Interval
Ranks people on an attribute and specifies the distance between them; no true zero value (arbitrary zero)
Ratio
Highest level of measurement; has a meaningful zero and provides information about the absolute magnitude of the attribute
Descriptive statistics
Used to synthesize and describe data; provides a simple description and summary of the sample and observations
Parameter
Population value
Statistic
Sample value
Univariate
One variable
Symmetrical distribution
When folded over, the two halves of a frequency polygon would be superimposed
Normal distribution
Bell-shaped (normal) curve; symmetrical, unimodal, Gaussian
Asymmetrical distribution
(+) skew - longer tail points to the right
(-) skew - longer tail points to the left
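The direction of skew can be checked numerically. The sketch below uses a moment-based skewness coefficient, which is not part of these notes and is brought in only for illustration; the function name `skewness` and all data values are hypothetical. A positive result corresponds to a longer right tail, a negative result to a longer left tail, and zero to a symmetrical distribution.

```python
def skewness(values):
    """Moment-based skewness: third central moment / (second central moment ** 1.5)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical data, invented for illustration
right_tailed = [1, 2, 2, 3, 3, 3, 10]        # one extreme high value
left_tailed = [-x for x in right_tailed]     # mirror image: one extreme low value
symmetrical = [1, 2, 3, 4, 5]

print(skewness(right_tailed) > 0)   # (+) skew
print(skewness(left_tailed) < 0)    # (-) skew
print(skewness(symmetrical))        # no skew
```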
Central tendency
Provides an overall summary but does not clarify the patterns of data
Indexes of central tendency
Mode
Median
Mean
Mode
The numerical value that occurs most frequently; the most popular value
Median
Middle value of the ordered scores; it ignores the magnitude of individual values and is therefore insensitive to extremes
Mean
The sum of all values divided by the number of values (participants); the most stable index of central tendency
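The three indexes can be computed with Python's standard `statistics` module. This is a minimal sketch; the scores are hypothetical and chosen so that one extreme value shows how the mean and the median react differently.

```python
from statistics import mean, median, multimode

# Hypothetical scores, invented for illustration; 30 is an extreme value
scores = [2, 3, 3, 4, 5, 5, 5, 6, 30]

modes = multimode(scores)   # most frequently occurring value(s)
middle = median(scores)     # middle value of the sorted scores
average = mean(scores)      # sum of values / number of values

# Replace the extreme value with a typical one and recompute
tamer_scores = scores[:-1] + [7]
tamer_middle = median(tamer_scores)
tamer_average = mean(tamer_scores)

print(modes, middle, average)
print(tamer_middle, tamer_average)
```

With these made-up scores, replacing the extreme value 30 with 7 moves the mean from 7 to about 4.44, while the median stays at 5.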
Variability (dispersion)
How spread out the values are, i.e., how much they differ from one another and from the mean
Range
The highest score minus the lowest score in a distribution
Standard deviation
Captures the average degree to which the scores deviate from the mean; a small SD indicates a homogeneous dataset and a large SD a heterogeneous one
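Both indexes of variability take only a few lines to compute. The sketch below assumes the sample standard deviation (n - 1 in the denominator), which the notes do not specify; the scores are hypothetical.

```python
from statistics import stdev

# Hypothetical scores, invented for illustration
scores = [4, 8, 6, 5, 7]

score_range = max(scores) - min(scores)  # highest score minus lowest score
sd = stdev(scores)                       # sample SD (assumes an n - 1 denominator)

print(score_range, round(sd, 3))
```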
Bivariate
Two variables
Crosstabulations
A two-dimensional frequency distribution in which the frequencies of two variables are crosstabulated
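A crosstabulation can be built by counting how often each pair of categories occurs together. This is a minimal sketch using only the standard library; the two variables (sex and smoking status) and all observations are hypothetical.

```python
from collections import Counter

# Hypothetical paired observations: (sex, smoking status)
observations = [
    ("female", "smoker"),
    ("female", "nonsmoker"),
    ("female", "nonsmoker"),
    ("male", "smoker"),
    ("male", "smoker"),
    ("male", "nonsmoker"),
]

# Frequency of each (row category, column category) combination
table = Counter(observations)

rows = sorted({sex for sex, _ in observations})
cols = sorted({status for _, status in observations})

# Print one line per row category, with one count per column category
print("sex", cols)
for sex in rows:
    print(sex, [table[(sex, status)] for status in cols])
```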
Correlation
Used to describe the relationship between two variables - to what extent are the two variables related to each other
Pearson's r
The product-moment correlation coefficient; the most widely used correlation statistic, computed with continuous (interval- or ratio-level) measures
Spearman's Rho
A correlation index used for ordinal level data or when sample sizes are very small
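The two coefficients are closely related: Spearman's rho can be computed as Pearson's r applied to the ranks of the data. The sketch below implements both by hand under that assumption, with tied values given their average rank; the function names and the data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho computed as Pearson's r on the ranked data."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical data: y rises with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r, 3), round(rho, 3))
```

Because the made-up relationship is perfectly monotonic but not linear, rho equals 1 while r falls slightly below 1.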
Inferential statistics
Based on the laws of probability, provide a means for drawing inferences about a population, given data from a sample
Parameter estimation
Used to estimate a population parameter, e.g., a mean, a proportion, or a difference in means between two groups
Point estimation
Involves calculating a single statistic to estimate the parameter
Interval estimation
Provides a range of values within which the parameter has a specified probability of lying (dependent on the confidence level)
Confidence interval (CI)
An interval estimate constructed at a chosen confidence level
Hypothesis testing
Type I and Type II errors
Type I error
False positive; occurs if an investigator rejects a null hypothesis that is actually true and should have been accepted (retained)
Type II error
False negative; occurs if an investigator fails to reject a null hypothesis that is actually false and should have been rejected
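The two error types are the two ways a decision about the null hypothesis can disagree with the truth. The sketch below encodes that decision table; the function name `classify_decision` and its parameters are hypothetical, and in practice the true state of the null hypothesis is unknown.

```python
def classify_decision(null_is_true: bool, null_rejected: bool) -> str:
    """Label the outcome of a hypothesis test given the (normally unknown) truth."""
    if null_is_true and null_rejected:
        # Rejected a null hypothesis that is actually true
        return "Type I error (false positive)"
    if not null_is_true and not null_rejected:
        # Failed to reject a null hypothesis that is actually false
        return "Type II error (false negative)"
    return "correct decision"


# Walk through all four combinations of truth and decision
for null_is_true in (True, False):
    for null_rejected in (True, False):
        print(null_is_true, null_rejected, classify_decision(null_is_true, null_rejected))
```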
Confidence interval
Provides a range of values within which the parameter has a specified probability of lying (dependent on the confidence level)
Used in sampling computation
Based on the confidence level (the most common is the 95% confidence level); the same interval estimation used in parameter estimation can be applied
Parameter estimation
When the population SD is unknown, the interval estimate can be determined using Student's t-distribution
Sample problem
The mean age of a sample of 25 students is 18 years, and the standard deviation is 1.3 years. Find the interval estimate of the population mean using a 95% confidence level (critical t value = 2.064 for 24 degrees of freedom)
Degree of freedom
n - 1
Margin of error formula
E = t × s / √n, where E is the margin of error, t is the critical t statistic, s is the sample SD, and n is the sample size
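The sample problem can be worked through numerically. This sketch assumes the usual t-based margin of error, E = t × s / √n, and takes the critical value 2.064 as given in the problem rather than looking it up; the variable names are hypothetical.

```python
from math import sqrt

# Values stated in the sample problem
n = 25               # sample size
sample_mean = 18.0   # mean age in years
s = 1.3              # sample standard deviation in years
t_crit = 2.064       # critical t given in the problem (95% confidence level)

df = n - 1                          # degrees of freedom
margin = t_crit * s / sqrt(n)       # margin of error, E = t * s / sqrt(n)
lower = sample_mean - margin        # lower limit of the interval estimate
upper = sample_mean + margin        # upper limit of the interval estimate

print(df, round(margin, 2), round(lower, 2), round(upper, 2))
```

Under these assumptions there are 24 degrees of freedom, the margin of error is about 0.54 years, and the 95% interval estimate of the population mean age runs from about 17.46 to 18.54 years.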