Statistical method for estimating a population parameter from a sample
Topics covered
Review and Preview
Sampling
The Central Limit Theorem
Estimating a Population Proportion
Estimating a Population Mean
Estimating a Population Standard Deviation or Variance
Sampling frame
List of subjects in the population from which the sample is taken
Simple random sample
Each possible sample of that size has the same chance of being selected
Selecting a simple random sample
1. Number the subjects in the sampling frame
2. Generate a set of those numbers randomly
3. Sample the subjects whose numbers were generated
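The three steps above can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the function name `simple_random_sample`, the `seed` parameter, and the 500-subject frame are all hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Return n subjects drawn without replacement from the sampling frame."""
    rng = random.Random(seed)
    # Step 1: number the subjects 0 .. len(frame) - 1
    numbers = range(len(frame))
    # Step 2: generate n of those numbers at random, with no repeats
    chosen = rng.sample(numbers, n)
    # Step 3: sample the subjects whose numbers were generated
    return [frame[i] for i in chosen]

# Hypothetical sampling frame of 500 subjects
frame = [f"subject_{i}" for i in range(1, 501)]
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```

Because `random.sample` treats every subset of size n as equally likely, each possible sample of that size has the same chance of being selected.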
Bias
When results from the sample are not representative of the population
Types of bias
Undercoverage
Sampling bias
Nonresponse bias
Response bias
Convenience sample
A type of survey sample that is easy and relatively cheap to obtain
Volunteer sample
The most common type of convenience sample, in which subjects volunteer to take part
A simple random sample of 100 people is generally more trustworthy than a volunteer sample of thousands of people, because a larger sample does not remove the bias of self-selection
Steps in sampling
1. Identify the population
2. Construct a sampling frame
3. Use a random sampling design to select n subjects
4. Be cautious about sampling bias and other biases
Random sampling methods
Simple random sampling
Cluster random sampling
Stratified random sampling
Cluster random sampling
Divide the population into a large number of clusters, select a simple random sample of the clusters, use the subjects in those clusters as the sample
Stratified random sampling
Divide the population into separate groups (strata), select a simple random sample from each stratum
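The difference between cluster and stratified random sampling can be sketched as below. The function names and the city-block and class-year data are hypothetical, chosen only for illustration; both functions assume the groups are already given as a dictionary.

```python
import random

def cluster_sample(clusters, k, seed=None):
    """Pick k whole clusters at random and keep every subject in them."""
    rng = random.Random(seed)
    chosen_ids = rng.sample(sorted(clusters), k)
    return [subject for cid in chosen_ids for subject in clusters[cid]]

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a simple random sample of n_per_stratum subjects from every stratum."""
    rng = random.Random(seed)
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], n_per_stratum))
    return sample

# Hypothetical data: 20 city blocks of 5 households each
clusters = {f"block_{b}": [f"b{b}_house{h}" for h in range(5)] for b in range(20)}
# Hypothetical data: 4 class years of 30 students each
strata = {
    year: [f"{year}_{i}" for i in range(30)]
    for year in ("freshman", "sophomore", "junior", "senior")
}

by_cluster = cluster_sample(clusters, 3, seed=0)   # 3 blocks, all households
by_stratum = stratified_sample(strata, 5, seed=0)  # 5 students from each year
```

Cluster sampling randomizes which groups are used and then takes everyone in them, while stratified sampling uses every group and randomizes within each one.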
The Central Limit Theorem states that for a population with any distribution, the distribution of the sample means approaches a normal distribution as the sample size increases
Mean of the sample means
Equal to the population mean μ
Standard deviation of the sample means
Equal to σ/√n, where σ is the population standard deviation and n is the sample size
For samples of size n larger than 30, the distribution of the sample means can be approximated reasonably well by a normal distribution
If the original population is normally distributed, then for any sample size n, the sample means will be normally distributed
As the sample size increases, the sampling distribution of sample means approaches a normal distribution
As we proceed from n = 1 to n = 50
The distribution of sample means is approaching the shape of a normal distribution
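A small simulation can illustrate these properties. The sketch below assumes a strongly right-skewed population (exponential with mean 1 and standard deviation 1); that choice, the number of repetitions, and the helper names `sample_means` and `skewness` are assumptions made for illustration only.

```python
import math
import random
import statistics

MU, SIGMA = 1.0, 1.0  # mean and standard deviation of the assumed Exp(1) population

def sample_means(n, reps=4000, seed=0):
    """Means of `reps` random samples of size n from the exponential population."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

def skewness(values):
    """Moment-based skewness; a normal distribution has skewness 0."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

for n in (1, 5, 50):
    means = sample_means(n)
    print(
        f"n={n:>2}  mean={statistics.fmean(means):.3f}  "
        f"sd={statistics.stdev(means):.3f}  "
        f"sigma/sqrt(n)={SIGMA / math.sqrt(n):.3f}  "
        f"skewness={skewness(means):.2f}"
    )
```

In each row the mean of the sample means should stay close to μ, the standard deviation should track σ/√n, and the skewness should shrink toward 0 as n grows, which is the sense in which the shape approaches a normal distribution.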
Elevator capacity
Maximum capacity of 16 passengers with a total weight of 2500 lb
Male weights
Follow a normal distribution with a mean of 182.9 lb and a standard deviation of 40.8 lb
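If the elevator is filled to capacity with all males, the 2500 lb limit is exceeded whenever their mean weight is above 2500/16 = 156.25 lb; with σ/√n = 40.8/√16 = 10.2 lb this has a probability of about 0.9955, so there is a very good chance the safe weight capacity will be exceeded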
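The elevator calculation can be checked with the standard library's `NormalDist`. This is a sketch under the stated assumptions (16 randomly selected males, normally distributed weights); the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma = 182.9, 40.8   # male weights in lb, from the example above
n = 16                    # passengers at maximum capacity
limit_total = 2500        # safe total weight in lb

# The total exceeds 2500 lb exactly when the mean weight exceeds 2500 / 16
limit_mean = limit_total / n
# Standard deviation of the sample means: sigma / sqrt(n)
se = sigma / math.sqrt(n)
# z score of the limiting mean weight
z = (limit_mean - mu) / se
# Probability that the mean weight of 16 males is above the limit
p_exceed = 1 - NormalDist(mu, se).cdf(limit_mean)

print(f"limit mean = {limit_mean} lb, se = {se:.1f} lb, z = {z:.2f}")
print(f"P(overload) = {p_exceed:.4f}")
```

Because the population is normal, the sample means are normally distributed for any n, so the normal calculation applies even though n = 16 is below 30.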
Finite population correction factor
When sampling without replacement and the sample size n is greater than 5% of the finite population of size N, adjust the standard deviation of sample means by multiplying it by the finite population correction factor √((N − n)/(N − 1))
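A minimal sketch of the correction, assuming the usual factor √((N − n)/(N − 1)); the function name `sd_of_sample_means` and the example population sizes are hypothetical.

```python
import math

def sd_of_sample_means(sigma, n, N=None):
    """Standard deviation of sample means, corrected for a finite population.

    sigma: population standard deviation
    n:     sample size
    N:     population size, or None for an infinite (or very large) population
    """
    sd = sigma / math.sqrt(n)
    # Apply the correction only when n is more than 5% of N
    if N is not None and n > 0.05 * N:
        sd *= math.sqrt((N - n) / (N - 1))
    return sd

print(sd_of_sample_means(40.8, 16))          # no finite population given
print(sd_of_sample_means(40.8, 16, N=1000))  # n is 1.6% of N: no correction
print(sd_of_sample_means(40.8, 16, N=100))   # n is 16% of N: corrected
```

The factor is always below 1 when n > 1, so the correction can only shrink the standard deviation; it reaches 0 when the whole population is sampled.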
Point estimate
A single value (or point) used to approximate a population parameter
Sample proportion
The best point estimate of the population proportion
Confidence interval
A range (or an interval) of values used to estimate the true value of a population parameter
Confidence level
The probability 1-α (often expressed as the equivalent percentage value) that the confidence interval actually does contain the population parameter, assuming that the estimation process is repeated a large number of times
The correct interpretation of a confidence interval is that we are X% confident that the interval contains the true value of the population parameter
Critical value
The number on the borderline separating sample statistics that are likely to occur from those that are unlikely to occur
The z score separating the right-tail region is commonly denoted by zα/2 and is referred to as a critical value
Critical Values
90% confidence level, zα/2 = 1.645
95% confidence level, zα/2 = 1.96
99% confidence level, zα/2 = 2.575
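These critical values can be reproduced from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the standard library; the helper name `critical_value` is an assumption for illustration.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Return z_{alpha/2}, the z score with area alpha/2 in the right tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence level: z = {critical_value(level):.3f}")
```

The computed 99% value rounds to 2.576; the 2.575 listed above is the value commonly read from printed z tables, and the difference is negligible in practice.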
Margin of error
The maximum likely difference (with probability 1-α) between the observed proportion and the true value of the population proportion
The assumptions required for using the margin of error formula are: 1) simple random sample, 2) binomial distribution conditions satisfied, 3) at least 5 successes and 5 failures
Steps to find the margin of error and confidence interval
1. Verify assumptions
2. Find critical value zα/2
3. Evaluate margin of error E
4. Find confidence interval limits p̂ − E < p < p̂ + E
5. Round confidence interval limits to 3 significant digits
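The steps above can be sketched as a single function. It assumes the standard margin of error formula for a proportion, E = zα/2 · √(p̂q̂/n) with q̂ = 1 − p̂; the function name `proportion_interval` and the poll figures (520 successes out of 1000) are hypothetical. The simple random sample assumption cannot be checked in code and has to be verified from how the data were collected.

```python
import math
from statistics import NormalDist

def proportion_interval(x, n, confidence=0.95):
    """Return (p_hat, E, lower, upper) for x successes in n trials."""
    # Step 1: verify the success/failure assumption
    if x < 5 or n - x < 5:
        raise ValueError("need at least 5 successes and 5 failures")
    p_hat = x / n
    q_hat = 1 - p_hat
    # Step 2: critical value z_{alpha/2}
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 3: margin of error E
    E = z * math.sqrt(p_hat * q_hat / n)
    # Steps 4 and 5: interval limits, rounded to 3 significant digits
    lower = float(f"{p_hat - E:.3g}")
    upper = float(f"{p_hat + E:.3g}")
    return p_hat, E, lower, upper

# Hypothetical poll: 520 of 1000 respondents answer "yes"
p_hat, E, lower, upper = proportion_interval(520, 1000)
print(f"p_hat = {p_hat}, E = {E:.4f}, interval: {lower} < p < {upper}")
```

Rounding is done only at the end so that the margin of error is computed from the unrounded sample proportion.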
When analyzing polls, the key is to ensure the required assumptions are satisfied
Finding the margin of error
1. Use the formula E = zα/2 · √(p̂q̂/n), where q̂ = 1 − p̂
Margin of error (E)
The amount the sample percentage is likely to differ from the true population percentage