STAT603 CH4: Basic Statistical Inference

Created by

Lebo Lamola

Cards (165)

Descriptive statistics 
The collection, organisation, summarization and presentation of data
Inferential statistics 
Involves using samples to draw conclusions about a population and expressing the results in the language of probability
Population 
All items with a characteristic of interest (size N)
Census 
A study of all items in the population
Sample 
A subset of the population (size n)
Parameter 
A numerical descriptive measure computed from a population
Statistic 
A numerical descriptive measure computed from a sample
Statistical Inference 
Hypothesis Tests
Estimation
Point Estimate 
A single value that estimates the parameter
Interval Estimate 
A range of values that estimate the parameter, associated with some chance that the parameter lies in this interval
A sampling distribution arises when repeated samples of the same size are drawn from a particular population (distribution) and a statistic (numerical measure of description of sample data, e.g. a mean, variance or proportion) is calculated for each sample
The interest is then focused on the probability distribution (called the sampling distribution) of the statistic
Sampling distributions arise in the context of statistical inference i.e. when statements are made about a population on the basis of random samples drawn from it
The mean and variance of the sampling distribution of the sample mean (X-bar) are: E(X-bar) = μ and Var(X-bar) = σ^2/n
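The two results E(X-bar) = μ and Var(X-bar) = σ^2/n can be checked with a small simulation. The sketch below assumes a normal population with illustrative values μ = 10, σ = 4 and n = 25 (none of these come from the deck), draws many samples and compares the mean and variance of the resulting sample means with the theoretical values.

```python
import random
import statistics


def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma^2) and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]


# Hypothetical population values, chosen only for illustration.
mu, sigma, n = 10.0, 4.0, 25
means = simulate_sample_means(mu, sigma, n, reps=20_000)

mean_of_means = statistics.fmean(means)
variance_of_means = statistics.pvariance(means)

print("mean of sample means:    ", round(mean_of_means, 3), "(theory:", mu, ")")
print("variance of sample means:", round(variance_of_means, 3), "(theory:", sigma**2 / n, ")")
```

The simulated values should land close to μ and σ^2/n, with small differences due to simulation noise.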
Central Limit Theorem (CLT) 
If X1, X2, ..., Xn are a random sample of size n drawn from a population (with any distribution) with a population mean μ and variance σ^2, then for a sufficiently large n, the mean of the sample (X-bar) will be approximately normally distributed with a mean μ and a variance σ^2/n
The size of n needed depends on the distribution of the population: for a normal population, X-bar is exactly normal for any value of n; for an 'almost' normal population, a common rule of thumb is that n should be larger than 30; if the population is substantially different from normal, a much larger value of n will be needed for the approximation to hold
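One way to see the effect of n is to start from a clearly non-normal population and watch the sample means become more symmetric as n grows. The sketch below assumes an exponential population with rate 1 (an illustrative choice, not from the deck) and uses the skewness of the simulated sample means as a rough indicator of non-normality; a normal distribution has skewness 0.

```python
import random
import statistics


def skewness(values):
    """Population skewness: the mean of the cubed standardized values."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


def sample_mean_skewness(n, reps=10_000, seed=1):
    """Skewness of `reps` sample means, each from n exponential(1) draws."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]
    return skewness(means)


skew_by_n = {n: sample_mean_skewness(n) for n in (1, 5, 30, 100)}
for n, sk in skew_by_n.items():
    print(f"n={n:>3}  skewness of sample means = {sk:.2f}")
```

The skewness should shrink toward 0 as n increases, which is the CLT at work on a strongly skewed population.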
The basis of many statistical inference methods (hypothesis tests, confidence intervals, statistical models) is formed from the normal distribution, hence such methods require normality (an assumption is that the underlying population is normal)
When the assumption of normality is not met, these methods will not be accurate, and other methods such as non-parametric methods or machine learning methods that do not require normality can be considered
Interval estimate 
A range of values that estimate a parameter, associated with a percentage of confidence that the range will contain the parameter
An interval estimate is more appropriate and useful than a point estimate, since a point estimate can differ each time depending on the sample obtained
Point estimate 
A single value that estimates a parameter
Interval estimate 
A range of values from L (lower value) to U (upper value) that estimate a parameter
Confidence interval 
A range of values from L (lower value) to U (upper value) that estimates a population parameter θ with (1-α)100% confidence
Confidence interval example 
Mean service time of 1.637 minutes to 4.009 minutes
Population parameter θ 
Can be μ, σ^2 or p
Determining confidence interval for population mean μ (population variance σ^2 known)
1. Point estimate x-bar
2. Margin of error E = z(α/2) × σ/√n
3. Interval estimate (x-bar - E, x-bar + E)
A narrower confidence interval is more informative
Factors affecting width of confidence interval for μ
Critical value z(α/2) (based on α)
Standard error (σ/√n)
Calculating 95% and 99% confidence intervals
Given: σ=5, n=30, x-bar=498.5
95% CI: (496.71, 500.29)
99% CI: (496.15, 500.85)
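The two intervals above can be reproduced with a short calculation. This is a minimal sketch assuming the interval takes the usual form x-bar ± z(α/2) × σ/√n; the function name `z_confidence_interval` is made up for illustration, and the critical value comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence):
    """Confidence interval for mu when the population sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value z(alpha/2)
    margin = z * sigma / sqrt(n)             # margin of error E
    return x_bar - margin, x_bar + margin


intervals = {}
for level in (0.95, 0.99):
    lower, upper = z_confidence_interval(x_bar=498.5, sigma=5, n=30, confidence=level)
    intervals[level] = (lower, upper)
    print(f"{level:.0%} CI: ({lower:.2f}, {upper:.2f})")
# prints:
# 95% CI: (496.71, 500.29)
# 99% CI: (496.15, 500.85)
```

The 99% interval is wider than the 95% interval because a higher confidence level needs a larger critical value, while the standard error σ/√n stays the same.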
Confidence interval for μ (σ^2 unknown) 
x-bar ± t(α/2, n-1) × (s/√n), where s is the sample standard deviation and n-1 is the degrees of freedom
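As an illustration of the t-based interval, the sketch below computes a 95% interval from a small sample. The data are invented for the example, and the critical value 2.262 (upper 2.5% point of the t distribution with 9 degrees of freedom) is taken from a standard t table and passed in by hand, since Python's standard library has no t quantile function.

```python
from math import sqrt
from statistics import fmean, stdev


def t_confidence_interval(sample, t_critical):
    """Confidence interval for mu when sigma is unknown.

    `t_critical` is the table value t(alpha/2, n-1), supplied by the caller.
    """
    n = len(sample)
    x_bar = fmean(sample)
    s = stdev(sample)                 # sample standard deviation (divisor n-1)
    margin = t_critical * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical sample of n = 10 measurements.
sample = [2.1, 3.4, 2.8, 3.9, 2.5, 3.1, 2.7, 3.6, 3.0, 2.9]
T_025_DF9 = 2.262  # t table value for 95% confidence, 9 degrees of freedom

lower, upper = t_confidence_interval(sample, T_025_DF9)
print(f"95% CI for the mean: ({lower:.3f}, {upper:.3f})")
```

Statistical software would normally look up the critical value itself; supplying it as an argument keeps this sketch dependency-free.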
t distribution 
Bell-shaped, symmetric around a mean of 0, variance greater than 1 (heavier tails than the standard normal), approaches the standard normal distribution as the degrees of freedom increase
Most statistical software reports confidence intervals based on t-values
Interpreting confidence interval 
We are (1-α)100% confident that the population mean μ lies between L and U
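The confidence level describes how often the procedure captures μ over repeated sampling. The sketch below is a simulation under assumed values (normal population with μ = 50, σ = 8, n = 20, all chosen for illustration): it builds a 95% interval from each of many samples and reports the fraction of intervals that contain μ.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, confidence, reps, seed=2):
    """Fraction of known-sigma intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / reps


# Hypothetical population values, for illustration only.
rate = coverage(mu=50.0, sigma=8.0, n=20, confidence=0.95, reps=10_000)
print(f"fraction of intervals containing mu: {rate:.3f}")
```

The fraction should be close to 0.95; any single interval either contains μ or does not.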
Parameter 
The specific value (or range of values) of a population characteristic that is known/assumed
Statistical hypothesis 
An assertion (claim) made about the value(s) of a population parameter
The conclusion about the truth of a claim is not stated with absolute certainty, but rather in terms of the language of probability
Claims to be tested
A supermarket receives complaints that the mean content of "1 kilogram" sugar bags is less than 1 kilogram
An electrical firm claims the average lifetime of their light bulbs is more than 780 hours
A construction company believes the average compressive strength of its concrete is at the required level of 4000 psi
Null hypothesis (H0) 
A statement concerning the exact value of the population parameter of interest (θ) from the claim that is made
Alternative hypothesis (H1) 
A statement concerning the possible range of values of the population parameter θ that is believed to be true if H0 is not true
Null and alternative hypotheses for the examples
Example 1: H0: μ = 1, H1: μ < 1
Example 2: H0: μ = 780, H1: μ > 780
Example 3: H0: μ = 4000, H1: μ ≠ 4000
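The direction of H1 determines which tail of the distribution counts as evidence against H0. The cards shown here do not give a test procedure, so the sketch below assumes a standard one-sample z test with known σ; the function name `z_test` and the sample figures for Example 1 (x-bar = 0.99 kg, σ = 0.02 kg, n = 36) are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu0, sigma, n, alternative):
    """One-sample z test of H0: mu = mu0, assuming sigma is known.

    alternative: "less" (H1: mu < mu0), "greater" (H1: mu > mu0),
                 or "two-sided" (H1: mu != mu0).
    Returns the test statistic z and the p-value.
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf(z)
    if alternative == "less":
        p_value = cdf
    elif alternative == "greater":
        p_value = 1 - cdf
    elif alternative == "two-sided":
        p_value = 2 * min(cdf, 1 - cdf)
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, p_value


# Example 1 (H0: mu = 1, H1: mu < 1) with hypothetical sample figures.
z, p_value = z_test(x_bar=0.99, mu0=1.0, sigma=0.02, n=36, alternative="less")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

With these made-up figures the p-value is small, so the sample would count as evidence against H0 in favour of μ < 1. Examples 2 and 3 would use "greater" and "two-sided" respectively.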
