A combination of the methods of descriptive statistics and those of probability to describe and analyze probability distributions
Probability distributions
Describe what will probably happen rather than what actually did happen; they are often given as a graph, table, or formula
Random variable
A variable (typically represented by X) that has a single numerical value, determined by chance, for each outcome of a procedure
Discrete random variable
Has either a finite number of values or a countable number of values, where "countable" means that there may be infinitely many values, but they result from a counting process
Continuous random variable
Has infinitely many values, and those values can be associated with measurements on a continuous scale without gaps or interruptions
Probability distribution
The sum of all probabilities must be 1
Each probability value must be between 0 and 1 inclusive
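The two requirements above can be checked mechanically. The following is a minimal sketch; the function name `is_valid_distribution`, the tolerance `tol`, and the coin-toss table are illustrative assumptions, not part of the course notes.

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Check both requirements: each P(x) in [0, 1] and the sum of all P(x) equal to 1."""
    each_in_range = all(0.0 <= p <= 1.0 for p in probs)
    # Compare with a small tolerance because floating-point sums are rarely exact.
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return each_in_range and sums_to_one

# Hypothetical table: number of heads in two fair coin tosses.
print(is_valid_distribution([0.25, 0.5, 0.25]))  # True
print(is_valid_distribution([0.5, 0.6]))         # False: probabilities sum to 1.1
```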
Probability histogram
Very similar to a relative frequency histogram, but the vertical scale shows probabilities
Calculating mean, variance and standard deviation of a probability distribution
1. Mean: μ = E(x) = ∑[xP(x)]
2. Variance: σ^2 = ∑[(x-μ)^2 P(x)]
3. Standard deviation: σ = √(σ^2)
According to the range rule of thumb, most values should lie within 2 standard deviations of the mean: minimum usual value = μ - 2σ, maximum usual value = μ + 2σ
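The three formulas and the 2-standard-deviation range can be sketched in code as follows. The function name `summarize` and the coin-toss table are assumptions made for illustration only.

```python
import math

def summarize(dist):
    """dist maps each value x to P(x). Returns (mean, variance, standard deviation)."""
    mean = sum(x * p for x, p in dist.items())                     # sum of x * P(x)
    variance = sum((x - mean) ** 2 * p for x, p in dist.items())   # sum of (x - mean)^2 * P(x)
    return mean, variance, math.sqrt(variance)

# Hypothetical table: number of heads in two fair coin tosses.
dist = {0: 0.25, 1: 0.5, 2: 0.25}
mu, var, sd = summarize(dist)
print(mu, var)                    # 1.0 0.5
print(mu - 2 * sd, mu + 2 * sd)   # range within 2 standard deviations of the mean
```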
Unusually high/low values
Unusually high: x successes among n trials is unusually high if P(x or more) ≤ 0.05
Unusually low: x successes among n trials is unusually low if P(x or fewer) ≤ 0.05
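A small sketch of the two 0.05 tests, applied to a probability table. The function names and the five-toss table are hypothetical and chosen only to illustrate the tail sums.

```python
def is_unusually_high(dist, x, cutoff=0.05):
    """True if P(x or more) <= cutoff, where dist maps value -> probability."""
    tail = sum(p for value, p in dist.items() if value >= x)
    return tail <= cutoff

def is_unusually_low(dist, x, cutoff=0.05):
    """True if P(x or fewer) <= cutoff, where dist maps value -> probability."""
    tail = sum(p for value, p in dist.items() if value <= x)
    return tail <= cutoff

# Hypothetical table: number of heads in 5 fair coin tosses.
heads = {k: c / 32 for k, c in enumerate([1, 5, 10, 10, 5, 1])}
print(is_unusually_high(heads, 5))  # True: P(5 or more) = 1/32 = 0.03125
print(is_unusually_high(heads, 4))  # False: P(4 or more) = 6/32 = 0.1875
print(is_unusually_low(heads, 0))   # True: P(0 or fewer) = 1/32 = 0.03125
```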
Binomial probability distributions
The procedure has a fixed number of trials
The trials must be independent
Each trial must have all outcomes classified into two categories (success and failure)
The probability of a success remains the same in all trials
p and q
p = probability of success, q = probability of failure (q = 1-p)
Finding binomial probabilities
1. Using the binomial probability formula: P(x) = [n!/((n-x)! x!)] p^x q^(n-x), for x = 0, 1, . . ., n
2. Using technology (mathematical/statistical software, spreadsheets, calculators)
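As one example of the technology route, the formula can be evaluated directly in Python. This is a sketch, assuming Python 3.8 or later for `math.comb`; the function name `binomial_pmf` and the values n = 5, p = 0.5 are illustrative assumptions.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x) = [n! / ((n-x)! x!)] * p^x * q^(n-x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# Hypothetical case: exactly 2 successes in 5 trials with p = 0.5.
print(binomial_pmf(2, 5, 0.5))                          # 0.3125
# The probabilities over x = 0..n should add up to 1.
print(sum(binomial_pmf(x, 5, 0.5) for x in range(6)))   # 1.0
```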
When sampling without replacement, treat the trials as independent if the sample size n is less than 5% of the population size N (n < 0.05N)
Binomial distribution parameters
Mean: μ = np
Variance: σ^2 = npq
Standard deviation: σ = √(npq)
Poisson probability distributions
Used for describing the behaviour of rare events: events with relatively low probabilities of occurrence
SIS 1037Y
2020/2021
95% of school leavers want to join UoM. A group consists of 12 randomly selected school leavers.
With n = 12 and p = 0.95, μ = np = 11.4 and σ = √(npq) ≈ 0.755, so the maximum usual value is μ + 2σ ≈ 12.9 (about 13). Because 12 is below this limit, it is not unusual for everyone in the group to want to join UoM.
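The school-leaver example can be checked numerically with the binomial parameters μ = np and σ = √(npq). This is a sketch; the helper name `binomial_usual_range` is an assumption, while n = 12 and p = 0.95 come from the example itself.

```python
import math

def binomial_usual_range(n, p):
    """Return (mu, sigma, min_usual, max_usual) using mu = np and sigma = sqrt(npq)."""
    q = 1 - p
    mu = n * p
    sigma = math.sqrt(n * p * q)
    return mu, sigma, mu - 2 * sigma, mu + 2 * sigma

mu, sigma, min_usual, max_usual = binomial_usual_range(12, 0.95)
print(round(mu, 2), round(sigma, 3), round(max_usual, 2))  # 11.4 0.755 12.91

# All 12 wanting to join lies below the maximum usual value, so it is not unusual.
print(12 <= max_usual)  # True
```

The probability that all 12 want to join is 0.95^12, roughly 0.54, which is far above the 0.05 cutoff and agrees with the conclusion.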
Topics
Probability Distributions
Binomial Probability Distributions
Parameters for Binomial Distributions
Poisson Probability Distributions
The Standard Normal Distribution
Applications of Normal Distributions
Sampling Distributions and Estimators
Assessing Normality
Normal as Approximation to Binomial and Poisson
Poisson distribution
A discrete probability distribution which is often used for describing the behaviour of rare events: events with small probabilities
Poisson distribution
1. The random variable x is the number of occurrences of the event in an interval
2. The interval can be time, distance, area, volume, or some similar unit
3. P(x) = μ^x e^(-μ) / x!, where e ≈ 2.71828
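The Poisson formula, P(x) equal to μ to the power x, times e to the power -μ, divided by x factorial, translates almost directly into code. A minimal sketch follows; the function name `poisson_pmf` and the mean of 2 occurrences per interval are hypothetical.

```python
import math

def poisson_pmf(x, mu):
    """P(x) = mu^x * e^(-mu) / x! for x = 0, 1, 2, ..."""
    return mu**x * math.exp(-mu) / math.factorial(x)

# Hypothetical mean of 2 occurrences per interval.
print(round(poisson_pmf(0, 2.0), 4))   # 0.1353, which is e^(-2)
# There is no upper limit on x, but the probabilities still sum to 1;
# the first 50 terms are already enough to see this numerically.
print(round(sum(poisson_pmf(x, 2.0) for x in range(50)), 6))  # 1.0
```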
Parameter λ
Used in place of μ in some texts to denote the mean of a Poisson distribution
Poisson distribution
The random variable x is the number of occurrences of an event over some interval
The occurrences must be random
The occurrences must be independent of each other
The occurrences must be uniformly distributed over the interval being used
Mean μ
Mean number of occurrences of the event over the interval
Variance σ^2
Equal to μ in Poisson distribution
Standard deviation σ
Equal to √μ in Poisson distribution
Binomial distribution
Affected by the sample size n and the probability p
Poisson distribution
Affected only by the mean μ
In a binomial distribution the possible values of the random variable x are 0, 1, . . ., n, but a Poisson distribution has possible x values of 0, 1, 2, . . . , with no upper limit.
Assume that a Poisson distribution is a suitable model for 530 cyclones occurring over 100 years.
Mean μ
μ = number of cyclones / number of years = 530/100 = 5.3 cyclones per year
P(2) = 5.3^2 e^(-5.3) / 2! = 0.0701
P(0) = 5.3^0 e^(-5.3) / 0!
P(1) = 5.3^1 e^(-5.3) / 1!
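The cyclone probabilities can be evaluated with a few lines of Python. This is a sketch under the stated Poisson assumption; the function name `poisson_pmf` is illustrative, and the figures 530 and 100 are taken from the example.

```python
import math

def poisson_pmf(x, mu):
    """P(x) = mu^x * e^(-mu) / x!"""
    return mu**x * math.exp(-mu) / math.factorial(x)

mu = 530 / 100  # mean number of cyclones per year = 5.3

for x in (0, 1, 2):
    print(f"P({x}) = {poisson_pmf(x, mu):.4f}")
# P(0) = 0.0050
# P(1) = 0.0265
# P(2) = 0.0701
```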
The Poisson distribution is sometimes used to approximate the binomial distribution when n is large and p is small, with μ = np. The larger the n and the smaller the p, the better the approximation.
Rule of Thumb to Use the Poisson to Approximate the Binomial
n ≥ 100 and np ≤ 10
The approximation is good when p < 0.05 and n > 20
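To see how close the approximation can be, the two distributions can be compared side by side. This is a sketch; the values n = 1000 and p = 0.005 are hypothetical, chosen so that n is large and p is small, and the Poisson mean is taken as μ = np.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, mu):
    """Poisson probability of x occurrences with mean mu."""
    return mu**x * exp(-mu) / factorial(x)

n, p = 1000, 0.005   # hypothetical values: large n, small p
mu = n * p           # Poisson mean matched to the binomial mean np = 5

for x in range(0, 11):
    b = binomial_pmf(x, n, p)
    po = poisson_pmf(x, mu)
    print(f"x={x:2d}  binomial={b:.5f}  poisson={po:.5f}  diff={abs(b - po):.5f}")
```

For these values the two columns differ by well under 0.002 at every x shown.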