Statistics

Cards (56)

  • Population
    The whole set of items that are of interest
  • Census
    Observes or measures every member of a population
  • Sample
    A selection of observations taken from a subset of the population which is used to find out information about the population as a whole
  • Sampling frame
    A list in which the sampling units of a population are individually named or numbered
  • Qualitative data
    Variables or data associated with non-numerical observations
  • Quantitative data
    Variables or data associated with numerical observations
  • Continuous variable
    A variable that can take any value in a given range
  • Discrete variable
    A variable that can take only specific values in a given range
  • Name the types of random sampling
    1. Simple random sampling
    2. Systematic sampling
    3. Stratified sampling
  • Name the types of non-random sampling
    1. Opportunity sampling
    2. Quota sampling
  • Interpolation
    Making an estimate of the value of 'y' within the range of given data
  • Extrapolation
    Making an estimate of the value of 'y' outside the range of given data
  • What are the conditions for which X can be modelled as a binomial distribution B(n,p)
    1. Fixed number of trials, n
    2. Two possible outcomes (success or failure)
    3. Fixed probability of success, p
    4. Trials are independent of each other
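
When the four conditions above hold, probabilities for X ~ B(n, p) follow from the standard binomial formula P(X = k) = C(n, k) × p^k × (1 − p)^(n − k). A minimal Python sketch of that formula; the function names are illustrative and not taken from the cards:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_cdf(k, n, p):
    """P(X <= k), found by adding the individual probabilities."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

# Example: exactly 2 heads in 4 tosses of a fair coin.
print(binomial_pmf(2, 4, 0.5))  # 0.375
```
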
  • Bivariate data
    Data which has pairs of values for two variables
  • What is the addition rule for probability?
    P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
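
The formula on this card can be checked by counting outcomes. The sketch below assumes one roll of a fair six-sided die; the events A and B are made up for illustration:

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (illustrative example).
outcomes = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # "roll is even"
B = {4, 5, 6}   # "roll is greater than 3"

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

lhs = prob(A | B)                       # P(A or B), counted directly
rhs = prob(A) + prob(B) - prob(A & B)   # P(A) + P(B) - P(A and B)
print(lhs, rhs)  # 2/3 2/3
```

Subtracting P(A and B) stops the outcomes 4 and 6, which lie in both events, from being counted twice.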
  • Independent events
    When one event has no effect on the other
  • Mutually exclusive events
    When two events cannot occur at the same time/ have no outcome in common
  • Random variable
    A variable whose value depends on the outcome of a random event. Random variables are denoted by capital letters, e.g. X, Y, A, B
  • Sample space
    The range of values that a random variable can take
  • Discrete uniform distribution
    When there's an equal chance for all outcomes, e.g. a fair six-sided die
  • Null hypothesis
    H0, the hypothesis that we assume to be correct
  • Alternative hypothesis
    H1, the hypothesis that describes the parameter if the assumption in H0 is shown to be wrong
  • Critical region
    The range of values of a test statistic "X" that would lead you to reject the null hypothesis
  • Critical values
    The boundary values of the critical region
  • Acceptance region
    The range of values of the test statistic for which we do not reject the null hypothesis
  • Under what circumstance do we reject the null hypothesis?
    If the p-value is lower than the significance level
  • Under what circumstance do we accept (i.e. fail to reject) the null hypothesis?
    If the p-value is greater than the significance level
  • Actual significance level
    The probability of incorrectly rejecting the null hypothesis; calculated by adding the probabilities within the critical region
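
The critical region, critical value and actual significance level can be found by adding tail probabilities under H0. This is a minimal sketch that assumes a one-tailed (upper tail) test on a binomial test statistic; the function names and the example numbers are illustrative only:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_critical_region(n, p, alpha):
    """Return (critical value c, actual significance level) for the
    largest region X >= c whose probability under H0 is at most alpha.
    Returns (None, 0.0) if even X = n is too likely."""
    tail = 0.0
    critical_value, actual_level = None, 0.0
    for c in range(n, -1, -1):          # grow the tail downwards from X = n
        tail += binomial_pmf(c, n, p)
        if tail > alpha:                # region would exceed the significance level
            break
        critical_value, actual_level = c, tail
    return critical_value, actual_level

# Example: H0: p = 0.5, H1: p > 0.5, n = 10, 5% significance level.
c, level = upper_critical_region(10, 0.5, 0.05)
print(c, level)  # 9 0.0107421875
```

Here the critical region is X ≥ 9, and the actual significance level (about 1.07%) is lower than the nominal 5% because X is discrete.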
  • Product moment correlation coefficient (PMCC)
    Describes the linear correlation between two variables; takes values between -1 and 1
  • What are the meanings of different PMCC values?
    • r = 1; perfect positive linear correlation
    • r = -1; perfect negative linear correlation
    • r = 0; no linear correlation
    • |r| ≥ 0.8 (r ≥ 0.8 or r ≤ -0.8); strong correlation
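
As a rough illustration of how r is obtained from bivariate data, the sketch below uses the standard formula r = Sxy / √(Sxx × Syy). The function name and the data are made up for the example:

```python
from math import sqrt

def pmcc(xs, ys):
    """Product moment correlation coefficient of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

print(pmcc([1, 2, 3], [2, 4, 6]))  # 1.0  (points lie exactly on a rising line)
print(pmcc([1, 2, 3], [6, 4, 2]))  # -1.0 (points lie exactly on a falling line)
```
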
  • What is meant by P(B|A)?
    The probability that B occurs given that A has already occurred
  • For independent events what is P(A|B) and P(B|A)?
    • P(A|B) = P(A|B') = P(A)
    • P(B|A) = P(B|A') = P(B)
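
These identities can be checked by counting, using P(A|B) = P(A ∩ B) / P(B). The sketch assumes two rolls of a fair die; the events A and B are illustrative choices, not from the cards:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling a fair die twice.
space = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event), where event is a function returning True/False for an outcome."""
    return Fraction(sum(1 for o in space if event(o)), len(space))

def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b)."""
    return prob(lambda o: a(o) and b(o)) / prob(b)

A = lambda o: o[0] == 6          # first roll is a 6
B = lambda o: o[1] % 2 == 0      # second roll is even
not_B = lambda o: not B(o)

print(cond_prob(A, B), cond_prob(A, not_B), prob(A))  # 1/6 1/6 1/6
```
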
  • Under what conditions can you approximate binomial as normal?
    • If n is large
    • If p is close to 0.5
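
To see how close the approximation is when both conditions hold, the sketch below compares an exact binomial probability with the normal approximation N(np, np(1 − p)). The continuity correction (using k + 0.5) is a standard companion to this approximation but is an assumption here, since the cards do not mention it; the function names are illustrative:

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Approximate P(X <= k) using Y ~ N(np, np(1 - p)) with a
    continuity correction, i.e. P(Y < k + 0.5)."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(k + 0.5)

# n is large and p = 0.5, so the two values should be close.
exact = binomial_cdf(55, 100, 0.5)
approx = normal_approx_cdf(55, 100, 0.5)
print(round(exact, 3), round(approx, 3))
```
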
  • Characteristics of a normal distribution
    1. Parameters: μ, the mean, and σ², the variance
    2. Symmetrical about the mean, so mean = median = mode
    3. Total area under curve = 1
    4. Points of inflection at μ ± σ
    5. P(X = a) = 0 for any a; true for any continuous distribution
    6. Bell-shaped curve with the horizontal axis as an asymptote at either end
  • Stratified sampling
    Divide the population into homogeneous groups (strata) and randomly select samples from each group
    1. Calculate the number of people/items you need from each stratum using the formula: number sampled in stratum = (number in stratum ÷ number in population) × overall sample size
    2. Allocate each person a unique number
    3. Carry out a simple random sample for each stratum
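
The three steps above can be sketched in Python as follows. The function name, the strata and the `seed` parameter are illustrative assumptions; rounding each stratum's share may make the total differ slightly from the requested sample size when the shares are not whole numbers:

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """strata maps a stratum name to the list of its members.
    Returns a dict mapping each stratum name to its random sample."""
    rng = random.Random(seed)
    population = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # number in stratum / number in population x overall sample size
        k = round(len(members) * sample_size / population)
        # simple random sample (without replacement) within the stratum
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: 120 students in Year 12 and 80 in Year 13.
strata = {
    "Year 12": [f"Y12-{i}" for i in range(1, 121)],
    "Year 13": [f"Y13-{i}" for i in range(1, 81)],
}
result = stratified_sample(strata, 20, seed=1)
print({name: len(chosen) for name, chosen in result.items()})
# {'Year 12': 12, 'Year 13': 8}
```
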
  • Simple random sampling
    A sample of size "n" where every sample of size "n" has an equal chance of being selected
    1. Form a sampling frame (obtain a list of items/people)
    2. Assign each item/person a unique number
    3. Using a random number generator, select n of these
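
A minimal sketch of these steps, assuming the sampling frame is simply the numbers 1 to 200 and using Python's `random` module as the random number generator. The function name and the `seed` parameter are illustrative:

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Select n distinct items from the sampling frame, with every
    possible sample of size n equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

frame = list(range(1, 201))               # steps 1-2: numbered sampling frame
sample = simple_random_sample(frame, 10)  # step 3: select n = 10 of them
print(sorted(sample))
```
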
  • Systematic sampling
    The required elements are chosen at regular intervals from an ordered list
    1. Form a sampling frame and calculate how many people/items are needed in the sample
    2. Allocate each person/item a unique number
    3. Calculate the interval k = population size ÷ sample size, then randomly select a starting number from 1 to k (e.g. 1-10) using a random number generator
    4. Starting from that number, select every kth person/item (e.g. every 10th) until the required number has been selected
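
The procedure above can be sketched as below. It assumes the interval is taken as population size ÷ sample size, rounded down; the function name and the `seed` parameter are illustrative:

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start within the first interval, then take every
    k-th item from the ordered sampling frame until n are selected."""
    rng = random.Random(seed)
    k = len(frame) // n              # sampling interval
    start = rng.randint(1, k)        # random start from 1 to k inclusive
    return frame[start - 1::k][:n]

frame = list(range(1, 101))          # ordered list numbered 1 to 100
sample = systematic_sample(frame, 10)
print(sample)                        # e.g. start 4 gives 4, 14, 24, ..., 94
```
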
  • Opportunity sampling
    Consists of taking the sample from people who are available at the time the study is carried out and who fit the criteria you are looking for
  • Quota sampling
    An interviewer or researcher selects a sample that reflects the characteristics of the whole population
  • Advantages of a census
    It should give a completely accurate result