Statistics

    • Population
      The whole set of items that are of interest
    • Census
      Observes or measures every member of a population
    • Sample
      A selection of observations taken from a subset of the population, used to find out information about the population as a whole
    • Sampling frame
      A list of the sampling units of a population, each individually named or numbered
    • Qualitative data
      Variables or data associated with non-numerical observations
    • Quantitative data
      Variables or data associated with numerical observations
    • Continuous variable
      A variable that can take any value in a given range
    • Discrete variable
      A variable that can take only specific values in a given range
    • Name the types of random sampling
      1. Simple random sampling
      2. Systematic sampling
      3. Stratified sampling
    • Name the types of non-random sampling
      1. Opportunity sampling
      2. Quota sampling
    • Interpolation
      Making an estimate of the value of 'y' within the range of given data
    • Extrapolation
      Making an estimate of the value of 'y' outside the range of given data
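      A minimal sketch of the distinction, assuming the estimate of y is made at some x-value and that the observed x-values are known. The function name and data are made up for illustration.

```python
def classify_estimate(x_new: float, xs: list[float]) -> str:
    """Label an estimate made at x_new relative to the observed x-values."""
    if min(xs) <= x_new <= max(xs):
        return "interpolation"   # inside the range of the given data
    return "extrapolation"       # outside the range of the given data

observed_x = [2.0, 3.5, 5.0, 7.5, 9.0]  # hypothetical data
print(classify_estimate(6.0, observed_x))   # interpolation
print(classify_estimate(12.0, observed_x))  # extrapolation
```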
    • What are the conditions under which X can be modelled by a binomial distribution B(n, p)?
      1. Fixed number of trials, n
      2. Two possible outcomes (success or failure)
      3. Fixed probability of success, p
      4. Trials are independent of each other
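      When these four conditions hold, P(X = k) = C(n, k) × p^k × (1 − p)^(n − k). A short sketch of that formula follows; the function names are illustrative, not from any particular library.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p): k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k), found by adding the individual probabilities."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

# Example: 4 tosses of a fair coin, X = number of heads
print(binomial_pmf(2, 4, 0.5))  # 0.375
print(binomial_cdf(4, 4, 0.5))  # 1.0
```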
    • Bivariate data
      Data which has pairs of values for two variables
    • What is the addition rule for probability?
      P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
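      The formula above can be checked on a small example. This sketch assumes one roll of a fair six-sided die, with A = "even" and B = "greater than 3" chosen purely for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # roll is even
B = {4, 5, 6}   # roll is greater than 3

def prob(event: set) -> Fraction:
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

lhs = prob(A | B)                       # P(A or B), union of the two sets
rhs = prob(A) + prob(B) - prob(A & B)   # subtract the overlap counted twice
print(lhs, rhs)  # 2/3 2/3
```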
    • Independent events
      When the occurrence of one event has no effect on the probability of the other; P(A ∩ B) = P(A) × P(B)
    • Mutually exclusive events
      When two events cannot occur at the same time (they have no outcomes in common)
    • Random variable
      A variable whose value depends on the outcome of a random event. They are denoted by capital letters e.g. X, Y, A, B
    • Sample space
      The range of values that a random variable can take
    • Discrete uniform distribution
      When there's an equal chance for all outcomes, e.g. the score on a fair six-sided die
    • Null hypothesis
      H₀: the hypothesis that we assume to be correct
    • Alternative hypothesis
      H₁: the hypothesis that tells you about the parameter if your assumption is shown to be wrong
    • Critical region
      The range of values of a test statistic "X" that would lead you to reject the null hypothesis
    • Critical values
      The boundary values of the critical region
    • Acceptance region
      The range of values of the test statistic for which we accept (do not reject) the null hypothesis
    • Under what circumstances do we reject the null hypothesis?
      If the p-value is lower than the significance level
    • Under what circumstances do we accept (fail to reject) the null hypothesis?
      If the p-value is greater than the significance level
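      A sketch of this decision rule for a one-tailed binomial test. It assumes H0: p = p0 against H1: p > p0, so the p-value is P(X ≥ observed value) under H0; the numbers and function names are illustrative.

```python
from math import comb

def upper_tail_p_value(x_obs: int, n: int, p0: float) -> float:
    """P(X >= x_obs) for X ~ B(n, p0), i.e. the p-value when H1 is p > p0."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k)
               for k in range(x_obs, n + 1))

def decide(p_value: float, significance_level: float) -> str:
    """Compare the p-value with the significance level."""
    if p_value < significance_level:
        return "reject H0"
    return "do not reject H0"

# Hypothetical test: 15 successes observed in 20 trials, H0: p = 0.5, 5% level
p_val = upper_tail_p_value(15, 20, 0.5)
print(round(p_val, 4), decide(p_val, 0.05))
```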
    • Actual significance level
      The probability of incorrectly rejecting the null hypothesis; calculated by adding the probabilities, under H₀, of the values in the critical region
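      A sketch of how the critical region and actual significance level might be found for a discrete test statistic. It assumes X ~ B(n, p0) under H0 and an upper-tail test (H1: p > p0); the function names are made up for this example.

```python
from math import comb

def pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def upper_critical_region(n: int, p0: float, alpha: float):
    """Return (critical value c, actual significance level P(X >= c)).

    c is the smallest value with P(X >= c) <= alpha under H0.
    Returns (None, 0.0) if even X = n is too likely to be in the region.
    """
    tail = 0.0
    critical_value = None
    for c in range(n, -1, -1):          # grow the region from the top down
        new_tail = tail + pmf(c, n, p0)
        if new_tail > alpha:            # adding c would exceed the target level
            break
        tail = new_tail
        critical_value = c
    return critical_value, tail

# Hypothetical test: n = 20, H0: p = 0.5, target significance level 5%
c, actual_level = upper_critical_region(20, 0.5, 0.05)
print(c, round(actual_level, 4))  # 15 0.0207
```

      The actual significance level (about 2.07% here) is below the 5% target because X can only take whole-number values.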
    • Product moment correlation coefficient (PMCC)
      Describes the linear correlation between two variables; takes values between -1 and 1
    • What are the meanings of different PMCC values?
      • r = 1; perfect positive correlation
      • r = -1; perfect negative correlation
      • r = 0; no linear correlation
      • |r| ≥ 0.8 (r ≥ 0.8 or r ≤ -0.8); strong correlation
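      A sketch of the calculation behind r, using the standard formula r = Sxy / √(Sxx × Syy). The data sets are invented to show the three reference values.

```python
from math import sqrt

def pmcc(xs: list[float], ys: list[float]) -> float:
    """Product moment correlation coefficient of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

xs = [1, 2, 3, 4]
print(pmcc(xs, [2, 4, 6, 8]))    # 1.0  (perfect positive)
print(pmcc(xs, [8, 6, 4, 2]))    # -1.0 (perfect negative)
print(pmcc(xs, [1, -1, -1, 1]))  # 0.0  (no linear correlation)
```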
    • What is meant by P(B|A)?
      The probability that B occurs given that A has already occurred
    • For independent events, what are P(A|B) and P(B|A)?
      • P(A|B) = P(A|B') = P(A)
      • P(B|A) = P(B|A') = P(B)
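      These identities can be checked by listing outcomes. The sketch below assumes two fair dice, with A = "first die is even" and B = "second die shows 6" as illustrative independent events.

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes

def prob(event) -> Fraction:
    """P(event), where event is a function outcome -> bool."""
    return Fraction(sum(1 for o in space if event(o)), len(space))

def cond(a, b) -> Fraction:
    """P(a | b) = P(a and b) / P(b)."""
    return prob(lambda o: a(o) and b(o)) / prob(b)

A = lambda o: o[0] % 2 == 0   # first die is even
B = lambda o: o[1] == 6       # second die shows 6
not_B = lambda o: not B(o)

print(cond(A, B), cond(A, not_B), prob(A))  # 1/2 1/2 1/2
```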
    • Under what conditions can you approximate binomial as normal?
      • If n is large
      • If p is close to 0.5
    • Characteristics of a normal distribution
      1. Parameters: μ, the mean, and σ², the variance
      2. Symmetrical about the mean; mean = median = mode
      3. Total area under the curve = 1
      4. Points of inflection at μ ± σ
      5. P(X = a) = 0 for any a; this is true for any continuous distribution
      6. Bell-shaped curve with the horizontal axis as an asymptote at both ends
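      Some of these properties can be checked numerically. The sketch below uses Python's statistics.NormalDist with illustrative parameters μ = 10 and σ = 2, and estimates the curvature of the density with a finite difference.

```python
from statistics import NormalDist

mu, sigma = 10.0, 2.0          # illustrative values
X = NormalDist(mu, sigma)

# Symmetry: the density is the same at equal distances either side of the mean
symmetric = abs(X.pdf(mu - 1.3) - X.pdf(mu + 1.3)) < 1e-12

# Mean = median: half of the area lies below mu
area_below_mean = X.cdf(mu)

def curvature(x: float, h: float = 1e-3) -> float:
    """Finite-difference estimate of the second derivative of the density."""
    return (X.pdf(x + h) - 2 * X.pdf(x) + X.pdf(x - h)) / h**2

# The curve bends downwards inside mu ± sigma and upwards outside,
# so the points of inflection are at mu ± sigma
inside = curvature(mu + 0.5 * sigma)    # negative
outside = curvature(mu + 1.5 * sigma)   # positive
print(symmetric, area_below_mean, inside < 0 < outside)  # True 0.5 True
```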
    • Stratified sampling
      Divide the population into homogeneous groups (strata) and take a random sample from each group
      1. Calculate the number of people/items you need from each stratum using the formula: number sampled in stratum = (number in stratum / number in population) × overall sample size
      2. Allocate each person a unique number
      3. Carry out a simple random sample for each stratum
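      The steps above can be sketched as follows. The strata names and sizes are invented, and rounding each allocation is an assumption: with awkward proportions the rounded numbers may not add up exactly to the overall sample size.

```python
import random

def stratified_sample(strata: dict, sample_size: int, seed=None) -> dict:
    """Take a proportional simple random sample from each stratum."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # (number in stratum / number in population) x overall sample size
        k = round(len(members) / population_size * sample_size)
        sample[name] = rng.sample(members, k)   # without replacement
    return sample

# Hypothetical population: 60 students in one year group, 40 in another
strata = {
    "year_12": [f"Y12-{i}" for i in range(60)],
    "year_13": [f"Y13-{i}" for i in range(40)],
}
chosen = stratified_sample(strata, sample_size=10, seed=1)
print({name: len(picked) for name, picked in chosen.items()})
# {'year_12': 6, 'year_13': 4}
```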
    • Simple random sampling
      A sample of size "n" where every sample of size "n" has an equal chance of being selected
      1. Form a sampling frame (obtain a list of items/people)
      2. Assign each item/person a unique number
      3. Using a random number generator, select n of these
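      A minimal sketch of these three steps using Python's random module. The sampling frame here is a made-up list of names.

```python
import random

def simple_random_sample(frame: list, n: int, seed=None) -> list:
    """Select n items so that every sample of size n is equally likely."""
    rng = random.Random(seed)
    numbered = dict(enumerate(frame, start=1))         # step 2: unique numbers
    chosen_numbers = rng.sample(sorted(numbered), n)   # step 3: no repeats
    return [numbered[i] for i in chosen_numbers]

frame = ["Asha", "Ben", "Chloe", "Dev", "Ella", "Finn", "Gita", "Hugo"]  # step 1
picked = simple_random_sample(frame, 3, seed=42)
print(picked)
```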
    • Systematic sampling
      The required elements are chosen at regular intervals from an ordered list
      1. Form a sampling frame and calculate how many people or items are needed in the sample
      2. Allocate each person/item a unique number
      3. Randomly select a starting number from a given or calculated range (e.g. 1-10) using a random number generator
      4. Then select every 10th item (for example) after that until the required number of people/items has been selected
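      A sketch of the procedure, assuming the interval is taken as population size ÷ sample size (rounded down) and the frame is already in order. The frame of 100 numbered items is illustrative.

```python
import random

def systematic_sample(frame: list, n: int, seed=None) -> list:
    """Pick a random start, then every k-th item from an ordered list."""
    rng = random.Random(seed)
    interval = len(frame) // n        # e.g. 100 items, sample of 10 -> every 10th
    start = rng.randrange(interval)   # random position within the first interval
    return [frame[start + i * interval] for i in range(n)]

frame = list(range(1, 101))           # items numbered 1 to 100
sample = systematic_sample(frame, 10, seed=0)
print(sample)
```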
    • Opportunity sampling
      Consists of taking the sample from people who are available at the time the study is carried out and who fit the criteria you are looking for
    • Quota sampling
      An interviewer or researcher selects a sample that reflects the characteristics of the whole population
    • Advantages of a census
      It should give a completely accurate result