Statistics

Cards (42)

  • Census
    Observes or measures the entire population
  • Population
    The whole set of items that are of interest
  • Sample
    A subset of the population, chosen to represent it
  • Census : Advantages and Disadvantages
    + should give a completely accurate result
    -time consuming
    -expensive
    -cannot be done if testing destroys product
    -hard to process large quantities of data
  • Sample : Advantages and Disadvantages
    +Less time consuming and less expensive than a census
    +fewer people have to respond
    +less to process
    -less accurate
    -may not be large enough to give information about small groups within the population.
  • Sampling Unit
    an individual unit of a population
  • Sampling Frame
    A list in which every sampling unit in the population is individually named or numbered
  • 3 Types of Random sampling
    • Simple Random sampling
    • Systematic sampling
    • Stratified sampling
  • Process of Simple Random sampling?
    1. Create a sampling frame for the population, numbering every unit from 1 to N, and use 1 to N as the range of a RNG.
    2. Generate a number and select the corresponding unit.
    3. If a number is repeated, ignore it and pick again.
    4. Repeat until the desired sample size is achieved.
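    The steps above can be sketched in Python. This is a minimal illustration, assuming the sampling frame is a list numbered 1 to N and that Python's standard random module plays the role of the RNG. The function name and the 100-unit frame are hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Pick n distinct units from a sampling frame numbered 1..N."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed the size of the sampling frame")
    rng = random.Random(seed)
    chosen = []  # positions already selected
    while len(chosen) < n:
        position = rng.randint(1, len(frame))  # RNG over the range 1..N
        if position in chosen:
            continue  # repeated number: ignore it and pick again
        chosen.append(position)
    return [frame[p - 1] for p in chosen]

# Hypothetical frame of 100 named units
frame = [f"unit {i}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=1)
```

    Passing a seed only makes the sketch repeatable; leave it out for a fresh sample each time.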
  • Simple Random sampling: Advantages and Disadvantages
    +Free of bias
    +Cheap and easy for small populations
    +Each sampling unit has an equal chance of selection.
    -Not suitable for large populations
    -Sampling a large population this way can be time consuming and expensive
    -Sampling frame required.
  • Process of Systematic sampling?
    1. Create a sampling frame
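    2. Calculate a regular interval from the population size and the required sample size, e.g. for a sample of 20 from a population of 100, 100/20 = 5, so pick every 5th unit.
    3. Use a RNG to choose the first unit at random from the first interval (here, a number from 1 to 5).
    4. Select every 5th unit after the first until the required sample size is reached.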
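    A minimal Python sketch of systematic sampling, assuming the frame is a list numbered 1 to N and that the random start is drawn from the first interval (the usual convention), after which every kth unit is taken. The function name and the frame of 100 numbers are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Take every k-th unit from the frame after a random start in 1..k."""
    k = len(frame) // n  # sampling interval, e.g. 100 // 20 = 5
    if k < 1:
        raise ValueError("sample size cannot exceed the size of the sampling frame")
    rng = random.Random(seed)
    start = rng.randint(1, k)  # random starting position within the first interval
    positions = [start + i * k for i in range(n)]
    return [frame[p - 1] for p in positions]

# Hypothetical frame: units numbered 1 to 100, sample of 20
frame = list(range(1, 101))
sample = systematic_sample(frame, 20, seed=3)
```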
  • Systematic sampling: Advantages and Disadvantages
    +simple and quick to use
    +suitable for large samples and large populations
    -sampling frame is needed
    -can introduce bias if the sampling frame is not in random order (e.g. it follows a repeating pattern)
  • Process of Stratified sampling?
    1. Divide the population into mutually exclusive strata (groups).
    2. Ensure the proportion sampled from each stratum is the same.
    3. Create a sampling frame for each stratum and take a simple random sample from each, using the size of that frame as the range for a RNG, until the required number from each stratum is reached.
  • Stratified sampling equation
    number to be taken from stratum = (number in stratum / number in population) × desired sample size
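    A short worked example of this equation in Python. The year-group sizes and the sample size of 40 are hypothetical figures chosen for illustration, and rounding each result to the nearest whole person is an assumption.

```python
def stratum_sample_sizes(stratum_sizes, sample_size):
    """Apply (number in stratum / number in population) x sample size to each stratum."""
    population = sum(stratum_sizes.values())
    # Multiply before dividing to keep the arithmetic exact for whole-number answers.
    # Rounding is an assumption; rounded values may not always add up to sample_size.
    return {
        name: round(size * sample_size / population)
        for name, size in stratum_sizes.items()
    }

# Hypothetical population of 200 students split into two strata
year_groups = {"Year 12": 120, "Year 13": 80}
allocation = stratum_sample_sizes(year_groups, 40)
# Year 12: 120/200 x 40 = 24, Year 13: 80/200 x 40 = 16
```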
  • Stratified sampling: Advantages and Disadvantages
    +sample accurately reflects population
    +guarantees proportional representation of groups within a population
    -population must be clearly classified into distinct, mutually exclusive strata.
    -selection within each stratum has the same disadvantages as simple random sampling.
  • 2 types of non-random sampling
    • quota sampling
    • opportunity sampling
  • Process of Quota sampling?
    1. Divide the population into groups based on relevant characteristics, e.g. age, income, gender.
    2. Set a quota for each group that reflects its proportion of the population.
    3. Recruit sampling units until each quota has been reached.
    4. If a person refuses to be interviewed, or the quota they fit is already full, ignore them and move on until all quotas are met.
  • Quota sampling : Advantages and Disadvantages
    +Allows a small sample to still be representative of a population
    +No sampling frame required
    +quick, easy, inexpensive
    +easy comparison between strata
    -non random sampling can introduce bias
    -population must be divided into groups, which can be costly or inaccurate
    -increasing the scope of the study means more groups, which adds time and expense
  • Process of Opportunity sampling?
    1. Choose a criterion.
    2. Only choose people who fit the criterion.
    3. Continue asking available people who fit the criterion until the desired sample size is reached.
  • Opportunity sampling: Advantages and Disadvantages
    +Easy to carry out (you pick people who are available at the time).
    +Inexpensive
    -Unlikely to provide a representative sample
    -Highly dependent on individual researcher.
  • Measures of Central Tendency
    • Median
    • Mean
    • Mode
  • Measures of Spread
    • Range
    • Standard Deviation
    • IQR
    • Variance
  • Quantitative data definition?
    Data that takes numerical values, i.e. anything you can count or measure.
  • Qualitative data definition?
    Non-numerical data represented by words or labels, e.g. colours
  • Stratified sampling equation?
    number to sample from stratum = (stratum size / population size) × overall sample size
  • Continuous data definition?
    Data that can take any value in a given range.
    e.g. temperature, time (any value within the range is possible)
  • Discrete data definition?
    Data that can only take on specific values in a given range.
    e.g. shoe size, number of pages in a book
  • For coded data, what happens to standard deviation when a constant is added or subtracted?
    Nothing, it stays the same.
  • For coded data, what happens to standard deviation when the data is multiplied or divided by a constant?
    It is multiplied or divided by the same constant (ignoring any negative sign).
    This is because standard deviation measures spread, and scaling the data scales the spread.
  • For coded data, why is the mean affected by both adding/subtracting and multiplying/dividing?
    The mean is a measure of central tendency, not spread, so it moves with the data: if 100 is added to every value, the mean also increases by 100, and if every value is multiplied by a constant, so is the mean.
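    The effect of coding on the mean and standard deviation can be checked with a small Python sketch. The data set, the shift of 100 and the scale factor of 3 are hypothetical, and the population standard deviation (statistics.pstdev) is assumed.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical raw data
shifted = [x + 100 for x in data]    # coding by adding a constant
scaled = [x * 3 for x in data]       # coding by multiplying by a constant

mean_raw = statistics.mean(data)
sd_raw = statistics.pstdev(data)

mean_shifted = statistics.mean(shifted)   # mean moves up by 100
sd_shifted = statistics.pstdev(shifted)   # spread is unchanged

mean_scaled = statistics.mean(scaled)     # mean is multiplied by 3
sd_scaled = statistics.pstdev(scaled)     # spread is multiplied by 3
```

    Adding a constant slides every value along by the same amount, so only the mean changes. Multiplying stretches the gaps between values, so both the mean and the standard deviation change.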
  • Outlier equation?
    any value that is less than Q1 - k(Q3 - Q1)
    any value that is greater than Q3 + k(Q3 - Q1)
    where Q3 - Q1 is the IQR and k is a given multiplier (commonly 1.5)
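    A minimal Python sketch of the outlier rule. The quartiles are taken as given rather than computed, because quartile conventions vary. The data values, the quartiles and the default k = 1.5 are assumptions for illustration.

```python
def outlier_limits(q1, q3, k=1.5):
    """Return the lower and upper limits Q1 - k(Q3 - Q1) and Q3 + k(Q3 - Q1)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, q1, q3, k=1.5):
    """List every value strictly outside the outlier limits."""
    lower, upper = outlier_limits(q1, q3, k)
    return [v for v in values if v < lower or v > upper]

# Hypothetical data with quartiles supplied as Q1 = 14 and Q3 = 18
values = [3, 12, 14, 15, 15, 17, 18, 20, 41]
limits = outlier_limits(14, 18)                  # (8.0, 24.0)
outliers = find_outliers(values, q1=14, q3=18)   # [3, 41]
```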
  • Frequency density formula for histograms?
    Frequency density = frequency/class width
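    A short worked example of the formula in Python. The class boundaries and frequencies are hypothetical, and class width is assumed to be the upper boundary minus the lower boundary.

```python
def frequency_density(lower, upper, frequency):
    """Frequency density = frequency / class width."""
    return frequency / (upper - lower)

# Hypothetical grouped data: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 5), (10, 20, 12), (20, 40, 16), (40, 70, 6)]

densities = [frequency_density(lo, hi, f) for lo, hi, f in classes]

# Bar area (density x width) recovers the frequency of each class.
areas = [d * (hi - lo) for d, (lo, hi, f) in zip(densities, classes)]
```

    Unequal class widths are the reason for plotting density rather than frequency: the 20-40 class has the largest frequency here but not the tallest bar.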
  • What is cleaning of data?
    Removing any anomalies from a data set.
  • What are box plots used for?
    Representing quartiles, maximum and minimum values and outliers.
  • What type of data should cumulative frequency diagrams be used for?
    Data in a grouped frequency table.
    • You use the cumulative frequency diagram to find estimates for quartiles, percentiles, etc.
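    Reading an estimate off a cumulative frequency diagram can be imitated in Python with linear interpolation. This sketch assumes the plotted points are joined by straight lines and that values are spread evenly within each class. The function name and the grouped data are hypothetical.

```python
def estimate_percentile(boundaries, frequencies, p):
    """Estimate the p-th percentile from grouped data by linear interpolation.

    boundaries has one more entry than frequencies: the class boundaries in order.
    """
    total = sum(frequencies)
    target = p / 100 * total  # cumulative frequency to read across from
    cumulative = 0
    for i, f in enumerate(frequencies):
        if f > 0 and cumulative + f >= target:
            lower, upper = boundaries[i], boundaries[i + 1]
            # Move through the class in proportion to how far the target lies into it
            return lower + (target - cumulative) * (upper - lower) / f
        cumulative += f
    return boundaries[-1]

# Hypothetical grouped frequency table: classes 0-10, 10-20, 20-30, 30-40
boundaries = [0, 10, 20, 30, 40]
frequencies = [4, 10, 20, 6]

q1 = estimate_percentile(boundaries, frequencies, 25)
median = estimate_percentile(boundaries, frequencies, 50)
q3 = estimate_percentile(boundaries, frequencies, 75)
```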
  • What type of data should histograms be used for?
    Grouped continuous data.
    • the area of each bar is proportional to its frequency.
    • they give a rough picture of the shape and spread of the data.
  • 5 ways to describe correlation?
    • strong negative correlation
    • weak negative correlation
    • no linear correlation
    • strong positive correlation
    • weak positive correlation
  • Mutually exclusive definition?
    Events which cannot occur at the same time. No outcomes in common.
  • Mutually exclusive formula?
    P(A or B) = P(A) + P(B), and P(A and B) = 0
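    For mutually exclusive events, P(A and B) = 0 and P(A or B) = P(A) + P(B). A quick check in Python, assuming one roll of a fair six-sided die. The events A and B are hypothetical examples.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {1, 2}  # roll is less than 3
B = {5, 6}  # roll is greater than 4

p_a_and_b = prob(A & B)      # no outcomes in common
p_a_or_b = prob(A | B)
p_sum = prob(A) + prob(B)
```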
  • Independent events definition?
    Events where the outcome of one has no effect on the probability of the other.
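    For independent events, P(A and B) = P(A) × P(B). A small Python check, assuming a fair coin is tossed and a fair die is rolled together. The two events are hypothetical examples.

```python
from fractions import Fraction
from itertools import product

# All 12 equally likely outcomes of tossing a fair coin and rolling a fair die
outcomes = set(product("HT", range(1, 7)))

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

heads = {o for o in outcomes if o[0] == "H"}  # coin shows heads
six = {o for o in outcomes if o[1] == 6}      # die shows a six

p_both = prob(heads & six)
p_product = prob(heads) * prob(six)
```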