Statistics


  • Raw Data
    Unprocessed. Just been collected. Needs to be ordered, grouped, rounded, cleaned.
  • Qualitative Data

    Non-numerical, descriptive data such as eye/hair colour or gender. Often subjective so usually more difficult to analyse.
  • Quantitative Data
    Numerical data. Can be measured with numbers. Easier to analyse than qualitative data. Examples: height, weight, marks in an exam.
  • Discrete Data

    Only takes particular values (not necessarily whole numbers) such as shoe size or number of people.
  • Continuous Data
    Can take any value e.g. height, weight.
  • Categorical Data
    Data that can be sorted into non-overlapping categories such as gender. Used for qualitative data so that it can be more easily processed.
  • Ordinal (rank) Data
    Quantitative data that can be given an order or ranked on a rating scale, e.g. marks in an exam.
  • Bivariate Data

    Involves measuring 2 variables. Can be qualitative or quantitative, grouped or ungrouped. Usually used with scatter diagrams where the two axes represent the two different variables. One variable is often called the explanatory variable and the other the response variable.
  • Multivariate Data

    Made up of more than 2 variables e.g. comparing height, weight, age and shoe size together.
  • Grouping Data
    1. Grouping data using tables makes it easier to spot patterns in the data and quickly see how the data is distributed.
    2. Discrete data can be grouped into classes that do not overlap e.g. 0-10, 11-15… (they do not have to have equal class width). Use narrower classes where a lot of data lies close together and wider classes where the data is more spread out.
    3. Continuous data can be grouped using inequalities. The class intervals must not overlap or have gaps between them, so each class uses one < and one ≤, e.g. 10 < h ≤ 20 followed by 20 < h ≤ 30.
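    The inequality rule for continuous data can be sketched in Python. This is only an illustration: the function name group_continuous, the class boundaries and the heights are all made up for the example.

```python
def group_continuous(values, boundaries):
    """Count how many values fall in each class lower < x <= upper.

    Consecutive boundaries define the classes, so there are no gaps
    and no overlaps between neighbouring classes.
    """
    counts = {}
    for lower, upper in zip(boundaries, boundaries[1:]):
        label = f"{lower} < x <= {upper}"
        counts[label] = sum(1 for v in values if lower < v <= upper)
    return counts


# Hypothetical heights in cm (not real data).
heights = [151.0, 155.5, 160.0, 160.1, 172.3, 180.0]
table = group_continuous(heights, [150, 160, 170, 180])
for label, frequency in table.items():
    print(label, frequency)
```

    A height of exactly 160.0 is counted in 150 < x <= 160 only, which is why one end of each class uses < and the other uses ≤.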
  • Pros of Grouping Data
    • Makes the data easy to read and understand
    • Easy to spot patterns and compare data
  • Cons of Grouping Data

    • Loses accuracy of data as you no longer know exact data values
    • Calculations made from these will only be an estimate e.g. mean
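    As a rough illustration of why the mean from grouped data is only an estimate, the sketch below uses the midpoint of each class in place of the unknown exact values. The function name estimated_mean and the frequency table are assumptions made for this example.

```python
def estimated_mean(classes):
    """Estimate the mean from (lower, upper, frequency) rows.

    Each value in a class is treated as if it sat at the class midpoint,
    because the exact values are no longer known after grouping.
    """
    total_frequency = sum(freq for _, _, freq in classes)
    total = sum((lower + upper) / 2 * freq for lower, upper, freq in classes)
    return total / total_frequency


# Hypothetical grouped heights in cm: (lower, upper, frequency).
grouped = [(150, 160, 3), (160, 170, 1), (170, 180, 2)]
est = estimated_mean(grouped)
print(round(est, 1))  # (155*3 + 165*1 + 175*2) / 6 = 980 / 6
```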
  • Primary Data
    Data that you have collected yourself, or someone has collected on your behalf.
  • Secondary Data
    Data that has already been collected.
  • Population
    Everyone or everything that could be involved in the investigation e.g. when investigating opinions of students in a school the population would be all the students in the school.
  • Census
    A survey of the entire population.
  • Sample
    A smaller number from the population that you actually survey. The data obtained from the sample is then used to make conclusions about the whole population, so it is important that the sample represents the population fairly.
  • Sampling Frame

    A list of all the members of the population. This is where you will choose the sample from. E.g. electoral roll, school register.
  • Sampling Unit
    The people that are to be sampled e.g. students in a school.
  • Biased Sample
    A sample that does not represent the population fairly. For example, surveying students at a mixed school with a sample that only contains girls. Avoid bias by using random sampling methods.
  • Simple Random Sampling
    1. Every item/person in the population has an equal chance of being selected.
    2. Assign a number to every member in the population.
    3. Mention the random sampling technique you are going to use e.g. a random number table or a random number generator on a calculator.
    4. Select the numbers chosen from your population.
    5. Ignore any repeats and choose another number.
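    The steps above can be sketched in Python, with the standard random module standing in for a random number table or calculator. The function name simple_random_sample and the class list are hypothetical.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Pick n distinct members, giving every member an equal chance."""
    if n > len(population):
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    # Assign a number (1, 2, 3, ...) to every member of the population.
    numbered = dict(enumerate(population, start=1))
    chosen = set()
    while len(chosen) < n:
        # A repeated number is ignored because a set stores it only once,
        # so the loop simply generates another number.
        chosen.add(rng.randint(1, len(population)))
    return [numbered[k] for k in sorted(chosen)]


# Hypothetical population of 30 students.
students = [f"Student {i}" for i in range(1, 31)]
sample = simple_random_sample(students, 5, seed=1)
print(sample)
```

    Python's built-in random.sample(students, 5) gives the same kind of sample in one call; the loop is written out here only to mirror the numbered steps.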
  • Advantages of Simple Random Sampling
    • Sample is representative as every member of the population has an equal chance of being selected
    • Unbiased
  • Disadvantages of Simple Random Sampling
    • Need a full list of population (not always easily obtainable)
    • Not always convenient as it can be expensive and time consuming
    • Needs a large sample size
  • Stratified Sampling

    1. The size of each stratum (group) in the sample is in proportion to the sizes of the strata in the population. E.g. if group A accounts for 10% of the population, in the sample group A will also be 10% of the sample size.
    2. Split the population into groups (usually done for you in the exam)
    3. Use the formula stratified sample = (stratum size ÷ total population) × sample size to calculate sample size for each group. (Remember to check totals if you rounded numbers and adjust accordingly if your total sample size after stratification is bigger/smaller than sample size in the question.)
    4. Use random sampling to select members from each stratum/group.
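    A possible sketch of the stratum size calculation, including the check on rounded totals. The adjustment used here (round every group down, then give the spare places to the groups with the largest remainders) is one reasonable choice, not a rule from the notes; the function name stratified_sizes and the group sizes are made up.

```python
def stratified_sizes(strata, sample_size):
    """Share sample_size between strata in proportion to their sizes.

    strata maps a group name to the number of members in that group.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    total = sum(strata.values())
    # stratum size / total * sample size, split into whole part and remainder.
    sizes = {name: size * sample_size // total for name, size in strata.items()}
    remainders = {name: size * sample_size % total for name, size in strata.items()}
    # Rounding down can leave the total short, so hand out the spare places
    # to the groups that were closest to the next whole number.
    shortfall = sample_size - sum(sizes.values())
    for name in sorted(remainders, key=remainders.get, reverse=True)[:shortfall]:
        sizes[name] += 1
    return sizes


# Hypothetical year groups; 120/400 * 40 = 12, and so on.
print(stratified_sizes({"Year 7": 120, "Year 8": 150, "Year 9": 130}, 40))

# Exact values here are about 9.52, 5.71 and 4.76; ordinary rounding
# would give 10 + 6 + 5 = 21, one more than the sample size of 20.
print(stratified_sizes({"A": 50, "B": 30, "C": 25}, 20))
```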
  • Advantages of Stratified Sampling
    • Sample is in proportion to population, so sample represents the population fairly
    • Best used for populations with groups of unequal sizes
  • Disadvantages of Stratified Sampling
    • Time consuming
  • Systematic Sampling
    1. Choosing items in the population at regular intervals.
    2. Divide your population size by sample size to calculate the intervals, e.g. 400/40 = 10 so choosing every 10th item in the population.
    3. Use random sampling to generate a number between 1 and 10 (or the answer to your calculation from above) to choose a starting point e.g. 7.
    4. Select every 10th item after the 7th e.g. 7th, 17th, 27th, …, until you obtain your sample size.
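    The same steps as a short Python sketch, assuming the population size divides exactly by the sample size as in the 400/40 example. The function name systematic_sample is hypothetical.

```python
import random


def systematic_sample(population, sample_size, seed=None):
    """Take every k-th item after a random start, where k = N // n."""
    rng = random.Random(seed)
    interval = len(population) // sample_size  # e.g. 400 // 40 = 10
    start = rng.randint(1, interval)           # random start between 1 and k
    # Positions are 1-based: start, start + k, start + 2k, ...
    positions = [start + i * interval for i in range(sample_size)]
    return [population[p - 1] for p in positions]


# Hypothetical population numbered 1 to 400.
items = list(range(1, 401))
chosen = systematic_sample(items, 40, seed=3)
print(chosen[:3])
```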
  • Advantages of Systematic Sampling
    • Population is evenly sampled
    • Can be carried out by a machine
    • Sample is easy to select
  • Disadvantages of Systematic Sampling
    • Not strictly a random sample as some members of the population cannot be chosen
  • Cluster Sampling
    The population is divided into natural groups (clusters), groups are chosen at random and every member of the chosen groups is sampled. Useful for large populations e.g. when surveying lots of different towns in a country.
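    A minimal sketch of cluster sampling in Python. The function name cluster_sample, the town names and the residents are invented for illustration.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Choose whole clusters at random and keep every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical towns (clusters) and their residents.
towns = {
    "Town A": ["a1", "a2", "a3"],
    "Town B": ["b1", "b2"],
    "Town C": ["c1", "c2", "c3", "c4"],
    "Town D": ["d1"],
}
picked = cluster_sample(towns, 2, seed=0)
print(picked)
```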
  • Advantages of Cluster Sampling
    • Economically efficient – fewer resources required
    • Can be representative if lots of small clusters are sampled
  • Disadvantages of Cluster Sampling
    • Clusters may not be representative of the population and may lead to a biased sample
    • High sampling error
  • Quota Sampling
    1. Population is grouped by characteristics and a fixed amount is sampled from every group.
    2. Group population by characteristics e.g. gender and age
    3. Select quota (amount) for each group e.g. 30 men under 25, 40 women over 30 etc.
    4. Obtain sample by finding members of each group until quota is reached.
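    Quota sampling can be sketched as working through people in the order they are met and accepting them until each quota is full. The function names, the people and the quota sizes below are all made up; the groups loosely follow the "men under 25, women over 30" example.

```python
def quota_sample(people, quotas, group_of):
    """Accept people in the order met until every quota is reached."""
    remaining = dict(quotas)
    sample = []
    for person in people:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # every quota has been filled
    return sample


def group_of(person):
    """Place a (name, gender, age) tuple in a quota group, or None."""
    _, gender, age = person
    if gender == "M" and age < 25:
        return "men under 25"
    if gender == "F" and age > 30:
        return "women over 30"
    return None


# Hypothetical people in the order the interviewer meets them.
people = [("Ann", "F", 34), ("Bob", "M", 22), ("Cat", "F", 28),
          ("Dan", "M", 19), ("Eve", "F", 41), ("Fay", "M", 24)]
sample = quota_sample(people, {"men under 25": 2, "women over 30": 1}, group_of)
print([name for name, _, _ in sample])  # ['Ann', 'Bob', 'Dan']
```

    Eve and Fay are never considered because the quotas are already full, which shows why this method is not random: who gets sampled depends on the order in which people are met.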
  • Advantages of Quota Sampling
    • Quick to use
    • Cheap
    • Do not need a sampling frame or full list of the population
  • Disadvantages of Quota Sampling
    • NOT RANDOM – biased, as the interviewer chooses who will be in the sample, so not every member of the population has an equal chance of being selected
  • Opportunity Sampling
    Using the people/items that are available at the time. E.g. interviewing the first 10 people you see on a Monday morning.
  • Advantages of Opportunity Sampling

    • Quick
    • Cheap
    • Easy
  • Disadvantages of Opportunity Sampling
    • NOT RANDOM. The sample has not been collected fairly, so it may not represent the population, and not every member of the population has been given an equal chance of being selected.
  • Judgement Sampling
    When the researcher uses their own judgement to select a sample they think will represent the population. E.g. a teacher choosing students to interview about their opinion on a new after school club.
  • Advantages of Judgement Sampling
    • Easy
    • Quick