STATS MIDTERM (COMPLETE DEFINITION)

Cards (59)

  • Population
    Refers to the whole group under study or investigation
  • Sample
    A subset taken from a population, either by random sampling or by non-random sampling
  • Random Sampling
    A method of selecting a portion of the population for study in which all subjects have the same chance of being chosen
  • Probability Sampling
    A method where every member of the population has a known, non-zero chance (often an equal one) of being part of the sample
  • Simple Random Sampling (SRS)
    Each possible sample has an equal chance of being picked and every member of the population has an equal chance of being included in the sample
  • Table of random numbers
    A table containing rows and columns of mechanically generated digits
  • Systematic sampling
    Sample units are selected at fixed intervals, called sampling intervals: every nth item in the list is selected, starting from a randomly chosen point
  • Stratified sampling
    An extension of simple random sampling which allows for different homogeneous groups, called strata, in the population to be represented in the sample
  • Cluster or area sampling
    The entire population is broken into small groups or clusters, and then some of the clusters are randomly selected for analysis
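
The four probability sampling methods above can be compared side by side in a short Python sketch. This is only an illustration: the population of the numbers 1 to 100, the strata, the cluster names, and the helper function names are all assumptions made up for the example.

```python
import random

def simple_random_sample(population, k, rng):
    """Simple random sampling: every subset of size k is equally likely."""
    return rng.sample(population, k)

def systematic_sample(population, interval, rng):
    """Systematic sampling: every `interval`-th item from a random start."""
    start = rng.randrange(interval)
    return population[start::interval]

def stratified_sample(strata, k_per_stratum, rng):
    """Stratified sampling: a simple random sample from each stratum."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Cluster sampling: pick whole clusters at random, keep all their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(0)             # fixed seed so the run is repeatable
population = list(range(1, 101))   # hypothetical population of 100 units

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)

# Hypothetical strata: two homogeneous groups of 50 units each.
strata = {"low": population[:50], "high": population[50:]}
stratified = stratified_sample(strata, 5, rng)

# Hypothetical clusters: five blocks of 20 units each.
clusters = {f"block_{i}": population[i * 20:(i + 1) * 20] for i in range(5)}
clustered = cluster_sample(clusters, 2, rng)
```

Note the difference between the last two: stratified sampling takes some units from every group, while cluster sampling takes every unit from some groups.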
  • Non-probability sampling
    A sampling method where not all individuals of the universe have an equal opportunity of becoming part of the sample
  • Convenience sampling
    The most common type of non-probability sampling, focusing on gaining information from participants who are "convenient" for the researcher to access
  • Volunteer sampling
    Participants self-select to become part of a study because they volunteer when asked or respond to an advert
  • Purposive sampling
    An expert selects a representative sample based on subjective judgment or purpose for the study
  • Quota sampling
    Sample units are picked for convenience, but interviewers are given certain quotas to fill; used especially in market research
  • Snowball sampling
    Additional sample units are identified by asking previously picked sample units for people they know who can be added to the sample, used when the topic is not common or the population is hard to access
  • Parameter
    A descriptive population measure, a measure of the characteristics of the entire population based on all elements within that population
  • Statistic
    A number that describes a sample; a measure of a characteristic computed from the elements of the sample rather than the whole population
  • The mean is the average value: the sum of the values divided by the number of values.
  • The median is the middle value when the data are ordered from smallest to largest (the average of the two middle values when the count is even).
  • The mode is the most frequently occurring value.
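
As a quick check of the mean, median, and mode, here is a minimal Python sketch using the standard `statistics` module. The list of scores is hypothetical data chosen for the example.

```python
import statistics

scores = [70, 85, 85, 90, 100]   # hypothetical data, already in order

mean_value = statistics.mean(scores)      # (70 + 85 + 85 + 90 + 100) / 5
median_value = statistics.median(scores)  # middle value of the ordered list
mode_value = statistics.mode(scores)      # value that occurs most often

print(mean_value, median_value, mode_value)
```

Here the mean is 86, while the median and the mode are both 85.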
  • Population data set
    Contains all members of a specified group (the entire list of possible data values)
  • Sample data set
    Contains a part, or a subset, of a population
  • Normal Distribution
    An example of a continuous distribution, pertaining to a family of bell-shaped curves that model a number of continuous variables
  • Normal Distribution
    Also known as Gaussian Distribution
  • Normal curve
    Bell-shaped curve that lies entirely above the horizontal axis, symmetrical, unimodal, and asymptotic to the horizontal axis, with the area between the curve and the horizontal axis exactly equal to 1
  • Normal Distribution
    Determined by two parameters: the mean and the standard deviation
  • About 68.3% of the area under the curve falls within 1 standard deviation of the mean
  • About 95.4% of the area under the curve falls within 2 standard deviations of the mean
  • About 99.7% of the area under the curve falls within 3 standard deviations of the mean
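
The three percentages above can be reproduced from the standard normal curve. The sketch below assumes Python 3.8 or later, where `statistics.NormalDist` is available; the helper name `area_within` is made up for the example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

areas = {k: area_within(k) for k in (1, 2, 3)}
for k, area in areas.items():
    print(f"within {k} standard deviation(s): {area:.1%}")
```

Because any normal distribution can be rescaled to the standard normal, the same areas hold for every mean and standard deviation.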
  • Standard Scores
    Measure how many standard deviations a given value (x) is above or below the mean
  • Positive z-score
    Above the mean
  • Negative z-score
    Below the mean
  • There are two types of z-score: sample and population
  • Use the sample formula when the problem does not specify which one applies
  • Use the population formula when the problem gives a population mean or population standard deviation
  • Convert values to z-scores if the horizontal axis is not already marked in standard deviations
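
A z-score is computed as (x − mean) / standard deviation; the sample and population versions differ only in which standard deviation is used. The Python sketch below illustrates this on a small hypothetical data set; the function name `z_score` is an assumption for the example.

```python
import statistics

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

data = [2, 4, 4, 4, 5, 5, 7, 9]           # hypothetical data
mean = statistics.mean(data)              # 40 / 8 = 5

population_sd = statistics.pstdev(data)   # divides by N
sample_sd = statistics.stdev(data)        # divides by N - 1

z_population = z_score(9, mean, population_sd)  # positive: 9 is above the mean
z_sample = z_score(9, mean, sample_sd)          # slightly smaller, since sample_sd is larger
z_below = z_score(2, mean, population_sd)       # negative: 2 is below the mean
```

With these numbers the population standard deviation is 2, so the value 9 has a population z-score of 2 and the value 2 has a population z-score of −1.5.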
  • The empirical rule, also known as the 68-95-99.7 rule, gives the percentage of values that fall within a given interval around the mean of a normal distribution. That is, about 68% of the data is within one standard deviation of the mean, about 95% is within two standard deviations of the mean, and about 99.7% is within three standard deviations of the mean.
  • The variance of a random variable 𝑋 is denoted by 𝜎² and can likewise be written as 𝑉𝑎𝑟(𝑋).
  • The variance of a random variable is the expected value of the squared difference between the value the random variable takes and its mean: 𝑉𝑎𝑟(𝑋) = 𝐸[(𝑋 − 𝜇)²].
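
For a discrete random variable, the definition above becomes a weighted sum: multiply each squared difference from the mean by the probability of that value, then add. The Python sketch below works this out for a fair six-sided die; the die and the helper names are assumptions chosen for the example, and the discrete case is assumed because the cards do not say which kind of random variable is meant.

```python
from fractions import Fraction

def expected_value(distribution):
    """E(X): sum of each value times its probability."""
    return sum(x * p for x, p in distribution.items())

def variance(distribution):
    """Var(X): expected value of the squared difference from the mean."""
    mu = expected_value(distribution)
    return sum((x - mu) ** 2 * p for x, p in distribution.items())

# Hypothetical example: a fair die, each face with probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

mu = expected_value(die)   # 21/6 = 7/2
var = variance(die)        # 35/12, about 2.92

print(mu, var, float(var))
```

Using `Fraction` keeps the arithmetic exact, so the result can be compared directly with a hand calculation.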