Population and Sample

Cards (23)

  • Population - the complete group of people, animals, or items of interest
  • Target population - group being studied
  • Sample
    • part of a population selected for research
    • must be representative to generalize
  • Sampling techniques are methods used to select a subset of individuals or items from a larger population for the purpose of research or statistical analysis.
  • Sampling techniques can be divided into two types:
    • Probability or random sampling
    • Non-probability or non-random sampling
  • Probability Sampling:
    • Simple Random Sampling
    • Stratified Random Sampling
    • Cluster Sampling
    • Systematic Sampling
    • Multi-Stage Sampling
  • Non-probability sampling:
    • Quota Sampling
    • Convenience Sampling
    • Judgment Sampling
    • Snowball Sampling
  • Probability Sampling - involves selecting individuals or items from a population so that each member has a known, nonzero chance of being chosen (an equal chance under simple random sampling).
  • Probability Sampling - achieved through random selection methods such as a lottery or a random number generator.
  • Probability Sampling - provides a representative sample when the population is homogeneous and well-defined.
  • Simple Random Sampling - the most basic form of random sampling, in which each element in the population has an equal probability of being selected.
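A minimal sketch of simple random sampling, using Python's standard `random` module. The population of student IDs and the helper name `simple_random_sample` are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements; every element is equally likely to be picked."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, n)

# Hypothetical population: 20 student IDs.
students = [f"S{i:02d}" for i in range(1, 21)]
sample = simple_random_sample(students, 5, seed=1)
print(sample)
```

Sampling is done without replacement here, so no student can appear twice in the same sample.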
  • Slovin's formula is a statistical tool to calculate the minimum sample size needed to estimate a statistic based on an acceptable margin of error.
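Slovin's formula is usually written as n = N / (1 + N * e^2), where N is the population size and e is the acceptable margin of error. The sketch below applies it directly; the function name is hypothetical, and rounding up to the next whole number is a common convention assumed here rather than something the card states.

```python
import math

def slovin_sample_size(population_size, margin_of_error):
    """Minimum sample size by Slovin's formula: n = N / (1 + N * e**2)."""
    if population_size <= 0 or not 0 < margin_of_error < 1:
        raise ValueError("need N > 0 and 0 < e < 1")
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)  # assumption: round up so the margin of error is not exceeded

n = slovin_sample_size(1000, 0.05)
print(n)  # 286
```

For N = 1000 and e = 0.05, the formula gives 1000 / 3.5, about 285.7, which rounds up to 286.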
  • Systematic Random Sampling - a random sampling method that uses a list of all the elements in the population and selects every kth element, starting from a randomly chosen point.
  • Systematic Random Sampling - less time-consuming than simple random sampling and can be easily implemented with ordered lists.
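Systematic sampling can be sketched as follows, assuming the interval is computed as k = N // n and the starting point is chosen at random within the first k elements. The function name and the numbered population are hypothetical.

```python
import random

def systematic_sample(items, n, seed=None):
    """Select every kth element from an ordered list after a random start."""
    k = len(items) // n  # sampling interval
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    start = random.Random(seed).randrange(k)  # random start in [0, k)
    return [items[start + i * k] for i in range(n)]

# Hypothetical ordered list of 100 population members.
population = list(range(1, 101))
chosen = systematic_sample(population, 10, seed=3)
print(chosen)
```

With 100 elements and a sample of 10, the interval is 10, so the selected elements are always exactly 10 positions apart.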
  • Cluster Sampling - involves dividing the population into clusters or groups, often based on geographic location or other naturally occurring divisions, and then randomly selecting entire clusters.
  • Multi-Stage Cluster Sampling - used when the clusters are too large, so a second set of smaller clusters is taken from the original clusters.
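A rough sketch of the two stages, assuming the first stage picks whole clusters at random and the second stage picks smaller clusters inside each chosen one. The districts, school codes, and function name are made up for illustration.

```python
import random

def multi_stage_cluster_sample(clusters, n_first, n_second, seed=None):
    """Stage 1: pick clusters at random. Stage 2: pick smaller clusters in each."""
    rng = random.Random(seed)
    first_stage = rng.sample(sorted(clusters), n_first)
    return {
        name: rng.sample(clusters[name], min(n_second, len(clusters[name])))
        for name in first_stage
    }

# Hypothetical data: districts (clusters) that contain schools (smaller clusters).
districts = {
    "North": ["N1", "N2", "N3", "N4"],
    "South": ["S1", "S2", "S3"],
    "East": ["E1", "E2", "E3", "E4", "E5"],
    "West": ["W1", "W2", "W3"],
}
selected = multi_stage_cluster_sample(districts, n_first=2, n_second=2, seed=42)
print(selected)
```

Plain (single-stage) cluster sampling would stop after the first stage and study every member of the chosen districts.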
  • Measures of variability, also known as measures of dispersion, quantify the extent to which data points in a dataset spread out or vary from the central tendency.
    • The range is the difference between the maximum and minimum values of a data set.
    • The maximum is the largest value in the dataset and the minimum is the smallest value.
    • The range is easy to calculate but it is very much affected by extreme values.
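The range, and its sensitivity to extreme values, can be illustrated with a short sketch. The scores are invented for the example.

```python
def data_range(values):
    """Range = maximum value minus minimum value."""
    return max(values) - min(values)

# Hypothetical scores; 42 is an extreme value.
scores = [4, 8, 15, 16, 23, 42]
full_range = data_range(scores)          # 42 - 4
trimmed_range = data_range(scores[:-1])  # 23 - 4, with the extreme value dropped
print(full_range, trimmed_range)  # 38 19
```

Removing a single extreme value cuts the range in half, which is why the range alone can be a misleading summary of spread.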
  • The interquartile range is a measure of statistical dispersion, which is calculated as the difference between the upper quartile (Q3) and the lower quartile (Q1) in a dataset.
    • It is less sensitive to outliers than the range.
    • It depends only on the middle 50% of the data, so extreme values have little effect on it. It is thus a resistant measure of variability.
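A small sketch of the interquartile range using `statistics.quantiles` from the Python standard library. Textbooks and software use several slightly different quartile conventions; the `"inclusive"` method is an assumption made here, and other conventions can give somewhat different values for the same data.

```python
import statistics

def interquartile_range(values):
    """IQR = Q3 - Q1, using the 'inclusive' quartile convention (an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Hypothetical scores, then the same scores with a much larger maximum.
scores = [4, 8, 15, 16, 23, 42]
with_outlier = [4, 8, 15, 16, 23, 420]
print(interquartile_range(scores))
print(interquartile_range(with_outlier))
```

Under this convention both calls return the same value, because changing the maximum does not move Q1 or Q3.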
  • Variance measures the average squared deviation of each data point from the mean of the dataset.
    • It is calculated by taking the average of the squared differences between each data point and the mean (dividing by n for a population, or by n - 1 for a sample).
  • Standard deviation is the square root of the variance and provides a measure of the dispersion of data points around the mean.
    • It indicates the typical distance between each data point and the mean.
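Variance and standard deviation can be sketched together, since one is the square root of the other. The dataset and function names are hypothetical. The population form divides by n; the sample form, which divides by n - 1, is shown for comparison.

```python
import math

def population_variance(values):
    """Average squared deviation from the mean (divides by n)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Same sum of squared deviations, divided by n - 1."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)

# Hypothetical data with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
variance = population_variance(data)  # 32 / 8
std_dev = math.sqrt(variance)
print(variance, std_dev)  # 4.0 2.0
```

The squared deviations sum to 32, so the population variance is 4 and the standard deviation is 2, in the same units as the data.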
  • Mean absolute deviation measures the average absolute deviation of each data point from the mean of the dataset.
    • It is calculated by taking the average of the absolute differences between each data point and the mean.
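The mean absolute deviation follows the same pattern as the variance, with absolute values in place of squares. A minimal sketch with a hypothetical dataset:

```python
def mean_absolute_deviation(values):
    """Average absolute distance of each data point from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

# Hypothetical data with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mad = mean_absolute_deviation(data)  # absolute deviations sum to 12
print(mad)  # 1.5
```

Because deviations are not squared, large deviations carry less weight here than they do in the variance.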
  • The coefficient of variation is a relative measure of variability that expresses the standard deviation as a percentage of the mean.
    • It is useful for comparing the variability of datasets with different units or scales.
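A short sketch of the coefficient of variation, assuming the population standard deviation is used (a sample standard deviation would work the same way). The height data are invented to show that changing units does not change the result.

```python
import math
import statistics

def coefficient_of_variation(values):
    """Standard deviation expressed as a percentage of the mean."""
    mean = statistics.fmean(values)
    return statistics.pstdev(values) / mean * 100

# Hypothetical heights, recorded in centimeters and again in meters.
heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]
cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)
print(round(cv_cm, 2), round(cv_m, 2))
```

Both values agree up to floating-point rounding, since the unit cancels when the standard deviation is divided by the mean. The measure is only meaningful when the mean is not zero.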