STAT 178 Chapter 1

Cards (23)

  • Statistics is concerned with the collection, organization, summarization, and analysis of data
  • Statistics involves drawing inferences about a body of data when only a part of the data is observed
  • The purpose of statistics is to investigate and evaluate the nature of numbers and the meaning of obtained information
  • Sources of data:
    • Documented data: primary data documented by the primary source and secondary data documented by a secondary source
    • Survey: method of collecting data by asking people questions
    • Experiments: method of collecting data in which the researcher directly intervenes in or controls the conditions
    • Observation: method of collecting data by recording behavior or events as they occur, without intervening
    • Other sources: internal data, registration, computer simulations
  • Biostatistics is the application of statistical tools and concepts in the biological sciences and medicine
  • Variables:
    • A variable is a characteristic that takes on different values in different persons, places, or things
    • Quantitative variables can be measured and convey information regarding amount
    • Qualitative variables cannot be measured numerically; they convey information regarding an attribute or category
  • Random variable:
    • A random variable takes values that result from chance factors and cannot be exactly predicted in advance
    • Discrete random variable: characterized by gaps or interruptions in the values it can assume
    • Continuous random variable: does not possess gaps or interruptions, can assume any value within a specified interval
  • Population: the collection of all units of interest, about which conclusions are to be drawn
    • Sample: a subset of the population, ideally representative of it
    • Measurement: assigning a number to a characteristic being measured
  • Scales of measurement:
    • Nominal scale: possesses only the property of identity
    • Ordinal scale: possesses identity and order but not equality of scale
    • Interval scale: possesses identity, order, and equality of scale but not absolute zero
    • Ratio scale: possesses all properties of identity, order, equality of scale, and absolute zero
  • Sampling methods:
    • Probability sampling: every element in the population has a non-zero chance of being chosen
    • Simple random sampling: all possible samples of the same size have the same chance of selection
    • Systematic random sampling: selection of the first element is random and subsequent elements are taken at regular intervals
    • Stratified sampling: dividing the population into nonoverlapping subpopulations and selecting samples from each stratum
  • Proportional allocation (used in stratified sampling) sets each stratum's sample size in proportion to the size of the stratum, which gives all elements the same probability of selection
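
The probability sampling designs described so far can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the stratum labels "A", "B", "C", and the population of the integers 1 to 100 are all assumptions made up for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n units without replacement; every sample of size n is equally likely."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick a random start in the first interval, then take every k-th unit."""
    k = len(population) // n      # sampling interval (assumes n <= len(population))
    start = rng.randrange(k)      # random start among the first k positions
    return [population[start + i * k] for i in range(n)]

def proportional_allocation(strata_sizes, n):
    """Sample size per stratum, proportional to stratum size.

    Rounding means the sizes may not add up to exactly n in general.
    """
    total = sum(strata_sizes.values())
    return {name: round(n * size / total) for name, size in strata_sizes.items()}

def stratified_sample(strata, n, rng):
    """Draw a simple random sample inside each stratum using proportional allocation."""
    sizes = {name: len(units) for name, units in strata.items()}
    alloc = proportional_allocation(sizes, n)
    return {name: rng.sample(units, alloc[name]) for name, units in strata.items()}

# Hypothetical data for illustration only.
rng = random.Random(0)
population = list(range(1, 101))
srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
alloc = proportional_allocation({"A": 600, "B": 300, "C": 100}, 50)
print(alloc)  # {'A': 30, 'B': 15, 'C': 5}
```

With strata of 600, 300, and 100 units and a total sample of 50, every element has the same 50/1000 chance of selection, which is the equal-probability property that proportional allocation is meant to deliver.
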
  • Cluster sampling divides the population into nonoverlapping groups or clusters, selects a sample of clusters, and includes all elements in the selected clusters
  • In two-stage sampling, clusters are selected at the first stage and the sample elements are selected within them at the second stage; in three-stage sampling, the sample elements are selected at the third stage
  • Multistage sampling is a natural extension of one-stage cluster sampling and is more cost-efficient when clusters are large and the elements within them are homogeneous
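
The difference between one-stage cluster sampling and two-stage sampling can be sketched as follows. The function names and the "village"/"household" data are hypothetical, and the second stage here uses simple random sampling, which is one possible choice rather than the only one.

```python
import random

def cluster_sample(clusters, m, rng):
    """One-stage: select m clusters at random and keep every element in them."""
    chosen = rng.sample(sorted(clusters), m)
    return {name: list(clusters[name]) for name in chosen}

def two_stage_sample(clusters, m, n_per_cluster, rng):
    """Two-stage: select m clusters, then a random subsample within each one."""
    chosen = rng.sample(sorted(clusters), m)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical population: 5 villages (clusters) of 20 households each.
clusters = {
    f"village_{i}": [f"v{i}_household_{j}" for j in range(20)] for i in range(5)
}
rng = random.Random(0)
one_stage = cluster_sample(clusters, 2, rng)       # 2 villages, all 20 households each
two_stage = two_stage_sample(clusters, 2, 5, rng)  # 2 villages, 5 households each
```

When households within a village are similar to one another, the 5 subsampled households carry nearly as much information as all 20, which is why the two-stage design can be cheaper for about the same precision.
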
  • Nonprobability sampling methods do not use randomization and allow researchers to subjectively choose sampling units
  • Haphazard or convenience sampling includes elements that are most accessible or easiest to contact, based solely on convenience
  • Judgment or purposive sampling selects respondents based on the judgment or opinion of the researcher, which can introduce personal bias and exclude other units
  • Quota sampling subdivides the population into subgroups, determines a quota for each stratum, and fills the quota using convenience or judgment sampling
  • Snowball sampling starts with initial samples taken by SRS and expands through referrals, often done through social networks
  • Advantages of nonprobability sampling include convenience and cost-effectiveness, but limitations include lack of representativeness, bias, and inability to determine sampling error
  • Observation leads to the formulation of questions or uncertainties that can be answered scientifically
  • Hypotheses are formulated to explain observations and make quantitative predictions of new observations, often generated after extensive background research and literature reviews
  • Criteria for designing an experiment include accuracy and precision, where accuracy refers to how close a measurement is to the true value and precision refers to how consistent repeated measurements are
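
One common way to put numbers on accuracy and precision is to treat the bias (mean of the measurements minus the true value) as a measure of accuracy and the standard deviation of repeated measurements as a measure of precision. The sketch below assumes that interpretation; the three "scales" and their readings are invented for illustration.

```python
import statistics

def bias(measurements, true_value):
    """Accuracy: how far the average measurement is from the true value."""
    return statistics.mean(measurements) - true_value

def spread(measurements):
    """Precision: sample standard deviation of repeated measurements."""
    return statistics.stdev(measurements)

TRUE_WEIGHT = 100.0  # hypothetical known reference weight

scale_a = [99.8, 100.1, 100.0, 100.2, 99.9]     # accurate and precise
scale_b = [102.0, 102.1, 101.9, 102.0, 102.0]   # precise but not accurate
scale_c = [95.0, 105.0, 98.0, 102.0, 100.0]     # accurate on average, not precise

for name, readings in [("A", scale_a), ("B", scale_b), ("C", scale_c)]:
    print(name, round(bias(readings, TRUE_WEIGHT), 2), round(spread(readings), 2))
```

Scale B shows why the two criteria are separate: its readings agree closely with one another, yet all of them are about 2 units away from the true value.
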