Data and Modeling

Subdecks (9)

Cards (1024)

  • Distinguishing random from systematic variation is important because it helps us understand the underlying processes
  • The problem of randomness cannot be eliminated, but it can be understood through probability and stochastic thinking
  • Under a deterministic view, a causal relationship can explain the world completely
  • We can only say that something is likely to occur
  • Probability
    The number of desired outcomes divided by the number of all possible outcomes in an experiment or observation, assuming the outcomes are equally likely
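    As a minimal sketch of this ratio, assuming equally likely outcomes, the example below counts favorable outcomes for a fair six-sided die. The function name `classical_probability` and the die example are illustrative assumptions, not part of the deck.

```python
from fractions import Fraction


def classical_probability(desired, outcomes):
    """Share of equally likely outcomes that belong to the desired set."""
    outcomes = list(outcomes)
    favorable = [o for o in outcomes if o in desired]
    return Fraction(len(favorable), len(outcomes))


# Hypothetical example: probability of rolling an even number on a fair die.
die = range(1, 7)
p_even = classical_probability({2, 4, 6}, die)
print(p_even)  # 1/2
```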
  • Total probability
    The fundamental rule relating marginal probabilities to conditional probabilities
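    The rule can be written as P(A) = sum over i of P(A | B_i) * P(B_i), where the events B_i are mutually exclusive and cover all cases. A minimal sketch follows; the two-machine defect scenario and its numbers are hypothetical and chosen only for illustration.

```python
def total_probability(priors, conditionals):
    """P(A) = sum_i P(A | B_i) * P(B_i) for a partition B_1, ..., B_n."""
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("the P(B_i) must sum to 1")
    return sum(p_b * p_a_given_b for p_b, p_a_given_b in zip(priors, conditionals))


# Hypothetical numbers: machine 1 makes 60% of parts with a 1% defect rate,
# machine 2 makes 40% of parts with a 5% defect rate.
p_defect = total_probability([0.6, 0.4], [0.01, 0.05])
print(round(p_defect, 3))  # 0.026
```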
  • Joint probability
    The likelihood of two events occurring together and at the same point in time
  • Random variable
    A variable taking on numerical values determined by the outcome of a random phenomenon
  • Statistical analysis attempts to separate the signal in the data from the noise
  • Random variation
    Variability of a process caused by many irregular fluctuations or chance factors that cannot be anticipated, detected, identified, or eliminated
  • Determinism
    All events are completely determined by previously existing causes
  • Discrete uniform distribution
    A symmetric probability distribution where a finite number of values of X are equally likely to be observed
  • Likelihood
    How probable a given value is under a probability distribution; the distribution can be used to calculate it
  • Probability distribution
    A function of variable X that, evaluated at x, gives the probability that X will take a value equal to x
  • Geometric distribution
    The probability distribution of the number of independent trials needed to get one success, where each trial succeeds with probability p
  • Bayes theorem
    Calculates conditional probabilities and combines subjective or prior knowledge with objective current info to derive meaningful outcomes
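    For two complementary hypotheses, the theorem reads P(H | E) = P(E | H) * P(H) / P(E), with P(E) obtained from total probability. The sketch below assumes a hypothetical screening test; the prior, sensitivity, and false-positive rate are made-up illustration values.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(H | E) from P(H), P(E | H), and P(E | not H)."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical values: 1% prior, 95% sensitivity, 5% false-positive rate.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(posterior, 2))  # 0.16
```

    Even with a fairly accurate test, the low prior keeps the posterior small, which is the kind of combination of prior knowledge and current evidence the card describes.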
  • Certainty is usually unjustified, but uncertainty makes us uncomfortable
  • Probability
    Can take values between 0 and 1, where 0 is impossible and 1 is certain
  • Stochastic thinking
    Involves probability
  • Conditional probability
    The measure of the probability of an event occurring, given that another event has already occurred
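    In symbols, P(A | B) = P(A and B) / P(B), provided P(B) > 0. A minimal sketch by enumeration over a fair die; the helper names `prob` and `conditional_probability` are illustrative assumptions.

```python
from fractions import Fraction


def prob(event, sample_space):
    """Probability of an event (a set) over equally likely outcomes."""
    return Fraction(sum(1 for o in sample_space if o in event), len(sample_space))


def conditional_probability(a, b, sample_space):
    """P(A | B) = P(A and B) / P(B)."""
    p_b = prob(b, sample_space)
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return prob(a & b, sample_space) / p_b


die = list(range(1, 7))
greater_than_three = {4, 5, 6}
even = {2, 4, 6}
p_a_given_b = conditional_probability(greater_than_three, even, die)
print(p_a_given_b)  # 2/3
```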
  • Random variable
    A variable whose value is unknown in advance, or a function that assigns a value to each outcome
  • Discrete variable
    A variable that takes a finite or countably infinite set of values, usually integer counts
  • There are two alternatives in the Geometric distribution: one deals with the number of trials and the other deals with the number of failures
  • Bernoulli distribution
    The probability distribution of a random variable which takes the value "1" with probability p and the value "0" with probability q=1-p
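    A minimal sketch of the Bernoulli probability function and a single draw, using only the standard library. The function names are illustrative assumptions.

```python
import random


def bernoulli_pmf(k, p):
    """P(X = k) for X ~ Bernoulli(p): p for k = 1, q = 1 - p for k = 0."""
    if k == 1:
        return p
    if k == 0:
        return 1 - p
    return 0.0


def bernoulli_sample(p, rng):
    """Draw 1 with probability p, otherwise 0."""
    return 1 if rng.random() < p else 0


rng = random.Random(0)  # fixed seed so the draws are repeatable
print(bernoulli_pmf(1, 0.25), bernoulli_pmf(0, 0.25))  # 0.25 0.75
print([bernoulli_sample(0.25, rng) for _ in range(10)])
```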
  • Cumulative distribution
    A function of variable X that, evaluated at x, gives the probability that X will take a value less than or equal to x
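    For a discrete variable, the cumulative value at x is the sum of the point probabilities of all values up to and including x. A minimal sketch for a fair die; `die_pmf` and `cdf` are illustrative names.

```python
from fractions import Fraction

# Point probabilities of a fair six-sided die.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def cdf(pmf, x):
    """P(X <= x): sum the probabilities of all values not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))


print(cdf(die_pmf, 3))  # 1/2
print(cdf(die_pmf, 6))  # 1
```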
  • Natural or unnatural phenomena usually have random variation
  • Probability
    The proportion of times an event occurs in a long run sequence or number of trials
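    The long-run view can be illustrated by simulation: as the number of trials grows, the relative frequency of an event tends to settle near its probability. This is a sketch with a fixed seed; the function name and trial counts are assumptions.

```python
import random


def relative_frequency(p, n_trials, seed=0):
    """Proportion of n_trials in which an event of probability p occurs."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_trials) if rng.random() < p)
    return hits / n_trials


# The proportion should drift toward 0.5 as the number of trials increases.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(0.5, n))
```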
  • Independence of events
    Events are independent if the occurrence of one does not affect the probability of occurrence of the other
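    Equivalently, A and B are independent when P(A and B) = P(A) * P(B). The sketch below checks this product rule by enumerating two fair dice; the event names are hypothetical examples.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
two_dice = list(product(range(1, 7), repeat=2))


def prob(predicate):
    """Probability that an outcome satisfies the predicate."""
    return Fraction(sum(1 for o in two_dice if predicate(o)), len(two_dice))


def are_independent(pred_a, pred_b):
    """Check the product rule P(A and B) == P(A) * P(B)."""
    p_joint = prob(lambda o: pred_a(o) and pred_b(o))
    return p_joint == prob(pred_a) * prob(pred_b)


def first_even(o):
    return o[0] % 2 == 0


def second_is_six(o):
    return o[1] == 6


def sum_is_twelve(o):
    return o[0] + o[1] == 12


print(are_independent(first_even, second_is_six))  # True
print(are_independent(first_even, sum_is_twelve))  # False
```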
  • Conditional independence
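    Two random events A and B are conditionally independent given a third event C if, once C is known to have occurred, the occurrence of A does not affect the probability of B: P(A and B | C) = P(A | C) P(B | C)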
  • Events can occur multiple times (N times)
  • The world is possibly inherently unpredictable, and we do not have all the knowledge to make accurate predictions
  • Under a deterministic view, a cause will always have an effect
  • Discrete distributions
    • Discrete uniform distribution
    • Bernoulli distribution
    • Binomial distribution
    • Geometric distribution
  • Random variable
    Can be either discrete or continuous
  • Random variation
    The sum of many small variations inherent in a process, which cannot be tracked back to a root cause
  • Binomial distribution
    The probability distribution of the number of successes in a sequence of n independent experiments, each with success probability p
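    The probability of exactly k successes in n trials is C(n, k) * p^k * (1 - p)^(n - k). A minimal sketch using `math.comb` from the standard library; the function name `binomial_pmf` is an illustrative assumption.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of exactly 2 heads in 4 fair coin flips.
print(binomial_pmf(2, 4, 0.5))  # 0.375
```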
  • Continuous variable
    A variable that can take infinitely many values within some interval of numbers
  • Geometric distribution
    • Deals with number of trials
    • Deals with number of failures
    • Useful for assessing reliability and survival analysis
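    The two conventions differ only in where counting starts: the trials version counts the trial on which the first success occurs (k = 1, 2, ...), while the failures version counts the failures before it (k = 0, 1, ...). A minimal sketch of both; the function names are illustrative assumptions.

```python
def geometric_pmf_trials(k, p):
    """P(first success occurs on trial k), for k = 1, 2, ..."""
    if k < 1:
        return 0.0
    return (1 - p) ** (k - 1) * p


def geometric_pmf_failures(k, p):
    """P(exactly k failures before the first success), for k = 0, 1, ..."""
    if k < 0:
        return 0.0
    return (1 - p) ** k * p


# The same event described two ways: success on trial 3 = 2 failures first.
print(geometric_pmf_trials(3, 0.5))    # 0.125
print(geometric_pmf_failures(2, 0.5))  # 0.125
```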
  • Continuous uniform distribution
    Symmetric probability distribution describing an experiment where all outcomes between two boundaries are equally likely
  • Log-normal distribution
    Continuous probability distribution of a random variable whose logarithm is normally distributed, useful for variables that cannot be negative
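    One way to see the definition is to exponentiate normally distributed draws: the results are always positive and their logarithms are normal again. This is a sketch using the standard library's `random.gauss`; the parameter values and the function name are assumptions.

```python
import math
import random


def lognormal_samples(mu, sigma, n, seed=0):
    """Draw n log-normal values by exponentiating normal draws."""
    rng = random.Random(seed)
    return [math.exp(rng.gauss(mu, sigma)) for _ in range(n)]


samples = lognormal_samples(mu=0.0, sigma=0.5, n=10_000)

# Every sample is strictly positive, and the mean of the logs is close to mu.
mean_log = sum(math.log(x) for x in samples) / len(samples)
print(min(samples) > 0)
print(round(mean_log, 2))
```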