Data and Modeling

    • It is important to distinguish random from systematic variation because doing so helps us understand the underlying processes
    • The problem of randomness cannot be eliminated, but it can be understood through probability and stochastic thinking
    • Under determinism, a causal relationship can explain the world completely
    • We can only say that something is likely to occur
    • Probability
      The number of desired outcomes divided by the number of all possible, equally likely outcomes in an experiment or observation
    • Total probability
      The fundamental rule relating marginal probabilities to conditional probabilities
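
As a rough illustration of the total probability rule, the sketch below computes a marginal probability P(A) as the sum of P(A | B_i) × P(B_i) over a partition B_1..B_n. The function name `total_probability` and the machine/defect numbers are invented for this example.

```python
def total_probability(priors, conditionals):
    """Return P(A) = sum_i P(A | B_i) * P(B_i) over a partition B_1..B_n.

    priors       -- P(B_i) for each part of the partition (must sum to 1)
    conditionals -- P(A | B_i), in the same order
    """
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1")
    return sum(p_b * p_a_given_b for p_b, p_a_given_b in zip(priors, conditionals))


# Hypothetical numbers: two machines make 60% and 40% of all items,
# with defect rates of 1% and 5% respectively.
p_defect = total_probability([0.6, 0.4], [0.01, 0.05])
print(p_defect)  # marginal probability that a randomly chosen item is defective
```

Here the marginal defect probability works out to 0.6 × 0.01 + 0.4 × 0.05 = 0.026.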
    • Joint probability
      The likelihood of two events occurring together and at the same point in time
    • Random variable
      A variable taking on numerical values determined by the outcome of a random phenomenon
    • Statistical analysis attempts to separate the signal in the data from the noise
    • Random variation
      Variability of a process caused by many irregular fluctuations or chance factors that cannot be anticipated, detected, identified, or eliminated
    • Determinism
      All events are completely determined by previously existing causes
    • Discrete uniform distribution
      A symmetric probability distribution where a finite number of values of X are equally likely to be observed
    • Likelihood
      A probability distribution can be used to calculate how likely an observed value is
    • Probability distribution
      For a discrete variable X, the function that, evaluated at x, gives the probability that X will take a value equal to x
    • Geometric distribution
      The probability distribution of the number of independent trials needed to get the first success, where each trial succeeds with probability p
    • Bayes theorem
      Calculates conditional probabilities by combining prior (possibly subjective) knowledge with objective current information to derive an updated probability
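
A minimal sketch of Bayes' theorem for a binary hypothesis H and evidence E, P(H | E) = P(E | H) × P(H) / P(E), where P(E) comes from the total probability rule. The function name and the screening-test numbers are assumptions made up for illustration.

```python
def bayes_posterior(prior, likelihood, likelihood_given_not):
    """Return P(H | E) for a binary hypothesis H.

    prior                -- P(H), belief before seeing the evidence
    likelihood           -- P(E | H)
    likelihood_given_not -- P(E | not H)
    """
    # P(E) via the total probability rule over {H, not H}
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical screening test: 1% base rate, 95% sensitivity,
# 5% false-positive rate.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, likelihood_given_not=0.05)
print(round(posterior, 3))
```

With these made-up numbers the posterior is about 0.16: even after a positive result the hypothesis remains fairly unlikely, because the prior is so low.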
    • Certainty is usually unjustified, but uncertainty makes us uncomfortable
    • Probability
      Can take values between 0 and 1, where 0 is impossible and 1 is certain
    • Stochastic thinking
      Involves reasoning in terms of probability rather than certainty
    • Conditional probability
      The measure of the probability of an event occurring, given that another event has already occurred
    • Random variable
      A variable whose value is unknown, or a function that assigns a value to each outcome
    • Discrete variable
      A variable that takes a finite or countable set of values, usually integer counts
    • There are two variants of the Geometric distribution: one counts the number of trials up to and including the first success, and the other counts the number of failures before it
    • Bernoulli distribution
      The probability distribution of a random variable which takes the value "1" with probability p and the value "0" with probability q=1-p
    • Cumulative distribution
      The function of variable X, evaluated at x, is the probability that X will take a value less than or equal to x
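
To make the difference between a probability distribution and a cumulative distribution concrete, here is a small sketch for a fair die (a discrete uniform distribution). The helper names `die_pmf` and `die_cdf` are hypothetical.

```python
from fractions import Fraction


def die_pmf(x, sides=6):
    """P(X = x) for a fair die: every face is equally likely."""
    return Fraction(1, sides) if x in range(1, sides + 1) else Fraction(0)


def die_cdf(x, sides=6):
    """P(X <= x): add up the PMF over all faces not exceeding x."""
    return sum((die_pmf(k, sides) for k in range(1, sides + 1) if k <= x), Fraction(0))


print(die_pmf(3))  # probability of rolling exactly 3
print(die_cdf(3))  # probability of rolling 3 or less
```

Exact fractions are used instead of floats so that the cumulative value reaches exactly 1 at the largest face.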
    • Natural or unnatural phenomena usually have random variation
    • Probability
      The proportion of times an event occurs in a long run sequence or number of trials
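
The long-run view of probability can be sketched with a simple simulation: as the number of trials grows, the observed proportion of successes tends to settle near the underlying probability. The function name, the seed, and the trial counts are arbitrary choices for illustration.

```python
import random


def relative_frequency(n_trials, p=0.5, seed=0):
    """Simulate n_trials independent trials with success probability p
    and return the observed proportion of successes."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(rng.random() < p for _ in range(n_trials))
    return hits / n_trials


# The proportion is expected to drift toward p as the run gets longer.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```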
    • Independence of events
      Events are independent if the occurrence of one does not affect the probability of occurrence of the other
    • Conditional independence
      Two random events A and B are conditionally independent given a third event C if P(A and B | C) = P(A | C) × P(B | C)
    • Events can occur multiple times (N times)
    • The world is possibly inherently unpredictable, and we do not have all the knowledge to make accurate predictions
    • A cause will always have an effect
    • Discrete distributions
      • Discrete uniform distribution
      • Bernoulli distribution
      • Binomial distribution
      • Geometric distribution
    • Random variable
      Can be either discrete or continuous
    • Random variation
      The sum of many small variations inherent in a process, which cannot be traced back to a root cause
    • Binomial distribution
      The probability distribution of the number of successes in a sequence of n independent experiments, each with success probability p
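
A short sketch of the binomial probability P(X = k) = C(n, k) × p^k × (1 − p)^(n − k), using only the standard library. The function name `binomial_pmf` is an illustrative choice; setting n = 1 recovers the Bernoulli distribution.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of exactly 2 heads in 4 fair coin flips.
print(binomial_pmf(2, 4, 0.5))

# Bernoulli as the special case n = 1: P(X = 1) = p and P(X = 0) = 1 - p.
print(binomial_pmf(1, 1, 0.3), binomial_pmf(0, 1, 0.3))
```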
    • Continuous variable
      A variable that can take infinitely many values within some interval of numbers
    • Geometric distribution
      • Deals with number of trials
      • Deals with number of failures
      • Useful for assessing reliability and survival analysis
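
The two variants of the geometric distribution differ only in what is counted. A minimal sketch, with hypothetical function names, assuming independent trials with success probability p:

```python
def geometric_pmf_trials(k, p):
    """P(first success occurs on trial k), for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p


def geometric_pmf_failures(k, p):
    """P(exactly k failures occur before the first success), for k = 0, 1, 2, ..."""
    return (1 - p) ** k * p


# "First success on trial 3" is the same event as "2 failures before the
# first success", so both variants assign it the same probability.
print(geometric_pmf_trials(3, 0.5), geometric_pmf_failures(2, 0.5))
```

The variants are shifted by one: the trial-counting form has mean 1/p, while the failure-counting form has mean (1 − p)/p.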
    • Continuous uniform distribution
      Symmetric probability distribution describing an experiment where all outcomes between certain boundaries are equally likely
    • Log-normal distribution
      Continuous probability distribution of a random variable whose logarithm is normally distributed, useful for variables that cannot be negative
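
As a hedged illustration of the log-normal definition, the sketch below draws samples with the standard library's `random.lognormvariate` and checks the two properties named above: every sample is positive, and the logarithms of the samples behave like normal draws. The parameters (mu = 0, sigma = 0.5), the seed, and the sample size are arbitrary choices.

```python
import math
import random

rng = random.Random(0)  # fixed seed for reproducibility
mu, sigma = 0.0, 0.5    # parameters of the underlying normal distribution

samples = [rng.lognormvariate(mu, sigma) for _ in range(10_000)]

# A log-normal variable cannot be negative (or zero).
print(min(samples) > 0)

# Taking logs should give values centered near mu with spread near sigma.
logs = [math.log(s) for s in samples]
log_mean = sum(logs) / len(logs)
log_sd = math.sqrt(sum((v - log_mean) ** 2 for v in logs) / (len(logs) - 1))
print(log_mean, log_sd)
```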