Part 5: Learning Bayesian networks

  • Manual construction of a model for a given problem may be impossible:
    • an expert is not available
    • the problem is too complex

    goal: construct a structured model of the (hidden) distribution most likely underlying the observed samples (Automatic Model Learning)
  • Automatic model learning assumptions:
    • unknown distribution
    • training examples are representative of the world
    task: learn a model whose distribution is an approximation to the "training set" model and whose graph structure reflects the true (in)dependencies in the world
  • Learning as optimisation, general approach:
    • define an objective function F(M, D): a measure that estimates how "good" a given model M is in relation to the given training examples D
    • develop an algorithm to find the model that maximises F (see the sketch below)
    learning is thus a search/optimisation problem
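
A minimal Python sketch of this idea (the names `candidate_models` and `objective` are illustrative placeholders, not from the deck): score each candidate model with F(M, D) and keep the maximiser.

```python
def learn_best_model(candidate_models, data, objective):
    """Return the candidate M that maximises the objective F(M, D).
    `objective(model, data)` could be, for example, the log-likelihood."""
    return max(candidate_models, key=lambda model: objective(model, data))
```
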
  • Likelihood of a Model M
    relative to a dataset D is the probability that the model assigns to the set D: $L(M:D) = P_M(D)$
  • If the examples in D are independent and identically distributed (i.i.d.), the likelihood L(M:D) is $L(M:D) = P_M(D) = \prod_{x_i \in D} P_M(x_i)$
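
A small Python sketch of this product (the callable `model_prob` is an assumed stand-in for $P_M$, not defined in the deck):

```python
import math

def likelihood(model_prob, data):
    """L(M:D) = product over the i.i.d. examples x_i of P_M(x_i).
    `model_prob(x)` is assumed to return the probability M assigns to x."""
    return math.prod(model_prob(x) for x in data)
```
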
  • Likelihood: the product of the probabilities assigned by the model to the individual training examples
  • Problems with the likelihood function:
    • the probability will be minuscule
    • arithmetic underflow
    solution: log-likelihood
  • The Log-likelihood l(M:D) of a Model M relative to a dataset D is the logarithm of the likelihood
    $l(M:D) = \log L(M:D) = \log \prod_{x_i \in D} P_M(x_i) = \sum_{x_i \in D} \log P_M(x_i)$
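
A minimal sketch of the sum-of-logs form, reusing the assumed `model_prob` interface from above. It sidesteps the underflow problem: 1000 examples with probability 0.01 each give a product of 1e-2000, which underflows to 0.0 in floating point, while the log-likelihood is simply 1000 * log(0.01).

```python
import math

def log_likelihood(model_prob, data):
    """l(M:D) = sum of log P_M(x_i) over the training examples.
    Summing logs avoids the underflow of multiplying tiny probabilities."""
    return sum(math.log(model_prob(x)) for x in data)

# Illustration of the underflow problem:
# math.prod([0.01] * 1000) underflows to 0.0,
# whereas sum(math.log(0.01) for _ in range(1000)) is about -4605.17.
```
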
  • Likelihood and log-likelihood are monotonically related: l(M:D) has its maximum where L(M:D) is maximal
  • to compensate for overfitting --> we need a model that generalises
  • Generalisation: the model must be more general than a simple summary of the training set
    Overfitting: a model that exactly fits the training data but is not useful for queries about new situations (see the held-out comparison sketch below)
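
One way to make the overfitting card concrete (a sketch assuming the `log_likelihood` helper from above and the availability of a held-out dataset): compare the model's per-example log-likelihood on the training data with its per-example log-likelihood on data it has never seen.

```python
def generalisation_gap(model_prob, train_data, heldout_data):
    """Per-example log-likelihood difference between training and held-out data.
    A large positive gap suggests the model overfits the training set."""
    train_avg = log_likelihood(model_prob, train_data) / len(train_data)
    heldout_avg = log_likelihood(model_prob, heldout_data) / len(heldout_data)
    return train_avg - heldout_avg
```
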
  • Bias: a potential source of error introduced by restricting the expressivity of the model class
  • Bias vs Variance
    Put constraints on the class of models allowed to be learned:
    • hard constraint: strictly restricts the class of models
    • soft constraint: add a regularisation term to the objective function that penalises complexity (see the sketch below)
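
A sketch of such a soft constraint (the penalised objective, the `model.prob` / `model.num_parameters` interface, and the choice of parameter count as the complexity measure are illustrative assumptions, not from the deck): F(M, D) = l(M:D) - lambda * complexity(M).

```python
import math

def soft_constrained_objective(model, data, lam=1.0):
    """Penalised objective F(M, D) = l(M:D) - lam * complexity(M).
    `model.prob(x)` and `model.num_parameters` are assumed interfaces;
    `lam` controls how heavily complex models are penalised."""
    ll = sum(math.log(model.prob(x)) for x in data)
    return ll - lam * model.num_parameters
```
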
  • Variance: a potential source of error introduced by permitting high expressivity of the model class
  • Bias-Variance tradeoff
    • restricting ourselves to simple models makes the hypothesis space smaller and increases the probability of bias error
    • on the other hand, in a smaller hypothesis space it is less likely that we find an overfitting model
    vs.
    • permitting complex models reduces the probability of bias error
    • but introduces variance as a potential source of error