A Hidden Markov Model (HMM) is a State-Observation Model with a single discrete state variable S and a single observation variable O
In an HMM, the transition model often encodes many structural constraints
-> the resulting model is sparse
Graphical representation of the transition model (figure: states connected by arcs labelled with example transition probabilities such as 0.3, 0.5, 0.9)
Hidden Markov Model (HMM)
A) state variable
B) observation variable
C) starting probabilities
D) state transition probabilities
E) observation probabilities
Simple example (figure: transition model with example probabilities such as 0.3)
Simple example observation model (figure: observation probabilities such as 0.6): the state sequence is not directly observable, only the observation sequence is
Complete specification: an HMM is completely specified by its parameters pi (starting probabilities), A (state transition probabilities) and B (observation probabilities); the figure shows example entries such as 0.2 and 0.4
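As a concrete illustration, here is a minimal sketch of how such a complete specification could be written down in Python/NumPy; the state and observation names loosely follow the lecture's weather example, but all numerical values are invented for illustration.

```python
import numpy as np

# Illustrative state and observation alphabets (assumed, loosely following
# the lecture's weather example); the probabilities below are made up.
states = ["rainy", "cloudy", "sunny"]
observations = ["humid", "medium", "dry"]

# pi: starting probabilities, one entry per state.
pi = np.array([0.3, 0.3, 0.4])

# A[i, j]: probability of moving from state i to state j.
A = np.array([
    [0.5, 0.3, 0.2],
    [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6],
])

# B[i, k]: probability of emitting observation symbol k in state i.
B = np.array([
    [0.6, 0.3, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6],
])

# Sanity checks: every distribution sums to 1.
assert np.isclose(pi.sum(), 1.0)
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```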
HMMs: common reasoning tasks:
Filtering/tracking
Prediction
Smoothing
Decoding
Computing the probability of an observation sequence
Filtering (worked example, two steps): the figures step through the filtering update on the weather example, combining observation probabilities such as P(h|r) and P(m|s) with transition probabilities such as P(r1|r0), P(r1|c0), P(s2|r1), P(s2|c1), P(s2|s1)
Observations are written like <humid, medium?>, as in the filtering sketch below
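A minimal sketch of the filtering update, assuming the illustrative pi, A, B layout from the specification sketch above (the numbers are not the lecture's): each step predicts with the transition model, weights by the observation probability, and normalises.

```python
import numpy as np

# Illustrative model (same layout as the specification sketch above).
pi = np.array([0.3, 0.3, 0.4])                 # starting probabilities
A = np.array([[0.5, 0.3, 0.2],                 # transition probabilities
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]])
B = np.array([[0.6, 0.3, 0.1],                 # observation probabilities
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]])

def filter_step(f_prev, obs):
    """One filtering update: predict with A, weight by P(obs | state), normalise."""
    predicted = A.T @ f_prev                   # sum over previous states
    unnormalised = B[:, obs] * predicted
    return unnormalised / unnormalised.sum()

# Filter the sequence <humid, medium> (observation indices 0 and 1).
f = pi * B[:, 0]
f = f / f.sum()                                # belief after seeing "humid"
f = filter_step(f, 1)                          # belief after seeing "medium"
print(f)                                       # P(state | <humid, medium>)
```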
Forward-backward algorithm:
A) initialise f
B) forward pass
C) initialise b
D) backward pass
The forward-backward algorithm needs re-scaling (or has to work with logarithms) to avoid numerical underflow; see the sketch below
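A sketch of the forward-backward algorithm with the re-scaling mentioned above: each forward message is scaled to sum to 1 and the scale factors are reused in the backward pass. Parameter names follow the earlier sketches; this is an illustration, not the lecture's code.

```python
import numpy as np

def forward_backward(pi, A, B, obs):
    """Smoothed state posteriors P(state_t | all observations), with re-scaling
    at each step to avoid numerical underflow (alternative: work in log space)."""
    T, N = len(obs), len(pi)
    f = np.zeros((T, N))        # scaled forward messages
    b = np.zeros((T, N))        # scaled backward messages
    scale = np.zeros(T)

    # Forward pass: initialise f, then propagate and re-scale.
    f[0] = pi * B[:, obs[0]]
    scale[0] = f[0].sum()
    f[0] /= scale[0]
    for t in range(1, T):
        f[t] = B[:, obs[t]] * (A.T @ f[t - 1])
        scale[t] = f[t].sum()
        f[t] /= scale[t]

    # Backward pass: initialise b with ones, propagate backwards.
    b[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        b[t] = A @ (B[:, obs[t + 1]] * b[t + 1])
        b[t] /= scale[t + 1]

    # Combine and normalise per time step.
    gamma = f * b
    return gamma / gamma.sum(axis=1, keepdims=True)
```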
Decoding: the most likely state sequence is not the same as the sequence of most likely states computed by filtering or smoothing
Naive decoding approach -> problem: a combinatorial number of possible state sequences
Viterbi algorithm idea:
A) forward pass: for each time step and each state, compute the probability of the most likely way of ending in that state
B) back pointer: each entry points to the predecessor state that maximises this probability
C) backtrack from the best final state along the back pointers to read off the most likely state sequence (see the sketch below)
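A minimal sketch of the Viterbi algorithm under these assumptions, working in log space so that very small probabilities do not underflow; the function and variable names are illustrative, not from the lecture.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most likely state sequence (decoding): the forward pass keeps, per state,
    the log-probability of the most likely sequence ending there plus a back
    pointer; then backtrack from the best final state."""
    T, N = len(obs), len(pi)
    logp = np.zeros((T, N))
    back = np.zeros((T, N), dtype=int)

    with np.errstate(divide="ignore"):      # allow log(0) = -inf for forbidden moves
        log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)

    logp[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = logp[t - 1][:, None] + log_A   # scores[i, j]: best path ending i -> j
        back[t] = scores.argmax(axis=0)
        logp[t] = scores.max(axis=0) + log_B[:, obs[t]]

    # Backtrack from the best final state along the back pointers.
    path = [int(logp[T - 1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```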
Computing the likelihood of an observation sequence under an HMM will be used for classification (e.g. pick the model that assigns the highest likelihood)
The probability of the model producing a sequence is equal to the sum, over all possible final states, of the probability of the model producing the sequence and ending in that state (law of total probability): P(O|model) = sum over states S of f_T(S), where f_T is the final forward message
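A sketch of this likelihood computation using the scaled forward pass from above; the commented classification line at the end assumes a hypothetical collection `models` whose elements carry `pi`, `A`, `B` attributes.

```python
import numpy as np

def log_likelihood(pi, A, B, obs):
    """log P(observations | model): run the forward pass and sum over all
    possible final states (law of total probability); the per-step scaling
    factors accumulate the log-likelihood."""
    f = pi * B[:, obs[0]]
    loglik = np.log(f.sum())
    f = f / f.sum()
    for o in obs[1:]:
        f = B[:, o] * (A.T @ f)
        loglik += np.log(f.sum())
        f = f / f.sum()
    return loglik

# Classification (hypothetical usage): score the sequence under each candidate
# model (e.g. one HMM per class) and pick the model with the highest likelihood.
# best = max(models, key=lambda m: log_likelihood(m.pi, m.A, m.B, obs))
```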
Where do HMMs come from?
the structure is automatically defined
problem: find good parameters pi, A, B
the model parameters must be learned
HMM parameter estimation is an optimisation problem
Learning HMMs from Observation Sequences
A) given: the set of hidden states
B) given: the set of observation symbols
C) given: a training set of observation sequences
D) find: the parameters pi, A, B
E) that maximise the likelihood of the training set
Central problem in learning HMMs:
the model contains unobservable (hidden) variables
the states cannot be observed in the training sequences
the problem concerns all three sets of parameters (pi, A, B)
Summary of the dilemma:
if we knew the hidden counts, we could estimate the model parameters
if we knew the model parameters, we could estimate the hidden counts
The general re-estimation procedure:
A) initialise the parameters with random distributions
B) repeat until convergence:
C) Expectation step: compute the expected hidden counts under the current parameters
D) Maximisation step: re-estimate the parameters from these expected counts
Computing the expected values of these hidden counts uses:
A) the (likelihood) forward message f
B) the backward message b
Baum-Welch is an instance of a very general algorithm schema (Expectation-Maximisation) for estimating hidden model parameters by iterating between:
aligning the training data to the current model
refitting the parameters of the model (see the sketch below)
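A compact sketch of one Baum-Welch iteration for discrete observations, assuming the array layout of the earlier sketches; it uses unscaled forward/backward messages for readability, whereas a practical implementation would re-scale or work in log space as noted above.

```python
import numpy as np

def baum_welch_step(pi, A, B, sequences):
    """One EM (Baum-Welch) iteration: the E-step computes expected hidden counts
    from forward/backward messages, the M-step re-estimates pi, A, B from them."""
    N, M = A.shape[0], B.shape[1]
    pi_num = np.zeros(N)
    A_num, A_den = np.zeros((N, N)), np.zeros(N)
    B_num, B_den = np.zeros((N, M)), np.zeros(N)

    for obs in sequences:
        T = len(obs)
        # Forward and backward messages (unscaled, for readability only).
        f = np.zeros((T, N))
        b = np.zeros((T, N))
        f[0] = pi * B[:, obs[0]]
        for t in range(1, T):
            f[t] = B[:, obs[t]] * (A.T @ f[t - 1])
        b[T - 1] = 1.0
        for t in range(T - 2, -1, -1):
            b[t] = A @ (B[:, obs[t + 1]] * b[t + 1])
        seq_lik = f[T - 1].sum()

        # E-step: expected state occupancies (gamma) and transitions (xi).
        gamma = f * b / seq_lik
        pi_num += gamma[0]
        for t in range(T - 1):
            xi = (f[t][:, None] * A * B[:, obs[t + 1]] * b[t + 1]) / seq_lik
            A_num += xi
            A_den += gamma[t]
        for t in range(T):
            B_num[:, obs[t]] += gamma[t]
            B_den += gamma[t]

    # M-step: re-estimate the parameters from the expected counts.
    return (pi_num / len(sequences),
            A_num / A_den[:, None],
            B_num / B_den[:, None])
```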
Constrained transition model structures (example transition matrix below):
A) fully connected
B) left-to-right model permitting skips
C) parallel-path left-to-right model
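A small sketch of how a constrained structure such as a left-to-right model permitting skips translates into a sparse transition matrix; the number of states and the uniform probabilities are illustrative.

```python
import numpy as np

# Left-to-right transition model permitting skips of at most one state:
# from state i you may stay (i), advance (i+1) or skip one state (i+2);
# all other transitions are structurally forbidden (probability 0),
# so the matrix is sparse.
N = 5
A = np.zeros((N, N))
for i in range(N):
    allowed = [j for j in (i, i + 1, i + 2) if j < N]
    A[i, allowed] = 1.0 / len(allowed)   # illustrative uniform values

print(A)
```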
If observations are vectors of continuous feature values -> solutions: discretisation / vector quantisation, or continuous probability density models
Solution 1, vector quantisation: discretise the continuous observations into discrete intervals (codebook entries) and use these as observation values -> possibly a huge number of values, impractical to represent, and the quantisation introduces large errors (see the sketch below)
Solution 2: continuous probability density models
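A minimal sketch of vector quantisation under these assumptions: each continuous feature vector is replaced by the index of its nearest codebook entry, which can then serve as a discrete observation symbol. The random codebook stands in for one learned from data (e.g. with k-means).

```python
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 3))      # 16 codewords for 3-d feature vectors
features = rng.normal(size=(100, 3))     # a sequence of continuous observations

# Vector quantisation: replace each feature vector by the index of the
# nearest codeword; the indices are then the HMM's discrete observations.
dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
obs = dists.argmin(axis=1)
print(obs[:10])
```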
Gaussian Observation Models: a separate normal (Gaussian) distribution per state
The multivariate case: a multivariate Gaussian per state, with a mean vector and a covariance matrix
Problem with the multivariate case:
it is highly unlikely that the joint distribution of N variables will be well described by a single Gaussian
Gaussian mixture model: model the joint conditional distribution by a weighted combination of single Gaussians (see the sketch below)
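A small sketch of a per-state Gaussian mixture observation model, assuming SciPy's multivariate normal density; the mixture weights, means and covariances are invented for illustration.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_observation_prob(o, weights, means, covs):
    """b_i(o) for one state: a weighted combination of single Gaussians."""
    return sum(w * multivariate_normal.pdf(o, mean=m, cov=c)
               for w, m, c in zip(weights, means, covs))

# Illustrative 2-component mixture for one state over 2-d observations.
weights = [0.6, 0.4]
means = [np.array([0.0, 0.0]), np.array([3.0, 1.0])]
covs = [np.eye(2), np.array([[1.0, 0.3], [0.3, 0.5]])]

print(gmm_observation_prob(np.array([1.0, 0.5]), weights, means, covs))
```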