Acoustic Model and Language Model
This topic contains 1 reply, has 2 voices, and was last updated by Simon 2 weeks, 1 day ago.

AuthorPosts

November 7, 2017 at 15:41 #8283
I don’t think I understand what these are.
From Jurafsky and Martin, I read that these models calculate some probabilities. The acoustic model has something to do with the input waveforms. The language model predicts how likely a word is in a sentence?
Please can someone clearly explain what these models represent.

November 7, 2017 at 21:47 #8285
The best route to understanding this is first to understand Bayes’ rule.
If W is a word sequence and O is the observed speech signal:
The language model represents our prior beliefs about what sequences of words are more or less likely. We say “prior” because this is knowledge that we have before we even hear (or “observe”) any speech signal. The language model computes P(W). Notice that O is not involved.
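To make P(W) concrete, here is a minimal sketch of a toy bigram language model in Python. The probability table, the UNSEEN floor and the function name prior_log_prob are all made up for illustration; a real language model would estimate its probabilities from a large text corpus. Note that the speech signal O appears nowhere in it.

```python
import math

# Hypothetical bigram probabilities P(word | previous word).
# "<s>" marks the start of the sentence. The numbers are invented
# for illustration, not estimated from any data.
BIGRAM_PROBS = {
    ("<s>", "recognise"): 0.6,
    ("recognise", "speech"): 0.9,
    ("<s>", "wreck"): 0.4,
    ("wreck", "a"): 0.5,
    ("a", "nice"): 0.2,
    ("nice", "beach"): 0.1,
}

# Small floor probability for word pairs missing from the table (an assumption).
UNSEEN = 1e-6


def prior_log_prob(words):
    """Return log P(W) for a list of words under the toy bigram model."""
    total = 0.0
    previous = "<s>"
    for word in words:
        total += math.log(BIGRAM_PROBS.get((previous, word), UNSEEN))
        previous = word
    return total


log_p_speech = prior_log_prob(["recognise", "speech"])
log_p_beach = prior_log_prob(["wreck", "a", "nice", "beach"])
print(log_p_speech > log_p_beach)
```

Under these invented numbers the model considers "recognise speech" more probable than "wreck a nice beach" before any audio has been heard, which is what "prior" means here.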
When a generative model, such as an HMM, is used as the acoustic model, it computes the probability of the observed speech signal given a possible word sequence. This is called the likelihood and is written P(O|W).
Neither of those quantities is what we actually need if we are trying to decide what was said. We actually want to calculate the probability of every possible word sequence, given the speech signal, so that we can choose the most probable one. This quantity is called the posterior, because we can only know its value after observing the speech, and is written P(W|O).
Bayes’ rule tells us how we can combine the prior and the likelihood to calculate the posterior, or at least something proportional to it, which is good enough for our purposes of choosing the value of W that maximises P(W|O).
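Written out, Bayes’ rule is P(W|O) = P(O|W) P(W) / P(O). Since P(O) is the same for every candidate W, the product P(O|W) P(W) is enough for picking the best W. The following Python sketch illustrates that decision for two candidate word sequences. All the scores are invented for illustration, and the names candidates, decode and posteriors are hypothetical; in a real recogniser the likelihoods would come from the acoustic model and the priors from the language model.

```python
import math

# Hypothetical scores for two candidate word sequences W, given one utterance O.
# "log_likelihood" stands in for log P(O|W) from the acoustic model and
# "log_prior" for log P(W) from the language model. All numbers are invented.
candidates = {
    "recognise speech": {
        "log_likelihood": math.log(0.002),
        "log_prior": math.log(0.54),
    },
    "wreck a nice beach": {
        "log_likelihood": math.log(0.003),
        "log_prior": math.log(0.004),
    },
}


def unnormalised_log_posterior(scores):
    """log P(O|W) + log P(W), which equals log P(W|O) plus a constant."""
    return scores["log_likelihood"] + scores["log_prior"]


def decode(cands):
    """Choose the W that maximises P(O|W) P(W)."""
    return max(cands, key=lambda w: unnormalised_log_posterior(cands[w]))


def posteriors(cands):
    """Normalise over the listed candidates only to obtain P(W|O)."""
    joint = {w: math.exp(unnormalised_log_posterior(s)) for w, s in cands.items()}
    evidence = sum(joint.values())  # plays the role of P(O)
    return {w: p / evidence for w, p in joint.items()}


best = decode(candidates)
print(best)
print(posteriors(candidates))
```

With these made-up numbers the acoustic model on its own slightly prefers "wreck a nice beach", but once the prior is taken into account "recognise speech" wins. Dividing by P(O) changes the scale of the scores but not which candidate comes out on top.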
You might think this is rather abstract and conceptually hard. You’d be right. Developing both an intuitive and formal understanding of probabilistic modelling takes some time.
