Speech Processing

Starts with an introduction that assumes no background knowledge, then covers text-to-speech synthesis and automatic speech recognition.

This is my Speech Processing course, as taught at the University of Edinburgh. It’s divided into three parts: the basics, speech synthesis, and speech recognition.

  • What background do you need?

    I have found that students do better when they come from particular backgrounds.

  • Weekly schedule

    This is your guide to the topics that will be covered each week. It tells you what you need to do before each lecture. It also specifies the coursework deadlines.

  • Readings

    Arranged into lists by lecture and importance.

  • Foundation material

    Optional material to help you fill in gaps in your prior knowledge.

  • The basics

    We need to get the basics right before going any further, and that’s what this part is all about. What exactly is speech? How can we inspect it? What tools and methods do we need?

  • Speech synthesis

    Text-to-speech synthesis, including text processing and waveform generation by concatenation of diphones.

  • Automatic speech recognition

    Automatic speech recognition using Hidden Markov Models and simple language models.