speech.zone

Simon October 31, 2015

The speed of sound

At the Parque de las Ciencias in Granada, Spain there is this long tube, open at the end nearest you and closed at the far end.

We can calculate the length of this tube just from the audio recording, because we know the speed of sound.

Here’s the waveform of part of the recording, showing one handclap followed by the sound of the echo. I’ve removed the time axis.

And here’s part of the audio track from the video for you to download (right click, save as…) and open in Wavesurfer or your favourite editor.

Now try to do the calculation yourself (hint: the sound of the handclap has to travel to the far end of the tube, be reflected, then come all the way back). You can assume the speed of sound is 337 metres per second.
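The hint above can be turned into a small calculation. This is only a sketch: it assumes the clap and the microphone are both at the open end of the tube, and the delay value used is a made-up placeholder, not a measurement from the recording.

```python
SPEED_OF_SOUND = 337.0  # metres per second, the value suggested in the post


def tube_length(echo_delay_s, speed=SPEED_OF_SOUND):
    """Length of the tube in metres, given the clap-to-echo delay in seconds."""
    # The sound travels the length of the tube twice (there and back),
    # so the tube is half the total distance covered.
    return speed * echo_delay_s / 2.0


# Hypothetical delay, NOT taken from the recording: measure your own value
# in Wavesurfer and substitute it here.
example_delay_s = 0.1
print(tube_length(example_delay_s), "metres")
```

To use it, measure the time between the start of the handclap and the start of the echo in your editor, then pass that value in seconds.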

Filed Under: Signals Tagged With: video, Wavesurfer

Simon October 11, 2014

Classification and regression trees (CART)

A quick introduction to a very simple but widely-applicable model that can perform classification (predicting a discrete label) or regression (predicting a continuous value). The tree is learned from labelled data, using supervised learning. Before watching this video, you might want to check that you understand what Entropy is.

Filed Under: Models Tagged With: Classification, Decision tree, Learning decision trees, supervised learning, video

Simon October 11, 2014

Sampling and quantisation

Is digital better than analogue? Here we discover that there are limitations when storing waveforms digitally. We learn that the consequence of sampling at a fixed rate is an upper limit on the frequencies that can be represented, called the Nyquist frequency. In addition to the limitations of sampling, storing each sample of the waveform as a […]
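As a rough illustration of the Nyquist frequency mentioned above: it is half the sampling rate. The sampling rates below are just common examples, not values taken from the video.

```python
def nyquist_frequency(sampling_rate_hz):
    """Highest frequency (in Hz) that can be represented at this sampling rate."""
    return sampling_rate_hz / 2.0


# Example sampling rates chosen for illustration only.
for rate in (8000, 16000, 44100):
    print(rate, "Hz sampling ->", nyquist_frequency(rate), "Hz Nyquist frequency")
```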

Filed Under: Signals Tagged With: Digital signal, video, Wavesurfer

Simon February 6, 2012

My inaugural lecture

I talk about how speech synthesis works, in what I hope is a non-technical and accessible way, and finish off with an application of speech synthesis that gives personalised voices to people who are losing the ability to speak. I also try to mention bicycles as many times as possible. For a more up-to-date, slightly more technical, […]

Filed Under: Synthesis Tagged With: lecture, video

Simon October 11, 2014

Pipeline architecture for TTS

Most text-to-speech systems split the problem into two main stages. The first stage is called the front end and contains many separate processes which gradually build up a linguistic specification from the input text. The second stage typically uses language-independent techniques (although they still require a language-specific speech corpus) to generate a waveform. Here we see those two […]

Filed Under: Synthesis Tagged With: front end, video, waveform generation

Simon October 11, 2014

A simple synthetic vowel

Using Praat, we synthesise a simple vowel-like sound, starting with a pulse train, which we pass through a filter with resonant peaks.
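The same idea can be sketched outside Praat. The Python below is not the Praat procedure from the video: it is a minimal stand-in that builds a pulse train and passes it through two simple two-pole resonators. The fundamental frequency, resonance frequencies and bandwidths are all assumed values chosen for illustration.

```python
import math


def pulse_train(f0_hz, duration_s, sample_rate_hz):
    """One unit impulse per pitch period, zeros everywhere else."""
    n_samples = int(duration_s * sample_rate_hz)
    period = int(round(sample_rate_hz / f0_hz))
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def resonator(signal, centre_hz, bandwidth_hz, sample_rate_hz):
    """Two-pole filter with a single resonant peak near centre_hz."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate_hz)  # pole radius
    theta = 2.0 * math.pi * centre_hz / sample_rate_hz      # pole angle
    a1 = -2.0 * r * math.cos(theta)
    a2 = r * r
    out = []
    y1 = y2 = 0.0  # the two previous output samples
    for x in signal:
        y = x - a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


SAMPLE_RATE = 16000  # Hz, assumed
source = pulse_train(100, 0.5, SAMPLE_RATE)  # 100 Hz source, assumed

# Apply one resonator per peak; these (centre, bandwidth) pairs are hypothetical.
vowel = source
for centre, bandwidth in [(700, 100), (1100, 100)]:
    vowel = resonator(vowel, centre, bandwidth, SAMPLE_RATE)
```

The list `vowel` holds the filtered samples; writing them to a file and scaling them to a sensible amplitude is left out to keep the sketch short.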

Filed Under: Synthesis Tagged With: Harmonics, praat, Source-filter model, Spectral envelope, video

Simon November 15, 2014

Token passing

Token passing is a really nice way to understand (and even to implement) Viterbi search for Hidden Markov Models. Here we see token passing in action, and you can look at the spreadsheet to see the calculations. To keep things simple, we are ignoring transition probabilities in this example. It would be simple to add them […]
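For readers who prefer code to spreadsheets, here is a minimal sketch of token passing. It is not a reproduction of the spreadsheet: it assumes a left-to-right model with self-loops, one-dimensional observations and unit-variance Gaussian emissions, and the means and observations are invented. As in the example above, transition probabilities are ignored.

```python
import math


def token_passing(observations, emission_log_prob, n_states):
    """Viterbi search by token passing; returns (log probability, state path).

    A token is a pair (log probability so far, list of states visited).
    Position 0 is a non-emitting start state; states 1..n_states emit.
    """
    dead = (float("-inf"), [])
    tokens = [(0.0, [])] + [dead] * n_states
    for obs in observations:
        new_tokens = [dead]  # the start token is used up after the first frame
        for state in range(1, n_states + 1):
            # A token can arrive via the self-loop or from the previous state.
            # Keep only the better one (transition probabilities are ignored).
            log_p, history = max(tokens[state], tokens[state - 1],
                                 key=lambda token: token[0])
            new_tokens.append((log_p + emission_log_prob(state, obs),
                               history + [state]))
        tokens = new_tokens
    return tokens[n_states]  # the path must end in the final state


# Hypothetical model: two states with these means and unit variance.
MEANS = {1: 1.0, 2: 5.0}


def log_gaussian(state, obs):
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (obs - MEANS[state]) ** 2


best_log_prob, best_path = token_passing([1.1, 0.9, 5.2, 4.8], log_gaussian, 2)
print(best_path)
```

Adding transition probabilities would mean adding a log transition term to each token as it moves along an arc, before the comparison.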

Filed Under: Models, Recognition Tagged With: HMMs, spreadsheet, video

Simon October 11, 2014

Spectrum and spectrogram

The spectrum and the spectrogram are much more useful ways of analysing speech signals than the waveform. We look at how to create them using Wavesurfer and what effect the analysis window size has on what we see.

Filed Under: Signals Tagged With: Frequency domain, video, Wavesurfer

Simon November 23, 2014

The Gaussian probability density function: understanding the equation

The equation for the Gaussian probability density function looks a little scary at first, but this video should help you understand what each of the terms is doing, and how they fit together. After watching the video, download the spreadsheet, which shows the calculations and plots from this video (tip: the Apple Numbers.app version includes images […]
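The terms of the equation can also be written out in code. This sketch covers the one-dimensional case only; the variable names are my own and the input values are arbitrary.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Probability density of a one-dimensional Gaussian at the point x."""
    # Scaling term: makes the area under the curve equal to 1.
    normaliser = 1.0 / math.sqrt(2.0 * math.pi * variance)
    # Squared distance from the mean, scaled by the variance.
    exponent = -((x - mean) ** 2) / (2.0 * variance)
    return normaliser * math.exp(exponent)


# Arbitrary example values.
print(gaussian_pdf(0.0, mean=0.0, variance=1.0))
print(gaussian_pdf(2.0, mean=0.0, variance=1.0))
```

The density is highest at the mean and falls away symmetrically on either side; a larger variance gives a lower, wider curve.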

Filed Under: Probability Tagged With: equations, Gaussian, spreadsheet, video

Simon January 2, 2016

Interactive unit selection

Just a toy demo, but should give you some idea of how unit selection waveform generation works. Click with your mouse to choose a candidate diphone from each column, then the corresponding synthesised waveform will appear. You can click on the synthesised waveform to hear it again. Try to obtain the most natural-sounding synthesis by […]

Filed Under: Synthesis Tagged With: interactive, waveform generation

Simon October 25, 2014

A super-simple speech recogniser

We make what is possibly the world’s simplest speech recognition system. It can only recognise two different words, but will help you understand the basic idea of pattern recognition using template matching. The templates are just pre-recorded words, with known labels. The features extracted are just two formant frequencies in the middle of the word, […]
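Template matching of this kind can be sketched in a few lines. The formant values and word labels below are invented, and the use of Euclidean distance is an assumption on my part, since the excerpt does not say which distance measure is used.

```python
import math

# Hypothetical templates: label -> (first formant, second formant) in Hz.
templates = {
    "word_a": (300.0, 2300.0),
    "word_b": (700.0, 1200.0),
}


def recognise(features, templates):
    """Return the label of the template closest to the unknown word's features."""
    # Euclidean distance in the two-dimensional formant space (assumed measure).
    return min(templates, key=lambda label: math.dist(features, templates[label]))


# Formants measured from an unknown word (made-up numbers).
print(recognise((320.0, 2200.0), templates))
```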

Filed Under: Recognition Tagged With: Classification, equations, Wavesurfer

Simon November 1, 2022

Bitrate

The bitrate (or bit rate) of a signal is the number of bits required to store, or transmit, 1 s of that signal. A bit is a binary digit: either 0 or 1. Let’s calculate the bitrate of a digital waveform. First you should revise the concepts of sampling and quantisation from this module of the […]
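For an uncompressed digital waveform, the calculation comes down to multiplying the sampling rate by the number of bits per sample (and by the number of channels). A minimal sketch, with example settings that are assumptions rather than values from the post:

```python
def bitrate_bps(sampling_rate_hz, bits_per_sample, channels=1):
    """Bits needed to store one second of an uncompressed digital waveform."""
    return sampling_rate_hz * bits_per_sample * channels


# Example settings chosen for illustration.
print(bitrate_bps(16000, 16))              # mono, 16 kHz, 16 bits per sample
print(bitrate_bps(44100, 16, channels=2))  # stereo, 44.1 kHz, 16 bits per sample
```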

Filed Under: Signals Tagged With: Digital signal
