Most methods for estimating F0 start from autocorrelation. The idea is pretty simple: we are just looking for a repeating pattern in the waveform, which corresponds to the periodic vocal fold activity.

For some waveforms, it might be possible to find that repeating pattern directly in the time domain, but in general that doesn’t work very well. Instead, we compute the autocorrelation (“self-similarity”) function and look for a peak in it. The lag at which that peak occurs is the fundamental period, and F0 is its reciprocal.
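To make the idea concrete, here is a minimal sketch in Python with NumPy. It is not the calculation from the video or the spreadsheet: the function name `estimate_f0`, the search range of 50 to 400 Hz, and the synthetic test signal are all assumptions chosen for illustration. It computes the autocorrelation of one frame, then picks the lag with the highest value inside the range of plausible pitch periods.

```python
import numpy as np

def estimate_f0(frame, sample_rate, f0_min=50.0, f0_max=400.0):
    """Estimate F0 in Hz for one frame from the peak of its autocorrelation.

    f0_min and f0_max are assumed search limits, not values from the source.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()  # remove any DC offset

    # Autocorrelation at every lag; keep only the non-negative lags.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]

    # Convert the F0 search range into a range of lags (in samples).
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(acf) - 1)

    # The lag with the largest autocorrelation is taken as the period.
    peak_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / peak_lag

# Synthetic "voiced" frame: a 100 Hz fundamental plus its second harmonic.
sample_rate = 16000
t = np.arange(int(0.04 * sample_rate)) / sample_rate
frame = np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 200 * t)

f0 = estimate_f0(frame, sample_rate)
print(f"Estimated F0: {f0:.1f} Hz")  # should be close to 100 Hz
```

Restricting the search to a range of lags matters: the autocorrelation is always largest at lag zero, and without a lower limit that trivial peak would always win. Real speech also needs further steps (a voicing decision, handling of octave errors) that this sketch leaves out.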

After watching the video, download the spreadsheet which shows the calculations and plots from this video.