Build your own unit selection voice

Record your speech and build a unit selection voice for Festival. Create variations of the voice, add domain-specific data, or vary the database size. Evaluate with a listening test.

Tools required
Only needed if you are setting this exercise up on your own. Edinburgh students should skip this step.
Introduction
An overview of the complete process of voice building, and some tips for success.
Prepare your workspace
We're going to be generating quite a lot of different files, so we need a well-organised workspace in which to keep them.
Milestones
To keep on track, check your progress against these milestones. Try to stay ahead of them if you can.
The recording script
Because unit selection relies so heavily on the contents of the database, we need to think carefully about exactly what speech we should record.
Make the recordings
With our carefully chosen script, we now need to go into the recording studio and ask our voice talent to record it. Consistency is the key here, especially when the recording is done over multiple sessions.
Prepare the recordings
Move your recordings into the workspace, convert the waveforms to the right format, and do some sanity checking.
Label the speech
The labels are obtained from the text using the front-end of the text-to-speech system, but we then need to align them to the recorded speech using a technique borrowed from automatic speech recognition.
Pitchmark the speech
The signal processing used for waveform concatenation is pitch-synchronous, which requires the individual pitch periods in the speech database to be marked.
Build the voice
The final stages of building the voice involve creating the information needed by the target and join costs, plus the representation of the speech needed for waveform generation.
Run the voice
We're done! Time to find out what it sounds like...
Improvements and variations
It would take too long to tune every aspect of the system, but we can still identify some problems and see how to fix them. It's also easy to vary the contents of the database to discover the effect on the synthetic speech.
Evaluation
The main form of evaluation should be a listening test with multiple naive listeners. But there are other ways to evaluate, and potentially to improve, your voice.
Writing up
Because you kept such great notes in your logbook (didn't you?), writing up will be easy and painless.