I built my own talking-and-singing head. Here’s a demo:
Can you identify the raga of the song in this video?
To build this demo, I did the following:
- Recorded part of the ARCTIC database and built my own Festvox voice.
- Built a random-music-composition tool, using some basic rules of Hindustani Classical Music.
- Built an image database of my lip shapes.
- Built a tool to generate a video by combining the lip shape images based on phonemes and their durations predicted by the speech synthesizer.
- Merged the audio and video tracks and streamed them out.
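The lip-sync step is the easiest one to illustrate. What follows is a minimal sketch of the idea, not the original code: it assumes the synthesizer's output has already been parsed into `(phoneme, duration_in_seconds)` pairs, and the `VISEMES` table, the image file names, and the `frames_for` helper are all hypothetical names made up for this example.

```python
from typing import List, Tuple

# Hypothetical phoneme-to-lip-shape table. The labels and image names are
# illustrative assumptions, not the actual lip shape database.
VISEMES = {
    "pau": "closed.png",
    "m": "closed.png",
    "b": "closed.png",
    "p": "closed.png",
    "aa": "open.png",
    "ae": "open.png",
    "iy": "wide.png",
    "uw": "round.png",
    "ow": "round.png",
    "f": "teeth.png",
    "v": "teeth.png",
}
DEFAULT_IMAGE = "neutral.png"  # fallback for phonemes missing from the table


def frames_for(segments: List[Tuple[str, float]], fps: int = 25) -> List[str]:
    """Return one image name per video frame for (phoneme, seconds) pairs."""
    frames: List[str] = []
    elapsed = 0.0
    for phoneme, duration in segments:
        elapsed += duration
        # Round against the running end time rather than each duration, so
        # rounding errors do not pile up and the video stays in sync.
        target = round(elapsed * fps)
        image = VISEMES.get(phoneme, DEFAULT_IMAGE)
        frames.extend([image] * (target - len(frames)))
    return frames
```

Calling `frames_for([("m", 0.12), ("aa", 0.08), ("uw", 0.2)])` gives ten frames at 25 fps: three closed, two open and five rounded. Writing those images out as a numbered sequence yields something a video encoder can consume directly.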
Tools used: Flite, ffmpeg and Python.
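For the merge step, here is a rough sketch of how Python can drive ffmpeg. It is one possible setup rather than the exact command behind the demo: the frame pattern, file names, codec choices, and the `mux_command` and `mux` helpers are assumptions for illustration, and it writes a file instead of streaming.

```python
import subprocess
from typing import List


def mux_command(frame_pattern: str, audio_path: str, out_path: str,
                fps: int = 25) -> List[str]:
    """Build an ffmpeg argument list joining numbered frames with audio."""
    return [
        "ffmpeg", "-y",                             # overwrite existing output
        "-framerate", str(fps), "-i", frame_pattern,  # numbered image sequence
        "-i", audio_path,                           # synthesized speech
        "-c:v", "libx264", "-pix_fmt", "yuv420p",   # widely playable video
        "-c:a", "aac",
        "-shortest",                                # stop at the shorter input
        out_path,
    ]


def mux(frame_pattern: str, audio_path: str, out_path: str,
        fps: int = 25) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(mux_command(frame_pattern, audio_path, out_path, fps),
                   check=True)
```

For example, `mux("frames/%05d.png", "speech.wav", "out.mp4")` would encode the frames at 25 fps alongside the audio, provided ffmpeg is on the `PATH`. Keeping the command builder separate from the call that runs it makes it easy to print and inspect the command first.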
There used to be a live demo of this tool up on a webpage, so you could make my talking head say anything, but it has been down for some time now. Maybe at some point I’ll bring it back up.