Abstract:
This article gives an overview of synthesis concepts for generating singing voice signals. Approaches that rely on the source-filter decomposition of the voice production process, as well as sample-based voice synthesis algorithms, are briefly explained. The focus then turns to a description of the human voice as the joint product of several functional components. The voice generation process is characterised as a combination of four model components: a multiple-mass vocal fold model, noise generation due to vortex shedding, wave propagation in a waveguide through the vocal tract, and radiation at the mouth. The interaction between these components can account for non-linear feedback effects that cannot be modelled with a classical source-filter model or a sample-based synthesizer. A time-domain program that implements these models is described. As examples of its application, the synthesis of overtone singing and the generation of pathologic vocal fold movement due to singer's nodules are demonstrated.