Publikationsansicht

Informatics and Mathematical Modelling (2008)

Abstract
In this paper a system that transforms speech waveforms to animated faces are proposed. The system relies on continuous state space models to perform the mapping, this makes it possible to ensure video with no sudden jumps and allows continuous control of the parameters in ’face space’. The performance of the system is critically dependent on the number of hidden variables, with too few variables the model cannot represent data, and with too many overfitting is noticed. Simulations are performed on recordings of 3-5 sec. video sequences with sentences from the Timit database. From a subjective point of view the model is able to construct an image sequence from an unknown noisy speech sequence even though the number of training examples are limited. 1.

Details der Publikation
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.97.8943
Quelle http://eprints.pascal-network.org/archive/00000162/01/pascal.pdf
Mitarbeiter CiteSeerX
Archiv CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Typ text
Sprache Englisch
Verknüpfungen 10.1.1.11.2062, 10.1.1.121.9421, 10.1.1.93.102, 10.1.1.25.6217, 10.1.1.116.7762, 10.1.1.50.2866