This article is part of the series Joint Audio-Visual Speech Processing.

Open Access Research Article

A Support Vector Machine-Based Dynamic Network for Visual Speech Recognition Applications

Mihaela Gordan*, Constantine Kotropoulos and Ioannis Pitas

Author Affiliations

Department of Informatics, Aristotle University of Thessaloniki, Box 451, Thessaloniki 54006, Greece

EURASIP Journal on Advances in Signal Processing 2002, 2002:427615 doi:10.1155/S1110865702207039

The electronic version of this article is the complete one and can be found online at: http://asp.eurasipjournals.com/content/2002/11/427615


Received: 26 November 2001
Revisions received: 26 July 2002
Published: 28 November 2002

© 2002 Gordan et al.

Visual speech recognition is an emerging research field. In this paper, we examine the suitability of support vector machines for visual speech recognition. Each word is modeled as a temporal sequence of visemes corresponding to the different phones realized. One support vector machine is trained to recognize each viseme, and its output is converted to a posterior probability through a sigmoidal mapping. To model the temporal character of speech, the support vector machines are integrated as nodes into a Viterbi lattice. We test the performance of the proposed approach on a small visual speech recognition task, namely the recognition of the first four digits in English. The word recognition rate obtained is comparable to the best previously reported rates.
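
The pipeline described in the abstract (per-viseme classifier, sigmoidal mapping to a posterior, Viterbi decoding over a word's viseme sequence) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a sigmoid of the form 1 / (1 + exp(a*f + b)) with placeholder parameters a and b (in practice these would be fitted on held-out data), a strictly left-to-right lattice in which each frame either stays in the current viseme or advances to the next, and uniform transition probabilities that are therefore omitted. The names sigmoid_posterior, log_posterior, word_log_score, and recognize are hypothetical.

```python
import math

def sigmoid_posterior(f, a=-1.0, b=0.0):
    """Map a raw SVM output f to a pseudo-posterior 1 / (1 + exp(a*f + b)).
    a and b are placeholder values (assumption), not fitted parameters."""
    return 1.0 / (1.0 + math.exp(a * f + b))

def log_posterior(f, a=-1.0, b=0.0):
    """Numerically stable log of sigmoid_posterior."""
    z = a * f + b
    if z > 0:
        return -z - math.log1p(math.exp(-z))
    return -math.log1p(math.exp(z))

def word_log_score(svm_outputs, a=-1.0, b=0.0):
    """Best-path log score of one word model.

    svm_outputs[t][k] is the raw output, at frame t, of the SVM for the
    k-th viseme in the word's viseme sequence. The path must start in the
    first viseme and end in the last one (assumed lattice topology).
    """
    n_visemes = len(svm_outputs[0])
    neg_inf = float("-inf")
    best = [neg_inf] * n_visemes
    best[0] = log_posterior(svm_outputs[0][0], a, b)
    for frame in svm_outputs[1:]:
        new = [neg_inf] * n_visemes
        for k in range(n_visemes):
            # Either stay in viseme k or advance from viseme k-1.
            prev = best[k]
            if k > 0 and best[k - 1] > prev:
                prev = best[k - 1]
            if prev > neg_inf:
                new[k] = prev + log_posterior(frame[k], a, b)
        best = new
    return best[-1]

def recognize(outputs_per_word):
    """Return the word whose viseme lattice yields the highest score."""
    return max(outputs_per_word, key=lambda w: word_log_score(outputs_per_word[w]))
```

For example, calling recognize with a dictionary that maps each candidate word to its frame-by-viseme matrix of SVM outputs returns the word with the best-scoring path. Working in the log domain avoids underflow when per-frame posteriors are multiplied over long sequences.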

Keywords:
visual speech recognition; mouth shape recognition; visemes; phonemes; support vector machines; Viterbi lattice
