This article is part of the series Anthropomorphic Processing of Audio and Speech.

Open Access Research Article

Anthropomorphic Coding of Speech and Audio: A Model Inversion Approach

Christian Feldbauer1*, Gernot Kubin1 and W Bastiaan Kleijn2

Author Affiliations

1 Signal Processing and Speech Communication Laboratory, Graz University of Technology, Graz 8010, Austria

2 Department for Signals, Sensors and Systems, KTH (Royal Institute of Technology), Stockholm 10044, Sweden


EURASIP Journal on Advances in Signal Processing 2005, 2005:571618 doi:10.1155/ASP.2005.1334


The electronic version of this article is the complete one and can be found online at: http://asp.eurasipjournals.com/content/2005/9/571618


Received: 14 November 2003
Revisions received: 25 August 2004
Published: 21 June 2005

© 2005 Feldbauer et al.

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Auditory modeling is a well-established methodology that provides insight into human perception and facilitates the extraction of the signal features that are most relevant to the listener. The aim of this paper is to provide a tutorial on perceptual speech and audio coding using an invertible auditory model. In this approach, the audio signal is first converted into an auditory representation by the model. The auditory representation is then quantized and coded and, upon decoding, transformed back into the acoustic domain. Moving to the auditory domain converts a complex distortion criterion into a simple one, thus facilitating quantization with low complexity. We briefly review past work on auditory models and describe in more detail the components of our invertible model and its inversion procedure, that is, the method used to reconstruct the signal from the output of the auditory model. We summarize attempts to use the auditory representation for low-bit-rate coding. Our approach also allows the inherent redundancy of the human auditory system to be exploited for multiple description (joint source-channel) coding.
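
The coding chain described in the abstract (analysis, quantization in the auditory domain, decoding, inversion) can be illustrated with a minimal sketch. This is not the auditory model of the paper: the sign-preserving power-law compression, the exponent 0.3, the step size 0.05, and all function names are assumptions chosen only to give a toy transform that is exactly invertible. The point it illustrates is that a plain uniform quantizer, which bounds a simple error measure in the transformed domain, can be used once the signal has been mapped there.

```python
import math

# Assumption: a toy stand-in for an invertible auditory model.
# A sign-preserving power law is NOT the model described in the paper;
# it is used here only because it has an exact inverse.
def to_auditory(x, exponent=0.3):
    """Map acoustic samples to a compressed 'auditory' representation."""
    return [math.copysign(abs(v) ** exponent, v) for v in x]

def from_auditory(y, exponent=0.3):
    """Invert to_auditory: map the representation back to acoustic samples."""
    return [math.copysign(abs(v) ** (1.0 / exponent), v) for v in y]

def quantize(y, step=0.05):
    """Uniform quantizer: return integer indices (what a coder would transmit)."""
    return [round(v / step) for v in y]

def dequantize(indices, step=0.05):
    """Reconstruct representation values from the integer indices."""
    return [i * step for i in indices]

def codec(x, step=0.05):
    """Encode in the auditory domain, then decode and invert the model."""
    indices = quantize(to_auditory(x), step)
    return from_auditory(dequantize(indices, step))

# Hypothetical test signal: five periods of a sinusoid over 100 samples.
signal = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
decoded = codec(signal)
```

In this sketch the quantization error in the transformed domain never exceeds half the step size, whereas the corresponding error in the acoustic domain depends on the signal level. A realistic system would replace `to_auditory` and `from_auditory` with the model and the inversion procedure described in the paper.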

Keywords:
speech and audio coding; auditory representation; auditory model inversion; auditory synthesis; perceptual domain coding; multiple description coding
