Contemporary Methods for Speech Parameterization - Todor Ganchev

Contemporary Methods for Speech Parameterization (eBook)

Todor Ganchev (Author)

eBook Download: PDF
2011 | 2011
X, 114 pages
Springer New York (Publisher)
978-1-4419-8447-0 (ISBN)
50.28 incl. VAT
  • Download available immediately

Contemporary Methods for Speech Parameterization offers a general view of short-time cepstrum-based speech parameterization and provides a common ground for further in-depth studies on the subject. Specifically, it offers a comprehensive description, comparative analysis, and empirical performance evaluation of eleven contemporary speech parameterization methods, which compute short-time cepstrum-based speech features.

Among these are five discrete wavelet packet transform (DWPT)-based and six discrete Fourier transform (DFT)-based speech features, along with some of their variants, which have been used in speech recognition, speaker recognition, and other related speech processing tasks. The main similarities and differences in their computation are discussed, and empirical results from performance evaluation under common experimental conditions are presented. The recognition accuracy obtained on monophone recognition, continuous speech recognition, and speaker recognition tasks is compared with that obtained for the well-known and widely used Mel Frequency Cepstral Coefficients (MFCC).

It is shown that many of these methods yield speech features that offer competitive performance in a given speech processing setup when compared with the venerable MFCC. The book does not aim to promote particular speech features; instead, it aims to improve the common understanding of the advantages and disadvantages of the speech parameterization techniques available today and to provide a basis for selecting an appropriate speech parameterization in each particular case.



  • Basic Concepts and Applicability of Speech Parameterization
  • Survey on speech parameterization
  • Fourier transform based methods
  • Wavelet packets based methods
  • Evaluation on the speech recognition task
  • Evaluation on the speaker recognition task
  • Practical considerations
  • Links to code and further sources of information

Publication date (per publisher) 10 August 2011
Series SpringerBriefs in Speech Technology
Additional info X, 114 p. 32 illus., 23 illus. in color.
Place of publication New York
Language English
Subject areas Computer Science > Software Development > User Interfaces (HCI)
Computer Science > Theory / Studies > Artificial Intelligence / Robotics
Technology > Electrical Engineering / Power Engineering
Keywords Cepstral coefficients • Fourier Transforms • Speaker Recognition • Speech parameterization • Speech Recognition • Wavelet Packets
ISBN-10 1-4419-8447-X / 144198447X
ISBN-13 978-1-4419-8447-0 / 9781441984470
PDF (watermark)
Size: 2.4 MB

DRM: Digital watermark
This eBook contains a digital watermark and is therefore personalized to you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to specialist books with columns, tables, and figures. A PDF can be displayed on almost all devices but is of only limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You will need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You will need a PDF viewer, e.g. the free Adobe Digital Editions app.

Buying eBooks from abroad
For tax law reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
