Speech Recognition Algorithms Using Weighted Finite-State Transducers (eBook)

eBook Download: PDF
2022
XII, 150 pages
Springer International Publishing (publisher)
978-3-031-02562-4 (ISBN)


Speech Recognition Algorithms Using Weighted Finite-State Transducers - Takaaki Hori, Atsushi Nakamura
37,44 incl. VAT
  • Download available immediately
This book introduces the theory, algorithms, and implementation techniques for efficient decoding in speech recognition, focusing mainly on the Weighted Finite-State Transducer (WFST) approach. Decoding in speech recognition is viewed as a search problem whose goal is to find the sequence of words that best matches an input speech signal. Because this process becomes computationally more expensive as the system vocabulary grows, research has long been devoted to reducing its computational cost. Recently, the WFST approach has become an important state-of-the-art speech recognition technology, because it offers improved decoding speed with fewer recognition errors compared with conventional methods. However, the algorithms used in this framework are not easy to understand, and they remain a black box for many people. In this book, we review the WFST approach and aim to provide comprehensive interpretations of WFST operations and decoding algorithms to help anyone who wants to understand, develop, and study WFST-based speech recognizers. We also mention recent advances in this framework and its applications to spoken language processing.

Table of Contents: Introduction / Brief Overview of Speech Recognition / Introduction to Weighted Finite-State Transducers / Speech Recognition by Weighted Finite-State Transducers / Dynamic Decoders with On-the-fly WFST Operations / Summary and Perspective

Takaaki Hori received the B.E. and M.E. degrees in electrical and information engineering from Yamagata University, Yonezawa, Japan, in 1994 and 1996, respectively, and a Ph.D. degree in system and information engineering from Yamagata University in 1999. Since 1999, he has been engaged in research on spoken language processing at the Cyber Space Laboratories, Nippon Telegraph and Telephone (NTT) Corporation, Kyoto, Japan. He was a visiting scientist at the Massachusetts Institute of Technology, Cambridge, from 2006 to 2007. He is currently a senior research scientist in the NTT Communication Science Laboratories, NTT Corporation. He received the 22nd Awaya Prize Young Researcher Award from the Acoustical Society of Japan (ASJ) in 2005, the 24th TELECOM System Technology Award from the Telecommunications Advancement Foundation in 2009, and the IPSJ Kiyasu Special Industrial Achievement Award from the Information Processing Society of Japan in 2012. He is a member of the Institute of Electrical and Electronics Engineers (IEEE), the Institute of Electronics, Information and Communication Engineers (IEICE), and the ASJ.
Publication date (per publisher) 31.5.2022
Series Synthesis Lectures on Speech and Audio Processing
Additional info XII, 150 p.
Language English
Original title Speech Recognition Algorithms Based on Weighted Finite-State Transducers
Subject areas Natural Sciences Physics / Astronomy
Technology Electrical Engineering / Power Engineering
ISBN-10 3-031-02562-8 / 3031025628
ISBN-13 978-3-031-02562-4 / 9783031025624
PDF (watermark)

DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized to you. If the eBook is improperly passed on to third parties, it can be traced back to its source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to specialist books with columns, tables, and figures. A PDF can be displayed on almost all devices, but it is only of limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Buying eBooks from abroad
For tax law reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
