Speech Recognition Algorithms Using Weighted Finite-State Transducers - Takaaki Hori, Atsushi Nakamura

Speech Recognition Algorithms Using Weighted Finite-State Transducers

Book | Softcover
XII, 150 pages
2012
Springer International Publishing (publisher)
978-3-031-01434-5 (ISBN)
37.44 incl. VAT
This book introduces the theory, algorithms, and implementation techniques for efficient decoding in speech recognition, focusing mainly on the Weighted Finite-State Transducer (WFST) approach. The decoding process for speech recognition is viewed as a search problem whose goal is to find the sequence of words that best matches an input speech signal. Since this process becomes computationally more expensive as the system vocabulary grows, research has long been devoted to reducing the computational cost. Recently, the WFST approach has become an important state-of-the-art speech recognition technology, because it offers improved decoding speed with fewer recognition errors compared with conventional methods. However, it is not easy to understand all the algorithms used in this framework, and they remain a black box for many people. In this book, we review the WFST approach and aim to provide comprehensive interpretations of WFST operations and decoding algorithms to help anyone who wants to understand, develop, and study WFST-based speech recognizers. We also mention recent advances in this framework and its applications to spoken language processing.

Table of Contents: Introduction / Brief Overview of Speech Recognition / Introduction to Weighted Finite-State Transducers / Speech Recognition by Weighted Finite-State Transducers / Dynamic Decoders with On-the-fly WFST Operations / Summary and Perspective

Takaaki Hori received the B.E. and M.E. degrees in electrical and information engineering from Yamagata University, Yonezawa, Japan, in 1994 and 1996, respectively, and a Ph.D. degree in system and information engineering from Yamagata University in 1999. Since 1999, he has been engaged in research on spoken language processing at the Cyber Space Laboratories, Nippon Telegraph and Telephone (NTT) Corporation, Kyoto, Japan. He was a visiting scientist at the Massachusetts Institute of Technology, Cambridge, from 2006 to 2007. He is currently a senior research scientist in the NTT Communication Science Laboratories, NTT Corporation. He received the 22nd Awaya Prize Young Researcher Award from the Acoustical Society of Japan (ASJ) in 2005, the 24th TELECOM System Technology Award from the Telecommunications Advancement Foundation in 2009, and the IPSJ Kiyasu Special Industrial Achievement Award from the Information Processing Society of Japan in 2012. He is a member of the Institute of Electrical and Electronics Engineers (IEEE), the Institute of Electronics, Information and Communication Engineers (IEICE), and the ASJ.

Publication date
Series Synthesis Lectures on Speech and Audio Processing
Additional info XII, 150 p.
Place of publication Cham
Language English
Dimensions 191 x 235 mm
Weight 321 g
Subject area Technology Electrical engineering / Power engineering
ISBN-10 3-031-01434-0 / 3031014340
ISBN-13 978-3-031-01434-5 / 9783031014345
Condition New