Automatic Speech Recognition on Mobile Devices and over Communication Networks (eBook)

Boerge Lindberg, Zheng-Hua Tan (Editors)

eBook Download: PDF

2008
XX, 402 pages
Springer London (Publisher)
978-1-84800-143-5 (ISBN)

The advances in computing and networking have sparked an enormous interest in deploying automatic speech recognition on mobile devices and over communication networks. This book brings together academic researchers and industrial practitioners to address the issues in this emerging realm and presents the reader with a comprehensive introduction to the subject of speech recognition in devices and networks. It covers network, distributed and embedded speech recognition systems.

In the last decade, remarkable advances in computing and networking have sparked an enormous interest in deploying automatic speech recognition in devices and networks, and the trend is accelerating. This book brings together leading academic researchers and industrial practitioners to address the issues in this emerging realm. It covers networked, distributed and embedded speech recognition systems, which are expected to co-exist in the future. The book is divided into four parts: networked speech recognition, distributed speech recognition, embedded speech recognition, and systems and applications. It provides a thorough and unified introduction to this area and its latest developments, as well as the working knowledge needed for research and practical application deployment. It also covers the most up-to-date standards and a number of systems. This all-inclusive reference is an essential read for graduate students, scientists and engineers working or researching in the field of speech recognition and processing.

Preface 6
Contents 10
Contributors 20
1 Network, Distributed and Embedded Speech Recognition: An Overview 22
1.1 Introduction 22
1.2 ASR and Its Deployment in Devices and Networks 24
1.3 Network Speech Recognition 30
1.4 Distributed Speech Recognition 32
1.5 Embedded Speech Recognition 36
1.6 Discussion 41
References 42
Part I Network Speech Recognition 46
2 Speech Coding and Packet Loss Effects on Speech and Speaker Recognition 48
2.1 Introduction 48
2.2 Sources of Degradation in Network Speech Recognition 49
2.3 Effects on the Automatic Speech Recognition Task 53
2.4 Effect for the Automatic Speaker Verification Task 56
2.5 Conclusion 59
Acknowledgments 59
References 60
3 Speech Recognition Over Mobile Networks 62
3.1 Introduction 62
3.2 Techniques for Improving ASR Performance Over Mobile Networks 64
3.3 Bitstream-Based Approach 67
3.4 Feature Transform 71
3.5 Enhancement of ASR Performance Over Mobile Networks 74
3.6 Conclusion 78
References 79
4 Speech Recognition Over IP Networks 84
4.1 Introduction 84
4.2 Speech Recognition and IP Networks 86
4.3 Robustness Against Packet Loss 90
4.4 Speech Coder for Speech Recognition Over IP Networks 92
4.5 Conclusion 103
References 103
Part II Distributed Speech Recognition 106
5 Distributed Speech Recognition Standards 108
5.1 Introduction 108
5.2 Overview of the Set of DSR Standards 110
5.3 Scope of the Standards 111
5.4 DSR Basic Front-End ES 201 108 115
5.5 DSR Advanced Front-End ES 202 050 117
5.6 Recognition Performance of the DSR Front-Ends 118
5.7 3GPP Evaluations and Comparisons to AMR Coded Speech 120
5.8 ETSI DSR Extended Front-End Standards ES 202 211 and ES 202 212 123
5.9 Transport Protocols: The IETF RTP Payload Formats for DSR 125
5.10 Conclusion 126
Acknowledgments 126
References 126
6 Speech Feature Extraction and Reconstruction 128
6.1 Introduction 128
6.2 Feature Extraction 130
6.3 Speech Reconstruction 138
6.4 Prediction of Voicing and Fundamental Frequency 144
6.5 Conclusion 150
References 150
7 Quantization of Speech Features: Source Coding 152
7.1 Introduction 152
7.2 Quantization Schemes 153
7.3 Quantization of ASR Feature Vectors 162
7.4 Experimental Results 174
7.5 Conclusion 179
References 180
8 Error Recovery: Channel Coding and Packetization 184
8.1 Distributed Speech Recognition Systems 184
8.2 Characterization and Modeling of Communication Channels 185
8.3 Media-Specific FEC 188
8.4 Media-Independent FEC 189
8.5 Unequal Error Protection 197
8.6 Frame Interleaving 198
8.7 Examples of Modern Error Recovery Standards 202
8.8 Summary 204
Acknowledgments 205
References 205
9 Error Concealment 208
9.1 Introduction 208
9.2 Speech Recognition in the Presence of Corrupted Features 211
9.3 Feature Posterior Estimation in a DSR Framework 215
9.4 Performance Evaluations 223
9.5 Conclusion 228
Acknowledgments 229
References 229
Part III Embedded Speech Recognition 232
10 Algorithm Optimizations: Low Computational Complexity 234
10.1 Introduction 234
10.2 Common Limitations of Embedded Platforms 235
10.3 Overview of an ASR System 236
10.4 Front End 237
10.5 Observation Model 238
10.6 Search 242
10.7 Conclusion 250
Acknowledgments 250
References 251
11 Algorithm Optimizations: Low Memory Footprint 254
11.1 Introduction 254
11.2 Notations and Problem Statement 255
11.3 Model Complexity Control 258
11.4 Parameter Tying 260
11.5 Parameter Representations 264
11.6 Quantized Parameters HMMs 266
11.7 Subspace Distribution Clustering HMM 268
11.8 Computational Complexity Implications 270
11.9 Practicalities and Conclusion 271
References 272
12 Fixed-Point Arithmetic 276
12.1 Introduction 276
12.2 Fixed-Point Arithmetic 278
12.3 LVCSR MAP Recognizer 280
12.4 Fixed-Point Implementation of the Recognizer 285
12.5 Experiments 290
12.6 Conclusion 295
Acknowledgments 295
References 295
Part IV Systems and Applications 298
13 Software Architectures for Networked Mobile Speech Applications 300
13.1 Introduction 300
13.2 Classes of Multimodal Architectures 309
13.3 The “Plus V” Distributed Multimodal Architecture 314
13.4 Other Distributed Multimodal Architectures 316
13.5 Toward a Commercial Ecosystem 318
13.6 Conclusion 319
References 319
14 Speech Recognition in Mobile Phones 322
14.1 Introduction 322
14.2 Applications of Speech Recognition for Mobile Phones 323
14.3 Multilinguality and Language Support 326
14.4 Noise Robustness 330
14.5 Footprint and Complexity Reduction 335
14.6 Platforms and an Example Application 340
14.7 Conclusion and Outlook 344
References 344
15 Handheld Speech to Speech Translation System 348
15.1 Introduction 348
15.2 System Overview 349
15.3 System Components and Optimization 353
15.4 Experiments and Discussions 362
15.5 Conclusion 365
References 366
16 Automotive Speech Recognition 368
16.1 Introduction 368
16.2 Siemens Speech Processing—From Research to Products 369
16.3 Example Automotive Voice Applications: Infotainment, Navigation, Manuals, and Internet 372
16.4 Automotive Platform Issues and Challenges 378
16.5 Noise Robust Recognition Technology 381
16.6 Methodology for Evaluation of Automotive Recognizers Quality Measurement Using SNR Curves 388
16.7 Conclusion 393
References 393
17 Energy Aware Speech Recognition for Mobile Devices 396
17.1 Introduction 396
17.2 Case Study of Distributed Speech Recognition Using the HP Labs Smartbadge System 400
17.3 Conclusion 416
References 416
Index 418

Publication date (per publisher)	17.4.2008
Series	Advances in Computer Vision and Pattern Recognition
Additional info	XX, 402 p.
Place of publication	London
Language	English
Subject areas	Mathematics / Computer Science ► Computer Science ► Operating Systems / Servers
	Mathematics / Computer Science ► Computer Science ► Networks
	Computer Science ► Theory / Studies ► Artificial Intelligence / Robotics
	Mathematics / Computer Science ► Computer Science ► Web / Internet
	Engineering ► Electrical Engineering / Power Engineering
Keywords	algorithms • Architecture • Cognition • Communication • Complexity • Distributed Systems • Embedded Systems • handheld devices • HCI • Internet • mobile environments • Optimization • Speech Recognition • Standards • wireless networks
ISBN-10	1-84800-143-6 / 1848001436
ISBN-13	978-1-84800-143-5 / 9781848001435

PDF (watermark)
Size: 3.9 MB

DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suited to technical books with columns, tables, and figures. A PDF can be displayed on almost all devices, but is of only limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is, however, not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: Online reading
In addition to downloading it, you can also read this eBook online in your web browser.

Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Unfortunately, we cannot fulfill eBook orders from other countries.

Print edition

Book | Hardcover

160,49 €