Spoken Dialogue Systems Technology and Design (eBook)

Gary Geunbae Lee, Joseph Mariani, Wolfgang Minker, Satoshi Nakamura (Editors)

eBook Download: PDF

2010 | 2011
XXIII, 277 pages
Springer New York (Publisher)
978-1-4419-7934-6 (ISBN)

Spoken Dialogue Systems Technology and Design covers key topics in spoken language dialogue interaction, with contributions from a variety of leading researchers. It brings together several perspectives on corpus annotation and analysis, dialogue system construction, and theoretical work on communicative intention, context-based generation, and modelling of discourse structure. These topics all belong to the general research and development in discourse and dialogue, with an emphasis on dialogue systems, corpora and corpus tools, and semantic and pragmatic modelling of discourse and dialogue.

Preface
Contents
Contributing Authors
Chapter 1 MULTILINGUAL SPEECH INTERFACES FOR RESOURCE-CONSTRAINED DIALOGUE SYSTEMS
1. Introduction
2. Literature Review
2.1 Review of Multilingual Speech Recognition
2.2 Review of Non-Native Speech Recognition
3. Approach
4. Experimental Setup
4.1 Training and Test Data
4.2 Benchmark System
5. Accent Adaptation
5.1 Monophones vs. Triphones
5.2 Multilingual MWC System
5.3 Model Merging
5.4 Adaptation with Non-Native Speech
6. Scalable Architecture
6.1 Projections between GMMs
6.2 Scalable Architecture
6.3 Footprint
7. Summary
Notes
Chapter 2 ONLINE LEARNING OF BAYES RISK-BASED OPTIMIZATION OF DIALOGUE MANAGEMENT FOR DOCUMENT RETRIEVAL SYSTEMS WITH SPEECH INTERFACE
1. Introduction
2. Dialogue Management and Response Generation in Document Retrieval System
2.1 System Overview
2.2 Knowledge Base (KB)
2.3 Backend Retrieval System
2.4 Backend Question-Answering System
2.5 Use of N-Best Hypotheses of ASR and Contextual Information for Generating Responses
2.6 Field Test of Trial System
3. Optimization of Dialogue Management in Document Retrieval System
3.1 Choices in Generating Responses
3.2 Optimization of Responses based on Bayes Risk
3.3 Generation of Response Candidates
3.4 Definition of Bayes Risk for Candidate Response
3.5 Confidence Measure of Information Retrieval and Question-Answering
4. Online Learning of Bayes Risk-based Dialogue Management
4.1 Parameter Optimization by Online Learning
4.2 Optimization using Maximum Likelihood Estimation
4.3 Optimization using Steepest Descent
4.4 Online Learning Method using Reinforcement Learning
5. Evaluation of Online Learning Methods
6. Conclusions
Notes
References
Chapter 3 TOWARDS FINE-GRAIN USER-SIMULATION FOR SPOKEN DIALOGUE SYSTEMS
1. Introduction
2. Related Work
2.1 Rule-based User Simulators
2.2 Corpus-based User Simulators
2.3 Hybrid User Simulators
2.4 Evaluation of User Simulators
2.4.1 Direct Methods
2.4.2 Indirect Methods
3. Our User Simulators
3.1 The Initial User Simulator
3.2 The Enhanced User Simulator
4. Experiments
4.1 Speech Database and Scenario Corpus
4.2 Language Models for Speech Recognition
5. Results
5.1 Detection of Problems in the Performance of the Dialogue System
5.1.2 Findings for the Medium Cooperativeness Level
5.1.3 Findings for the Low Cooperativeness Level
5.2 Future Work
6. Conclusions
Acknowledgments
Chapter 4 SALIENT FEATURES FOR ANGER RECOGNITION IN GERMAN AND ENGLISH IVR PORTALS
1. Introduction
2. Related Work
3. Overview of Database Conditions
4. Selected Corpora
5. Prosodic and Acoustic Modeling
5.1 Audio Descriptor Extraction
5.1.1 Pitch
5.1.2 Loudness
5.1.3 MFCC
5.1.4 Spectrals
5.1.5 Formants
5.1.6 Intensity
5.1.7 Others
5.2 Statistic Feature Definition
6. Feature Ranking
7. Normalization
8. Classification
8.1 Cross Validation
8.2 Evaluation Measurement
8.3 Classification Algorithm
9. Experiments and Results
9.1 Analyzing Feature Distributions
9.2 Optimal Feature Sets
9.3 Optimal Classification
10. Discussion
10.0.1 Signal Quality
10.0.2 Speech Length
10.0.3 Speech Transcription
11. Conclusions
Acknowledgments
References
Chapter 5 PSYCHOMIME CLASSIFICATION AND VISUALIZATION USING A SELF-ORGANIZING MAP FOR IMPLEMENTING EMOTIONAL SPOKEN DIALOGUE SYSTEM
1. Introduction
2. Psychomimes and Emotional Spoken Dialogue Systems
2.1 Onomatopoeias and Psychomimes
2.2 Emotional Spoken Dialogue Systems
3. Self-Organizing Map
3.1 What is SOM?
3.2 Natural Language Processing Studies using SOM
4. Experiment
4.1 Psychomimes and their Groupings
4.2 Corpus
4.3 Vector Space
4.4 SOM Parameters
4.5 Results
4.5.1 Determination of Group Areas
4.5.2 Recall and Precision
4.5.3 Effects of Selecting Frames and Combinations of Frames
4.5.4 Effects of Narrowing Area of Groups
4.5.5 How to Take Advantage of Knowledge
4.5.6 Toward Implementing Emotional Spoken Dialogue System
5. Conclusions and Future Work
References
Chapter 6 TRENDS, CHALLENGES AND OPPORTUNITIES IN SPOKEN DIALOGUE RESEARCH
1. Introduction
2. Research in Spoken Dialogue Technology
2.1 The Nature of Dialogue Research
2.2 Academic and Commercial Research
2.3 Three Decades of Research in Spoken Dialogue Systems
2.4 Application Areas for Dialogue Research
3. Challenges for Researchers in Spoken Dialogue Systems
3.1 Conducting Research in Spoken Dialogue Systems
3.2 The Availability of Resources for the Design and Development of Spoken Dialogue Systems
4. Opportunities for Future Research in Dialogue
4.1 Incorporating Dialogue into Voice Search
4.2 Using Dialogue Systems in Ambient Intelligence Environments
4.3 CHAT
4.4 SmartKom and SmartWeb
4.5 TALK
4.6 COMPANIONS
4.7 Atraco
4.8 Summary
5. Concluding Remarks
Web Pages
Notes
Chapter 7 DIALOGUE CONTROL BY POMDP USING DIALOGUE DATA STATISTICS
1. Introduction
2. Partially Observable Markov Decision Process
2.1 POMDP Structure
2.2 Running Cycle and Value Iteration
3. Dialogue Control using POMDP from Large Amounts of Data
3.1 Purpose of Dialogue Control
3.2 Automatically Acquiring POMDP Parameters and Obtaining a Policy for Target Dialogues
3.3 Reflecting Action Predictive Probabilities in Action Control
4. Evaluation and Results
5. Discussion
6. Future Work
7. Conclusions
Acknowledgments
References
Chapter 8 PROPOSAL FOR A PRACTICAL SPOKEN DIALOGUE SYSTEM DEVELOPMENT METHOD
1. Introduction
2. Overview of the Data-Management Centered Prototyping Method
3. Prototyping of a Slot-Filling Dialogue System
3.1 Data Model Definition
3.2 Controller Script
3.3 View Files
3.4 Adding a Multi-Modal Interface to a GUI Web Application
3.5 Generating Speech Interaction
3.6 Enabling Multi-Modal Interaction
3.7 Generation of Dialogue Flow
3.8 The Result of the Prototyping
4. Prototyping of a DB-Search Dialogue System
5. Prototyping of a Multi-Modal Interactive Presentation System
5.1 Dialogue Pattern Generation from Metadata
5.2 Generation of QA Database
5.3 Adaptation of the Language Model
5.4 Implementation and Evaluation
6. Incorporation of the User Model
6.1 User Model in Multi-Modal Interaction Systems
6.2 User Model Component of MIML
6.3 Functions for User Adaptation
7. Conclusions
Acknowledgments
Notes
References
Chapter 9 QUALITY OF EXPERIENCING MULTI-MODAL INTERACTION
1. Introduction
2. Advantages of Systems Providing Multi-Modal Interaction
2.1 Modality Relations
3. Quality of Experience
4. Audio-Video Quality Integration in AV-Transmission Services
4.1 Videotelephony
4.2 IP-Television
5. Quality of Embodied Conversational Agents
6. Quality of Systems with Multiple Input Modalities
6.1 Smart Office
6.2 Mobile
6.3 Summary
7. Conclusions
Acknowledgments
Notes
References
Chapter 10 DIALOGUE ACTS ANNOTATION TO CONSTRUCT DIALOGUE SYSTEMS FOR CONSULTING
1. Introduction
2. Kyoto Tour Guide Dialogue Corpus
3. Annotation of Communicative Function and Semantic Content in DA
4. SA Tags
4.1 Annotation Unit
4.2 Tag Specifications
4.2.1 General Layer
4.2.2 Response Layer
4.2.3 Check Layer
4.2.4 Constrain Layer
4.2.5 Action Discussion Layer
4.2.6 Others Layer
4.3 Evaluation of the Annotation
4.3.1 Distributional Statistics
4.3.2 Inter-Annotator Agreement
4.3.3 Analysis of the Occurrence Tendency during the Progress of the Episode
4.4 Preliminary Experiment to Estimate SA Tags via SVM
5. Semantic Content Tags
5.1 Tag Specifications
5.2 Annotation of Semantic Content Tags
6. Usage of the Kyoto Tour Guide Corpus
6.1 Speech Recognition
6.2 Dialogue Management
6.3 Speech Synthesis
7. Conclusions
Notes
References
Chapter 11 ON THE USE OF N-GRAM TRANSDUCERS FOR DIALOGUE ANNOTATION
1. Introduction
2. The HMM-based Annotation Model
3. The NGT Annotation Model
4. Corpora
4.1 SwitchBoard Corpus
4.2 DIHANA Corpus
5. Experimental Results
6. Conclusions and Future Work
Acknowledgments
Notes
References
Index

Publication date (per publisher)	9 November 2010
Additional info	XXIII, 277 p.
Place of publication	New York
Language	English
Subject area	Computer Science ► Theory / Studies ► Artificial Intelligence / Robotics
Subject area	Engineering ► Electrical Engineering / Power Engineering
Keywords	communicative intention • corpora tools • semantic analysis and modeling • Speech Recognition • spoken multimodality
ISBN-10	1-4419-7934-4 / 1441979344
ISBN-13	978-1-4419-7934-6 / 9781441979346

PDF (watermark)
Size: 9.9 MB

DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables, and figures. A PDF can be displayed on almost all devices, but is only of limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You will need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is, however, not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You will need a PDF viewer, e.g. the free Adobe Digital Editions app.

Buying eBooks from abroad
For tax law reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.

Print edition

Book | Hardcover

€246.09