Crowdsourcing for Speech Processing - Maxine Eskenazi, Gina-Anne Levow, Helen Meng, Gabriel Parent, David Suendermann

Crowdsourcing for Speech Processing (eBook)

Applications to Data Collection, Transcription and Assessment

Maxine Eskenazi, Gina-Anne Levow, Helen Meng, Gabriel Parent, David Suendermann (Authors)

eBook Download: PDF

2013 | 1st edition
360 pages
John Wiley & Sons (Publisher)
978-1-118-54127-2 (ISBN)

Provides an insightful and practical introduction to crowdsourcing as a means of rapidly processing speech data

Intended for those who want to get started in the domain and learn how to set up a task, what interfaces are available, and how to assess the work, as well as for those who have already used crowdsourcing and want to create better tasks and obtain better assessments of the crowd's work. The book includes screenshots showing examples of good and poor interfaces, along with case studies in speech processing tasks that walk through the task creation process, review the options in the interface and in the choice of medium (MTurk or other), and explain the choices made.

* Addresses important aspects of this new technique that should be mastered before attempting a crowdsourcing application.

* Offers speech researchers the hope that they can spend much less time dealing with the data gathering/annotation bottleneck, leaving them to focus on the scientific issues.

* Readers will directly benefit from the book's successful examples of how crowdsourcing was implemented for speech processing, discussions of interface and processing choices that worked and choices that didn't, and guidelines on how to play and record speech over the internet, how to design tasks, and how to assess workers.

Essential reading for researchers and practitioners in speech research groups involved in speech processing

Maxine Eskenazi, Carnegie Mellon University, USA
Dr. Eskenazi is Principal Systems Scientist at the Language Technologies Institute, Carnegie Mellon University, USA. She has authored over 100 scientific papers in the areas of computer assisted language learning and speech and spoken dialog systems. Her work has produced such systems as the Let's Go spoken dialog system and the REAP vocabulary tutor. She is also the founder and CTO of the Carnegie Speech Company.

Gina-Anne Levow, University of Washington, USA
Dr. Levow is currently an Assistant Professor in the Department of Linguistics, University of Washington, USA. Prior to joining the faculty at the University of Washington, she served on the faculty at the University of Chicago in the Department of Computer Science and as a Research Fellow at the University of Manchester, UK. She served on the Editorial Board of Computational Linguistics and as Associate Editor of ACM Transactions on Asian Language Processing.

Helen Meng, The Chinese University of Hong Kong, Hong Kong
Dr. Meng is Founder and Director of the Human-Computer Communications Laboratory at The Chinese University of Hong Kong, and is also the Founder and Co-Director of the Microsoft-CUHK Joint Laboratory for Human-Centric Computing and Interface Technologies, which was conferred the national status of the Ministry of Education of China (MoE) Key Laboratory in 2008. Prof. Meng also served as an Associate Dean (Research) of the Faculty of Engineering from 2006 to 2010. She serves as Editor-in-Chief of the IEEE Transactions on Audio, Speech and Language Processing.

Gabriel Parent, Amazon.com, USA
Gabriel Parent is a Software Development Engineer at Amazon.com working on solving natural-language-related problems. His main research focuses were human-computer interaction through spoken dialog systems and crowdsourcing.

David Suendermann, Baden-Wuerttemberg Cooperative State University, Germany
Dr. Suendermann is currently full Professor of Computer Science at the Baden-Wuerttemberg Cooperative State University, Stuttgart, Germany. He is also the Principal Speech Scientist of SpeechCycle, New York, USA, which has been recognized by Deloitte as a "Technology Fast 500" company based on revenue growth. He has authored more than 70 publications and patents, including a book and six book chapters.

Contributors

Preface

1 An Overview

1.1 Growing Needs for Speech Data

1.1.1 Origins of Crowdsourcing

1.1.2 Operational Definition of Crowdsourcing

1.1.3 Functional Definition of Crowdsourcing

1.2 Some Issues

1.3 Some Terminology

1.4 Acknowledgements

References

2 The Basics

2.1 An Overview of the Literature on Crowdsourcing for Speech Processing

2.1.1 Evolution of the Use of Crowdsourcing for Speech

2.1.2 Geographic Locations of Crowdsourcing for Speech

2.1.3 Specific Areas of Research

2.2 Alternate Solutions

2.3 Some Ready-Made Platforms for Crowdsourcing

2.4 Making Task Creation Easier

2.5 Getting Down to Brass Tacks

2.5.1 Hearing and Being Heard Over the Web

2.5.2 Prequalification

2.5.3 Native Language of the Workers

2.5.4 Payment

2.5.5 Choice of Platform in the Literature

2.5.6 The Complexity of the Task

2.6 Quality Control

2.6.1 Was that Worker a Bot?

2.6.2 Quality Control in the Literature

2.7 Judging the Quality of the Literature

2.8 Some Quick Tips

References

13 Collecting Speech from Crowds

13.1 A Short History of Speech Collection

13.1.1 Speech Corpora

13.1.2 Spoken Language Systems

13.1.3 User-Configured Recording Environments

13.2 Technology for Web-based Audio Collection

13.2.1 Silverlight

13.2.2 Java

13.2.3 Flash

13.2.4 HTML and JavaScript

13.3 Example: WAMI Recorder

13.3.1 The JavaScript API

13.3.2 Audio Formats

13.4 Example: The WAMI Server

13.4.1 PHP Script

13.4.2 Google App Engine

13.4.3 Server Configuration Details

13.5 Example: Speech Collection on Amazon Mechanical Turk

13.5.1 Server Setup

13.5.2 Deploying to Amazon Mechanical Turk

13.5.3 The Command Line Interface

13.6 Using the Platform Purely for Payment

13.7 Advanced Methods of Crowdsourced Audio Collection

13.7.1 Collecting Dialogue Interactions

13.7.2 Human Computation

13.8 Summary

13.9 Acknowledgements

References

Index

Publication date (per publisher)	6 February 2013
Language	English
Subject area	Computer Science ► Theory / Studies ► Artificial Intelligence / Robotics
Subject area	Engineering ► Electrical Engineering / Energy Technology
Keywords	Audio & Speech Processing & Broadcasting • Computer Science • Crowdsourcing • Database & Data Warehousing Technologies • Electrical & Electronics Engineering • Speech Processing
ISBN-10	1-118-54127-8 / 1118541278
ISBN-13	978-1-118-54127-2 / 9781118541272

PDF (Adobe DRM)
Size: 24.9 MB

Copy protection: Adobe DRM
Adobe DRM is a form of copy protection intended to protect the eBook against misuse. The eBook is authorized to your personal Adobe ID at the time of download. You can then read the eBook only on devices that are also registered to your Adobe ID.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables, and figures. A PDF can be displayed on almost any device, but it is of limited suitability for small displays (smartphones, eReaders).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need an Adobe ID and the free Adobe Digital Editions software. We advise against using the OverDrive Media Console, which in our experience frequently causes problems with Adobe DRM.
eReader: This eBook can be read on (almost) all eBook readers. It is not, however, compatible with the Amazon Kindle.
Smartphone/Tablet: Whether you use Apple or Android, you can read this eBook. You need an Adobe ID and a free app.

Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Unfortunately, we cannot fulfill eBook orders from other countries.