Analyzing Emotion in Spontaneous Speech - Rupayan Chakraborty, Meghna Pandharipande, Sunil Kumar Kopparapu

Analyzing Emotion in Spontaneous Speech (eBook)

Rupayan Chakraborty, Meghna Pandharipande, Sunil Kumar Kopparapu (Autoren)

eBook Download: PDF

2018 | 1st ed. 2017
XVI, 81 Seiten
Springer Singapore (Verlag)
978-981-10-7674-9 (ISBN)

This book captures the current challenges in automatic recognition of emotion in spontaneous speech and makes an effort to explain, elaborate, and propose possible solutions. Intelligent human-computer interaction (iHCI) systems thrive on several technologies like automatic speech recognition (ASR); speaker identification; language identification; image and video recognition; affect/mood/emotion analysis; and recognition, to name a few. Given the importance of spontaneity in any human-machine conversational speech, reliable recognition of emotion from naturally spoken spontaneous speech is crucial. While emotions, when explicitly demonstrated by an actor, are easy for a machine to recognize, the same is not true in the case of day-to-day, naturally spoken spontaneous speech. The book explores several reasons behind this, but one of the main reasons for this is that people, especially non-actors, do not explicitly demonstrate their emotion when they speak, thus making it difficult for machines to distinguish one emotion from another that is embedded in their spoken speech. This short book, based on some of authors' previously published books, in the area of audio emotion analysis, identifies the practical challenges in analysing emotions in spontaneous speech and puts forward several possible solutions that can assist in robustly determining the emotions expressed in spontaneous speech.

Rupayan Chakraborty (Member IEEE) works as a scientist at the TCS Research and Innovation - Mumbai. He has been working in the area of speech and audio signal processing/recognition since 2008 and was involved in academic research prior to joining TCS. He worked as a researcher at the Computer Vision and Pattern Recognition (CVPR) Unit of the Indian Statistical Institute (ISI), Kolkata. He obtained his PhD degree from TALP Research Centre, Barcelona, Spain, in December 2013, in the area of acoustic event detection and localization using distributed sensor arrays in room environments, while working on the 'Speech and Audio Recognition for Ambient Intelligence (SARAI)' project. After completing his PhD, he was a visiting scientist at the CVPR Unit of ISI, Kolkata, for 1 year. He has published research work in top-tier conferences and journals. He is currently working in the area of 'speech emotion recognition and analysis'.

Meghna Pandharipande received her Bachelor of Engineering (BE) in Electronics and Telecommunication in June 2002 from Amravati University, Amravati. Between September 2002 and December 2003, she was a faculty member of the Department of Electronics and Telecommunication at Shah and Anchor Kutchhi Engineering College, Mumbai. In 2004, she completed her certification in Embedded Systems at CMC, Mumbai and then worked as a Lotus Notes developer in a startup ATS, Mumbai for a year. Since June 2005 she has been with TCS (having first joined the Cognitive Systems Research Laboratory, Tata InfoTech under Prof. P.V.S. Rao) and since 2006, she has been working as a researcher at TCS Research and Innovation - Mumbai. Her research interest is in the area of speech signal processing and has been working extensively on building systems that can process all aspects of spoken speech. More recently, she has been researching non-linguistic aspects of speech processing, like speaking rate and emotion detection from speech.

Sunil Kumar Kopparapu (Senior Member, IEEE; ACM Senior Member India) obtained his doctoral degree in Electrical Engineering from the Indian Institute of Technology Bombay, Mumbai, India in 1997. His thesis 'Modular integration for low-level and high-level vision problems in a multi-resolution framework' provided a broad framework to enable reliable and fast vision processing.

Between 1997 and 2000 he was with the Automation Group, Commonwealth Scientific and Industrial Research Organization (CSIRO), Brisbane, Australia working on practical image processing and 3D vision problems, mainly for the benefit of the Australian mining industry.

Prior to joining the Cognitive Systems Research Laboratory (CSRL), Tata InfoTech Limited as a senior research member in 2001, he was an expert for developing virtual self-line of e-commerce products for the R&D Group at Aquila Technologies Private Limited, India.

In his current role as a principal scientist at TCS Research and Innovation - Mumbai, he is actively working in the areas of speech, script, image and natural-language processing with a focus on building usable systems for mass use in Indian conditions. He has co-authored a book titled 'Bayesian Approach to Image Interpretation' and more recently a Springer Brief on Non-linguistic Analysis of Call Center Conversations.

Rupayan Chakraborty (Member IEEE) works as a scientist at the TCS Research and Innovation - Mumbai. He has been working in the area of speech and audio signal processing/recognition since 2008 and was involved in academic research prior to joining TCS. He worked as a researcher at the Computer Vision and Pattern Recognition (CVPR) Unit of the Indian Statistical Institute (ISI), Kolkata. He obtained his PhD degree from TALP Research Centre, Barcelona, Spain, in December 2013, in the area of acoustic event detection and localization using distributed sensor arrays in room environments, while working on the “Speech and Audio Recognition for Ambient Intelligence (SARAI)” project. After completing his PhD, he was a visiting scientist at the CVPR Unit of ISI, Kolkata, for 1 year. He has published research work in top-tier conferences and journals. He is currently working in the area of “speech emotion recognition and analysis”.Meghna Pandharipande received her Bachelor of Engineering (BE) in Electronics and Telecommunication in June 2002 from Amravati University, Amravati. Between September 2002 and December 2003, she was a faculty member of the Department of Electronics and Telecommunication at Shah and Anchor Kutchhi Engineering College, Mumbai. In 2004, she completed her certification in Embedded Systems at CMC, Mumbai and then worked as a Lotus Notes developer in a startup ATS, Mumbai for a year. Since June 2005 she has been with TCS (having first joined the Cognitive Systems Research Laboratory, Tata InfoTech under Prof. P.V.S. Rao) and since 2006, she has been working as a researcher at TCS Research and Innovation - Mumbai. Her research interest is in the area of speech signal processing and has been working extensively on building systems that can process all aspects of spoken speech. More recently, she has been researching non-linguistic aspects of speech processing, like speaking rate and emotion detection from speech.Sunil Kumar Kopparapu (Senior Member, IEEE; ACM Senior Member India) obtained his doctoral degree in Electrical Engineering from the Indian Institute of Technology Bombay, Mumbai, India in 1997. His thesis “Modular integration for low-level and high-level vision problems in a multi-resolution framework” provided a broad framework to enable reliable and fast vision processing.Between 1997 and 2000 he was with the Automation Group, Commonwealth Scientific and Industrial Research Organization (CSIRO), Brisbane, Australia working on practical image processing and 3D vision problems, mainly for the benefit of the Australian mining industry.Prior to joining the Cognitive Systems Research Laboratory (CSRL), Tata InfoTech Limited as a senior research member in 2001, he was an expert for developing virtual self-line of e-commerce products for the R&D Group at Aquila Technologies Private Limited, India.In his current role as a principal scientist at TCS Research and Innovation - Mumbai, he is actively working in the areas of speech, script, image and natural-language processing with a focus on building usable systems for mass use in Indian conditions. He has co-authored a book titled “Bayesian Approach to Image Interpretation” and more recently a Springer Brief on Non-linguistic Analysis of Call Center Conversations.

Introduction.- Literature Survey.- A Framework for Spontaneous Speech Emotion Recognition.- Improving Emotion Classiﬁcation Accuracies.- Case Studies.- Conclusions.- Appendix.- Index.

Erscheint lt. Verlag	23.1.2018
Zusatzinfo	XVI, 81 p. 31 illus.
Verlagsort	Singapore
Sprache	englisch
Themenwelt	Informatik ► Grafik / Design ► Digitale Bildverarbeitung
	Informatik ► Software Entwicklung ► User Interfaces (HCI)
	Informatik ► Theorie / Studium ► Künstliche Intelligenz / Robotik
	Technik ► Elektrotechnik / Energietechnik
Schlagworte	call center coversations analysis • emotion recognition • framework for emotion classification • real time emotion • spontaneous speech processing
ISBN-10	981-10-7674-X / 981107674X
ISBN-13	978-981-10-7674-9 / 9789811076749

Haben Sie eine Frage zum Produkt?

PDF (Wasserzeichen)
Größe: 2,6 MB

DRM: Digitales Wasserzeichen
Dieses eBook enthält ein digitales Wasserzeichen und ist damit für Sie personalisiert. Bei einer missbräuchlichen Weitergabe des eBooks an Dritte ist eine Rückverfolgung an die Quelle möglich.

Dateiformat: PDF (Portable Document Format)
Mit einem festen Seitenlayout eignet sich die PDF besonders für Fachbücher mit Spalten, Tabellen und Abbildungen. Eine PDF kann auf fast allen Geräten angezeigt werden, ist aber für kleine Displays (Smartphone, eReader) nur eingeschränkt geeignet.

Systemvoraussetzungen:
PC/Mac: Mit einem PC oder Mac können Sie dieses eBook lesen. Sie benötigen dafür einen PDF-Viewer - z.B. den Adobe Reader oder Adobe Digital Editions.
eReader: Dieses eBook kann mit (fast) allen eBook-Readern gelesen werden. Mit dem amazon-Kindle ist es aber nicht kompatibel.
Smartphone/Tablet: Egal ob Apple oder Android, dieses eBook können Sie lesen. Sie benötigen dafür einen PDF-Viewer - z.B. die kostenlose Adobe Digital Editions-App.

Buying eBooks from abroad
For tax law reasons we can sell eBooks just within Germany and Switzerland. Regrettably we cannot fulfill eBook-orders from other countries.

Print-Ausgabe

Buch | Hardcover

58,84 €