Application of Wavelets in Speech Processing (eBook)
XIV, 86 pages
Springer International Publishing (publisher)
978-3-319-69002-5 (ISBN)
This new edition provides an updated and enhanced survey of wavelet analysis across a range of speech processing applications. The author presents recent developments in topics such as speech enhancement, noise suppression, spectral analysis of the speech signal, speech quality assessment, speech recognition, speech forensics, and emotion recognition from speech. The new edition also features a new chapter on scalogram analysis of speech.
Moreover, in this edition each chapter is restructured so that it is self-contained and can be read separately. Each chapter surveys the literature on one topic, explaining how wavelets are used in each work and then discussing the experimental results of the proposed method. Illustrative figures are also added to explain the methodology of each work.
Mohamed Hesham Farouk El-Sayed is a full professor in the Engineering Math & Physics Department of the Faculty of Engineering at Cairo University. He obtained a B.Sc. in electronics and telecommunications engineering with honors in 1982, a second B.Sc. in physics in 1985, and an M.Sc. in engineering physics in 1989, all from Cairo University. He received his Ph.D. in engineering physics from Cairo University in 1993. Since 1993, he has authored and coauthored several papers in reputable periodicals and conferences on the application of wavelets to speech analysis and on the modeling of speech production. He is also the author of Application of Wavelets in Speech Processing (Springer 2014). M. Hesham has been actively involved in several national R&D projects on speech recognition since 1982.
Dedication
Preface
Organization of the Book
Acknowledgment
Contents
Abbreviations
Chapter 1: Introduction
1.1 History and Definition of Speech Processing
1.2 Applications of Speech Processing
1.3 Recent Progress in Speech Processing
1.4 Wavelet Analysis as an Efficient Tool for Speech Processing
References
Chapter 2: Speech Production and Perception
2.1 Speech Production Process
2.2 Classification of Speech Sounds
2.3 Speech Production Modeling
2.4 Speech Perception Modeling
2.5 Intelligibility and Speech Quality Measures
References
Chapter 3: Wavelets, Wavelet Filters, and Wavelet Transforms
3.1 Short-Time Fourier Transform (STFT)
3.2 Multiresolution Analysis and Wavelet Transform
3.3 Wavelets and Bank of Filters
3.4 Wavelet Families
3.5 Wavelet Packets
3.6 Undecimated Wavelet Transform
3.7 The Continuous Wavelet Transform (CWT)
3.8 Wavelet Scalogram
3.9 Empirical Wavelets
References
Chapter 4: Spectral Analysis of Speech Signal and Pitch Estimation
4.1 Spectral Analysis
4.2 Formant Tracking and Estimation
4.3 Pitch Estimation
References
Chapter 5: Speech Detection and Separation
5.1 Voice Activity Detection
5.2 Segmentation of Speech Signal
5.3 Source Separation of Speech
References
Chapter 6: Speech Enhancement and Noise Suppression
6.1 Thresholding Schemes
6.2 Thresholding on Wavelet Packet Coefficients
6.3 Enhancement on Multitaper Spectrum
References
Chapter 7: Speech Recognition
7.1 Signal Enhancement and Noise Cancellation for Robust Recognition
7.2 Wavelet-Based Features for Better Recognition
7.3 Hybrid Approach
7.4 Wavelet as an Activation Function for Neural Networks in ASR
References
Chapter 8: Speaker Identification
8.1 Wavelet-Based Features for Speaker Identification
8.2 Hybrid Feature Sets for Speaker Identification
References
Chapter 9: Emotion Recognition from Speech
9.1 Wavelet-Based Features for Emotion Recognition
9.2 Combined Feature Set for Better Emotion Recognition
9.3 WNN for Emotion Recognition
References
Chapter 10: Speech Coding, Synthesis, and Compression
10.1 Speech Synthesis
10.2 Speech Coding and Compression
10.3 Real-Time Implementation of DWT-Based Speech Compression
References
Chapter 11: Speech Quality Assessment
11.1 Wavelet-Packet Analysis
11.2 Discrete Wavelet Transform
References
Chapter 12: Scalogram and Nonlinear Analysis of Speech
12.1 Wavelet-Based Nonlinear Features
12.2 Wavelet Scalogram Analysis
12.3 Nonlinear and Chaotic Components in Speech Signal
References
Chapter 13: Steganography, Forensics, and Security of Speech Signal
13.1 Secure Communication of Speech
13.2 Watermarking of Speech
13.3 Watermarking in Sparse Representation
13.4 Forensic Analysis of Speech
References
Chapter 14: Clinical Diagnosis and Assessment of Speech Pathology
References
Index
Publication date (per publisher) | 29 November 2017 |
---|---|
Series | SpringerBriefs in Speech Technology |
Additional info | XIV, 86 p. 25 illus., 12 illus. in color. |
Place of publication | Cham |
Language | English |
Subject areas | Mathematics / Computer Science ► Mathematics |
 | Engineering ► Electrical Engineering / Power Engineering |
 | Engineering ► Communications Engineering |
Keywords | multiresolution analysis • Short Time Fourier Transform (STFT) • Speech Analysis • Speech coding • Speech processing • Speech Production Modeling • Speech Quality Measures • Speech Recognition • Wavelet Families • Wavelet Packets • Wavelets and Bank of Filters • wavelet transform |
ISBN-10 | 3-319-69002-7 / 3319690027 |
ISBN-13 | 978-3-319-69002-5 / 9783319690025 |
Size: 2.8 MB
DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.
File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suitable for technical books with columns, tables, and figures. A PDF can be displayed on almost all devices but is only of limited suitability for small displays (smartphone, eReader).
System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.
Additional feature: Online reading
In addition to downloading it, you can also read this eBook online in your web browser.