Privacy-Preserving Machine Learning for Speech Processing (eBook)

eBook Download: PDF
2012 | 2013
XVIII, 142 pages
Springer New York (publisher)
978-1-4614-4639-2 (ISBN)

Privacy-Preserving Machine Learning for Speech Processing -  Manas A. Pathak
96.29 incl. VAT
  • Download available immediately
This thesis discusses privacy issues in speech-based applications such as biometric authentication, surveillance, and external speech processing services. Author Manas A. Pathak presents solutions for privacy-preserving speech processing applications such as speaker verification, speaker identification, and speech recognition. The author also introduces some of the tools from cryptography and machine learning, along with current techniques for improving the efficiency and scalability of the presented solutions. The text also includes experiments that measure the execution time and accuracy of prototype implementations on standardized speech datasets. The proposed framework may make it possible for a surveillance agency to listen for a known terrorist without being able to hear conversations from non-targeted, innocent civilians.

Dr. Manas A. Pathak received the BTech degree in computer science from Visvesvaraya National Institute of Technology, Nagpur, India, in 2006, and the MS and PhD degrees from the Language Technologies Institute at Carnegie Mellon University (CMU) in 2009 and 2012, respectively. He is currently working as a research scientist at Adchemy, Inc. His research interests include the intersection of data privacy, machine learning, and speech processing.


Privacy-Preserving Machine Learning for Speech Processing
Supervisor’s Foreword
Acknowledgments
Contents
Acronyms
Part I
1 Thesis Overview
1.1 Motivation
1.2 Thesis Statement
1.3 Summary of Contributions
1.4 Thesis Organization
References
2 Speech Processing Background
2.1 Tools and Techniques
2.1.1 Signal Parameterization
2.1.2 Gaussian Mixture Models
2.1.3 Hidden Markov Models
2.2 Speaker Identification and Verification
2.2.1 Modeling Speech
2.2.2 Model Adaptation
2.2.3 Supervectors with LSH
2.2.4 Reconstructing Data from LSH Keys
2.3 Speech Recognition
References
3 Privacy Background
3.1 What is Privacy?
3.1.1 Definitions
3.1.2 Related Concepts
3.1.3 Privacy-Preserving Applications
3.1.4 Privacy-Preserving Computation in this Thesis
3.2 Secure Multiparty Computation
3.2.1 Protocol Assumptions
3.2.2 Adversarial Behavior
3.2.3 Privacy Definitions: Ideal Model and Real Model
3.2.4 Encryption
3.2.5 Masking
3.2.6 Zero-Knowledge Proofs and Threshold Cryptosystems
3.2.7 Oblivious Transfer
3.2.8 Related Work on SMC Protocols for Machine Learning
3.3 Differential Privacy
3.3.1 Exponential Mechanism
3.3.2 Related Work on Differentially Private Machine Learning
3.3.3 Differentially Private Speech Processing
References
Part II
4 Overview of Speaker Verification with Privacy
4.1 Introduction
4.2 Privacy Issues and Adversarial Behavior
4.2.1 Imposter Imitating a User
4.2.2 Collusion
4.2.3 Information Leakage After Multiple Interactions
References
5 Privacy-Preserving Speaker Verification Using Gaussian Mixture Models
5.1 System Architecture
5.2 Speaker Verification Protocols
5.2.1 Private Enrollment Protocol
5.2.2 Private Verification Protocols
5.3 Experiments
5.3.1 Precision
5.3.2 Accuracy
5.3.3 Execution Time
5.4 Conclusion
5.5 Supplementary Protocols
References
6 Privacy-Preserving Speaker Verification as String Comparison
6.1 System Architecture
6.2 Protocols
6.3 Experiments
6.3.1 Accuracy
6.3.2 Execution Time
6.4 Conclusion
References
Part III Privacy-Preserving Speaker Identification
7 Overview of Speaker Identification with Privacy
7.1 Introduction
7.1.1 Speech-Based Surveillance
7.1.2 Preliminary Step for Other Speech Processing Tasks
7.2 Privacy Issues and Adversarial Behavior
7.2.1 Collusion
7.2.2 Information Leakage After Multiple Interactions
8 Privacy-Preserving Speaker Identification Using Gaussian Mixture Models
8.1 Introduction
8.2 System Architecture
8.3 Speaker Identification Protocols
8.3.1 Case 1: Client Sends Encrypted Speech Sample to the Server
8.3.2 Case 2: Server Sends Encrypted Speaker Models to the Client
8.4 Experiments
8.4.1 Precision
8.4.2 Accuracy
8.4.3 Execution Time
8.5 Conclusion
References
9 Privacy-Preserving Speaker Identification as String Comparison
9.1 Introduction
9.2 System Architecture
9.3 Protocols
9.3.1 Oblivious Salting
9.3.2 Speaker Identification
9.4 Experiments
9.4.1 Accuracy
9.4.2 Execution Time
9.5 Conclusion
References
Part IV Privacy-Preserving Speech Recognition
10 Overview of Speech Recognition with Privacy
10.1 Introduction
10.2 Client-Server Model for Speech Recognition
10.3 Privacy Issues
10.4 System Architecture
Reference
11 Privacy-Preserving Isolated-Word Recognition
11.1 Introduction
11.2 Protocol for Secure Forward Algorithm
11.2.1 Secure Logarithm Protocol
11.2.2 Secure Exponent Protocol
11.2.3 Secure Logsum Protocol
11.2.4 Secure Forward Algorithm Protocol
11.2.5 Security Analysis
11.3 Privacy-Preserving Isolated-Word Recognition
11.3.1 Simplified Secure Forward Algorithm
11.3.2 Protocol for Privacy-Preserving Isolated-Word Recognition
11.3.3 Computational Complexity
11.3.4 Practical Issues
11.3.5 Experiments
11.4 Discussion
References
Part V Conclusion
12 Thesis Conclusion
12.1 Summary of Results
12.2 Discussion
13 Future Work
13.1 Other Privacy-Preserving Speech Processing Tasks
13.1.1 Privacy Preserving Music Recognition and Keyword Spotting
13.1.2 Privacy Preserving Graph Search for Continuous Speech Recognition
13.2 Algorithmic Improvements
13.2.1 Ensemble of LSH Functions
13.2.2 Using Fully Homomorphic Encryption
References
Appendix A Differentially Private Gaussian Mixture Models
A.1 Introduction
A.2 Large Margin Gaussian Classifiers
A.2.1 Modeling Single Gaussian per Class
A.2.2 Generalizing to Mixtures of Gaussians
A.2.3 Making the Objective Function Differentiable and Strongly Convex
A.3 Differentially Private Large Margin Gaussian Mixture Models
A.4 Theoretical Analysis
A.4.1 Proof of Differential Privacy
A.4.2 Analysis of Excess Error
A.5 Experiments
A.6 Conclusion
A.7 Supplementary Proofs
References
Author Biography

Publication date (per publisher): 26 October 2012
Series: Springer Theses
Additional info: XVIII, 142 p.
Place of publication: New York
Language: English
Subject area: Computer Science, Theory / Studies, Artificial Intelligence / Robotics
Subject area: Engineering, Electrical Engineering / Power Engineering
Subject area: Engineering, Mechanical Engineering
Subject area: Engineering, Communications Engineering
Keywords: homomorphic encryption • Locality Sensitive Hashing • Secure Multiparty Computation • speaker identification • speaker verification • Speech Recognition
ISBN-10: 1-4614-4639-2 / 1461446392
ISBN-13: 978-1-4614-4639-2 / 9781461446392
PDF (watermark)
Size: 3.8 MB

DRM: Digital watermark
This eBook contains a digital watermark and is therefore personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables, and figures. A PDF can be displayed on almost all devices, but it is of limited use on small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: Read online
In addition to downloading it, you can also read this eBook online in a web browser.

Buying eBooks from abroad
For tax law reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
