Emerging Technology and Architecture for Big-data Analytics (eBook)

Anupam Chattopadhyay, Chip Hong Chang, Hao Yu (Editors)

eBook Download: PDF

2017 | 1st ed. 2017
XI, 330 pages
Springer International Publishing (Publisher)
978-3-319-54840-1 (ISBN)

This book describes the current state of the art in big-data analytics, from a technology and hardware architecture perspective. The presentation is designed to be accessible to a broad audience, with general knowledge of hardware design and some interest in big-data analytics. Coverage includes emerging technology and devices for data-analytics, circuit design for data-analytics, and architecture and algorithms to support data-analytics. Readers will benefit from the realistic context used by the authors, which demonstrates what works, what doesn't work, and what are the fundamental problems, solutions, upcoming challenges and opportunities.

Provides a single-source reference to hardware architectures for big-data analytics;
Covers various levels of big-data analytics hardware design abstraction and flow, from device, to circuits and systems;
Demonstrates how non-volatile memory (NVM) based hardware platforms can be a viable solution to existing challenges in hardware architecture for big-data analytics.

Chang Chip Hong received his B.Eng. (Hons) from the National University of Singapore in 1989, and his M.Eng. and Ph.D. from the School of Electrical and Electronic Engineering of Nanyang Technological University, Singapore, in 1993 and 1998, respectively. Since 1999, he has been with the School of Electrical and Electronic Engineering, Nanyang Technological University, where he is currently an Associate Professor. He has held concurrent appointments at the university as Assistant Chair (Alumni) of the School of EEE since June 2008, Deputy Director of the Centre for High Performance Embedded Systems (CHiPES) since 2000, and Program Director of the VLSI Design and Embedded Systems research group of the Centre for Integrated Circuits and Systems (CICS) since 2003. He has published three book chapters and more than 140 refereed international journal and conference papers. He served as an Associate Editor of the IEEE Transactions on Circuits and Systems I: Regular Papers from 2010 to 2011. He has been an Editorial Advisory Board Member of the Open Electrical and Electronic Engineering Journal since 2007 and an Editorial Board Member of the Journal of Electrical and Computer Engineering since 2008, and is a technical reviewer for several prestigious international journals. He was appointed a Charter Fellow of the Advisory Directorate International by the American Biographical Institute, Inc. (ABI) and has been listed in the Marquis Who's Who in the World since 2008. He is a Senior Member of the IEEE and a Fellow of the IET.

Hao Yu obtained his B.S. degree from Fudan University (Shanghai, China) in 1999, with a 4-year first-prize Guanghua scholarship (top-2) and a 1-year Samsung scholarship for the outstanding student in science and engineering (top-1). After being selected by the mini-CUSPEA program, he spent some time at New York University, and obtained both his M.S. and Ph.D. degrees from the electrical engineering department at UCLA in 2007, majoring in integrated circuits and embedded computing. From 2006, he was a senior research staff member at Berkeley Design Automation (BDA), one of the top-100 Silicon Valley start-ups selected by Red Herring. Since October 2009, he has been an Assistant Professor at the School of Electrical and Electronic Engineering and an area director of the VIRTUS/VALENS Centre of Excellence, Nanyang Technological University (NTU), Singapore.

Anupam Chattopadhyay received his B.E. degree from Jadavpur University, India, in 2000. He received his M.Sc. from ALaRI, Switzerland, and his Ph.D. from RWTH Aachen in 2002 and 2008, respectively. From 2008 to 2009, he worked as a Member of Consulting Staff in CoWare R&D, Noida, India. From 2010 to 2014, he led the MPSoC Architectures Research Group at RWTH Aachen, Germany, as a Junior Professor. Since September 2014, he has been an Assistant Professor in SCE, NTU. During his Ph.D., he worked on automatic RTL generation from the architecture description language LISA, which was later commercialized by a leading EDA vendor. He developed several high-level optimizations and verification flows for embedded processors. In his doctoral thesis, he proposed a language-based modeling, exploration, and implementation framework for partially reconfigurable processors. Together with his doctoral students, he proposed domain-specific high-level synthesis for cryptography, high-level reliability estimation flows, generalization of classic linear algebra kernels, and a novel multi-layered coarse-grained reconfigurable architecture. In these areas, he has (co-)authored over 80 conference and journal papers, several book chapters, and a book. Anupam has served on several TPCs of top conferences, regularly reviews journal and conference articles, and has presented multiple invited seminars and tutorials at prestigious venues. He is a member of the ACM and a senior member of the IEEE.

Preface 5
Contents 7
About the Editors 9
Part I State-of-the-Art Architectures and Automation for Data-Analytics 12
1 Scaling the Java Virtual Machine on a Many-Core System 13
1.1 Introduction 13
1.2 Background 17
1.2.1 Workload Selection 18
1.2.2 Performance Analysis Tools 19
1.2.3 Experimental Setup 22
1.3 Thread-Local Data Objects 25
1.4 Memory Allocators 26
1.5 Java Concurrency API 28
1.6 Garbage Collection 29
1.7 Non-uniform Memory Access (NUMA) 30
1.8 Conclusion and Future Directions 32
Appendix 32
References 33
2 Accelerating Data Analytics Kernels with Heterogeneous Computing 35
2.1 Introduction 35
2.2 Motivation 38
2.3 Automated Design Space Exploration Flow 40
2.3.1 The Lin-Analyzer Framework 40
2.3.2 Framework Overview 41
2.3.3 Instrumentation 42
2.3.4 Optimized DDDG Generation 42
2.3.4.1 Sub-trace Extraction 43
2.3.4.2 DDDG Generation & Pre-optimizations
2.3.5 DDDG Scheduling 44
2.3.6 Enabling Design Space Exploration 45
2.4 Acceleration of Data Analytics Kernels 50
2.4.1 Estimation Accuracy 51
2.4.1.1 Loop Unrolling and Loop Pipelining 51
2.4.1.2 Array Partitioning 52
2.4.2 Rapid Design Space Exploration 53
2.5 Conclusion 56
References 57
3 Least-squares-solver Based Machine Learning Accelerator for Real-time Data Analytics in Smart Buildings 60
3.1 Introduction 60
3.2 IoT System Based Smart Building 62
3.2.1 Smart-Grid Architecture 62
3.2.2 Smart Gateway for Real-Time Data Analytics 62
3.2.3 Problem Formulation for Data Analytics 63
3.3 Background on Neural Network Based Machine Learning 63
3.3.1 Backward Propagation for Training 64
3.3.2 Least-Squares Solver for Training 66
3.3.3 Feature Extraction with Behavior Cognition 66
3.4 Least-Squares Solver Based Training Algorithm 68
3.4.1 Regularized 2-Norm 68
3.4.2 Square-Root-Free Cholesky Decomposition 69
3.4.3 Incremental Least-Squares Solution 70
3.5 Least-Squares Based Machine Learning Accelerator Architecture 71
3.5.1 Overview of Computing Flow and Communication 71
3.5.2 FPGA Accelerator Architecture 73
3.5.3 2-Norm Solver 73
3.5.4 Matrix–Vector Multiplication 75
3.6 Experiment Results 75
3.6.1 Experiment Setup and Benchmark 75
3.6.2 FPGA Design Platform and CAD Flow 77
3.6.3 Scalable and Parameterized Accelerator Architecture 78
3.6.4 Performance for Data Classification 81
3.6.5 Performance for Load Forecasting 81
3.6.6 Performance Comparisons with Other Platforms 82
3.7 Conclusion 83
References 84
4 Compute-in-Memory Architecture for Data-Intensive Kernels 86
4.1 Introduction 86
4.2 Malleable Hardware Acceleration 88
4.2.1 Hardware Architecture 88
4.2.2 Application Mapping 90
4.2.2.1 Application Description Using an Instruction Set Architecture 90
4.2.2.2 Application Mapping to the General Framework 92
4.2.3 Domain Customization for Efficient Acceleration 92
4.3 Case Studies for Memory-Centric Computing 93
4.3.1 MAHA for Security Applications 93
4.3.1.1 Domain Exploration 94
4.3.1.2 Architecture Description 94
4.3.1.3 Results and Comparison to Other Platforms 96
4.3.2 MAHA for Text Mining Applications 97
4.3.2.1 Domain Exploration 98
4.3.2.2 Architecture Description 99
4.3.2.3 Results and Comparison to Other Platforms 101
4.4 Case Studies for In-Memory Computing 101
4.4.1 Flash-Based MAHA 102
4.4.1.1 Domain Exploration 102
4.4.1.2 Architecture Description 104
4.4.1.3 Results and Comparison to Other Platform 106
4.4.2 MultiFunctional Memory 107
4.4.2.1 Architecture Description 107
4.4.2.2 Results and Comparison to Other Platforms 109
4.5 Conclusion 109
References 110
5 New Solutions for Cross-Layer System-Level and High-Level Synthesis 111
5.1 Introduction 111
5.2 ESL Design Flow Challenges 113
5.3 System-/High-Level Synthesis Techniques 116
5.3.1 Polyhedral Transformation to Improve HLS Optimization Opportunity 117
5.3.1.1 Step 1 118
5.3.1.2 Step 2 119
5.3.1.3 Step 3 121
5.3.1.4 Evaluation 121
5.3.2 Polyhedral Code Generation for High-Level Synthesis 122
5.3.2.1 Turning Off Polyhedra Separation 123
5.3.2.2 Division Optimization 124
5.3.2.3 Hierarchical Min/Max Operations 125
5.3.2.4 Loop-Tiling Bound Simplification 125
5.3.2.5 Experimental Results 127
5.3.3 Multi-Cycle Path Analysis for High-Level Synthesis 128
5.3.3.1 Circuit States and Control-States 129
5.3.3.2 Capturing Conditional Behavior in the STG 130
5.3.3.3 Data Dependency Analysis 131
5.3.3.4 Available Cycles Calculation 132
5.3.3.5 Multi-cycle Constraints Generation 132
5.3.3.6 Evaluation 132
5.3.4 Layout-Driven High-Level Synthesis for FPGAs 133
5.3.4.1 Component Pre-characterization 135
5.3.4.2 Initialization Stage 135
5.3.4.3 Iteration Stage 136
5.3.4.4 Evaluation 138
5.4 Conclusion 140
References 140
Part II Approaches and Applications for Data Analytics 143
6 Side Channel Attacks and Their Low Overhead Countermeasures on Residue Number System Multipliers 144
6.1 Introduction 144
6.2 Preliminaries 145
6.2.1 Power Analysis and Related Countermeasures 146
6.2.1.1 Power Analysis 146
6.2.1.2 Power Analysis Countermeasures 147
6.2.2 RNS Modular Multiplier 147
6.2.2.1 Residue Number System 147
6.2.2.2 RNS Modular Multiplication 149
6.2.2.3 Leakage Resistant Arithmetic 151
6.3 Attacks on the RNS Modular Multiplier 151
6.3.1 Attack Assumptions 151
6.3.2 Limited Randomness 152
6.3.3 Zero Collision Attack 153
6.3.4 Attacks on Mask Initialization 155
6.3.5 Channel Reduction Leakage 157
6.4 Countermeasures 158
6.4.1 Enlarged Coprime Pool 158
6.4.2 Plus-N Randomness 159
6.4.3 Initialization Shuffling 160
6.4.4 Random Padding 161
6.4.5 Channel Task Shuffling 161
6.5 Implementation 162
6.6 Discussion 164
6.7 Conclusion 164
References 164
7 Ultra-Low-Power Biomedical Circuit Design and Optimization: Catching the Don't Cares 166
7.1 Introduction 166
7.2 How Can We Beat the State of the Art? 168
7.3 Information Processing Capacity 169
7.3.1 Information-Theoretic Modeling 169
7.3.2 Soft Channel Selection 173
7.3.3 Robust Data Processing 174
7.4 Case Study: Brain–Computer Interface 176
7.4.1 System Design 176
7.4.2 Experimental Results 177
7.5 Summary 179
References 179
8 Acceleration of MapReduce Framework on a Multicore Processor 181
8.1 Introduction 181
8.2 MapReduce Framework on Multicore Processors 182
8.2.1 Introduction to MapReduce 182
8.2.2 Related Work 182
8.2.3 Experimental Platform 184
8.3 Accelerating Algorithms Based on MapReduce in Multicore Processors 185
8.3.1 Acceleration of PageRank Algorithm 185
8.3.1.1 Math Model of PageRank 185
8.3.1.2 Hardware Accelerator for Pagerank 185
8.3.2 Acceleration of Naive-Bayes Algorithm 186
8.3.2.1 Math Model of Naive-Bayes Algorithm 187
8.3.2.2 Hardware Accelerator for Naive-Bayes 187
8.3.2.3 Task Mapping Scheme: Topo-MapReduce 188
8.4 Configurable MapReduce Acceleration Framework 190
8.4.1 High Throughput Data Transferring 191
8.5 Experiment Result Analysis 193
8.5.1 Pagerank with Hardware Accelerations 193
8.5.2 Topo-Mapreduce 193
8.5.3 Configurable Mapreduce Acceleration Framework 194
8.6 Conclusion 195
References 195
9 Adaptive Dynamic Range Compression for Improving Envelope-Based Speech Perception: Implications for Cochlear Implants 197
9.1 Introduction 197
9.2 Speech Processor in CI Devices 198
9.3 Vocoder-Based Speech Synthesis 200
9.4 Compression Scheme 201
9.4.1 The Static Envelope Compression Strategy 201
9.4.2 The Adaptive Envelope Compression Strategy 202
9.5 Experiments and Results 205
9.5.1 Experiment-1: The Speech Perception Performance of AEC in Noise 205
9.5.1.1 Subjects and Materials 205
9.5.1.2 Procedure 206
9.5.1.3 Results and Discussion 206
9.5.2 Experiment-2: The Speech Perception Performance of AEC in Reverberation 208
9.5.2.1 Subjects and Materials 208
9.5.2.2 Procedure 208
9.5.2.3 Results and Discussion 208
9.5.3 Experiment-3: The Effect of Adaptation Rate on the Intelligibility of AEC-Processed Speech 210
9.5.3.1 Subjects and Materials 210
9.5.3.2 Procedure 210
9.5.3.3 Results and Discussion 211
9.5.4 Experiment-4: The Effect of Joint Envelope Compression and Noise Reduction 213
9.5.4.1 Subjects and Materials 213
9.5.4.2 Signal Processing with NR and Envelope Compression 214
9.5.4.3 Procedure 215
9.5.4.4 Results and Discussion 216
9.6 Summary 217
References 218
Part III Emerging Technology, Circuits and Systems for Data-Analytics 221
10 Neuromorphic Hardware Acceleration Enabled by Emerging Technologies 222
10.1 Introduction 222
10.2 Background 224
10.2.1 Neural Network 224
10.2.2 Memristor Preliminaries 225
10.2.3 Memristor Array 226
10.3 Design Methodology 227
10.3.1 Weight Mapping 227
10.3.1.1 Mapping Method for BSB System 227
10.3.1.2 Mapping Method for Feedforward System 229
10.3.2 Training Algorithm Optimization 230
10.3.3 Recall Component Optimization 232
10.3.3.1 BSB Recall Implementation 232
10.3.3.2 FFW Active Function Implementation 233
10.4 Simulation and Evaluation 236
10.4.1 BSB System Evaluation 236
10.4.1.1 BSB training 236
10.4.1.2 BSB Recall 241
10.4.2 FFW System Evaluation 244
10.4.2.1 FFW Recall 245
10.5 Conclusion 247
References 247
11 Energy Efficient Spiking Neural Network Design with RRAM Devices 250
11.1 Introduction 250
11.2 Preliminaries 252
11.2.1 Spike Neurons 252
11.2.2 RRAM Device Characteristics 253
11.3 Training Scheme of SNN 255
11.3.1 Spike Timing Dependent Plasticity (STDP) 255
11.3.2 Remote Supervision Method (ReSuMe) 256
11.3.3 Neural Sampling Learning Scheme 256
11.4 RRAM-Based Spiking Learning System 257
11.4.1 Unsupervised Feature Extraction+Supervised Classifier 257
11.4.2 Transferring ANN to SNN: Neural Sampling Method 259
11.4.3 Discussion on How to Boost the Accuracy of SNN 261
11.5 Conclusion 262
References 263
12 Efficient Neuromorphic Systems and Emerging Technologies: Prospects and Perspectives 265
12.1 Introduction 265
12.2 Neural Network Basics 266
12.3 General Purpose Computing Architecture 268
12.4 Underlying Device Physics 270
12.5 Proposals for Spintronic Neuromimetic Devices 272
12.6 Crossbar based "In-Memory" Computing Architecture 274
12.7 Conclusions 277
References 277
13 In-Memory Data Compression Using ReRAMs 279
13.1 LZ77 Compression Algorithm 280
13.2 ReVAMP Architecture for In-Memory Computing 281
13.2.1 Comparator Design 284
13.2.1.1 Analysis 286
13.2.2 Priority Multiplexer Design 286
13.3 LZ77 Compression Using ReVAMP 287
13.4 Performance Estimation 292
13.5 Related Works 292
13.6 Summary 293
References 293
14 Big Data Management in Neural Implants: The Neuromorphic Approach 296
14.1 Introduction: Brain as a Source of Big Data 296
14.2 The Nature of Neural Data 297
14.3 System Architectures for Neural Spike Recording Systems: Neuromorphic Compression Schemes 298
14.3.1 Compression Mode 1: Spike Detection 300
14.3.2 Compression Mode 2: Spike Sorting 302
14.3.3 Compression Mode 3: Intention Decoding 303
14.3.3.1 Algorithm: Extreme Learning Machine 303
14.3.3.2 Chip Architecture 305
14.3.3.3 Measurement Results 307
14.4 Conclusion and Discussions 310
References 311
15 Data Analytics in Quantum Paradigm: An Introduction 315
15.1 Introduction 315
15.1.1 Basics of a Qubit and the Algebra 316
15.1.2 Quantum Gates 317
15.1.3 No Cloning 318
15.2 A Brief Overview of Advantages in Quantum Paradigm 320
15.2.1 Teleportation 320
15.2.2 Deutsch-Jozsa Algorithm 321
15.3 Preliminaries of Quantum Cryptography 322
15.3.1 Quantum Key Distribution and the BB84 Protocol 324
15.3.2 Secure Multi-Party Computation 325
15.4 Data Analytics: A Critical View of Quantum Paradigm 326
15.4.1 Related Quantum Algorithms 326
15.4.2 Database 327
15.4.3 Text Mining 328
15.5 Conclusion: Google, PageRank, and Quantum Advantage 329
References 330

Publication date (per publisher)	19 April 2017
Additional info	XI, 330 p. 162 illus., 98 illus. in color.
Place of publication	Cham
Language	English
Subject areas	Mathematics / Computer Science ► Computer Science
	Engineering ► Electrical Engineering / Power Engineering
	Business / Economics
Keywords	Big Data Analytics • exascale computing • next-generation data analytics • non-volatile memory based hardware • Real-Time Big Data Analytics
ISBN-10	3-319-54840-9 / 3319548409
ISBN-13	978-3-319-54840-1 / 9783319548401

PDF (watermark)
Size: 12.7 MB

DRM: Digital watermark
This eBook contains a digital watermark and is therefore personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suitable for technical books with columns, tables, and figures. A PDF can be displayed on almost all devices, but is only of limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: Online reading
In addition to downloading this eBook, you can also read it online in your web browser.

Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.

Print edition

Book | Hardcover

€160.49