New Algorithms, Architectures and Applications for Reconfigurable Computing (eBook)

eBook Download: PDF
2005
XVIII, 314 pages
Springer US (publisher)
978-1-4020-3128-1 (ISBN)

149,79 incl. VAT
  • Download available immediately

New Algorithms, Architectures and Applications for Reconfigurable Computing consists of a collection of contributions from the authors of some of the best papers from the Field Programmable Logic conference (FPL'03) and the Design, Automation and Test in Europe conference (DATE'03). In all, seventy-nine authors, from research teams from all over the world, were invited to present their latest research in the extended format permitted by this special volume. The result is a valuable book that is a unique record of the state of the art in research into field programmable logic and reconfigurable computing.

The contributions are organized into twenty-four chapters and are grouped into three main categories: architectures, tools and applications. Within these three broad areas the most strongly represented themes are coarse-grained architectures; dynamically reconfigurable and multi-context architectures; tools for coarse-grained and reconfigurable architectures; networking, security and encryption applications.

Field programmable logic and reconfigurable computing are exciting research disciplines that span the traditional boundaries of electronic engineering and computer science. When the skills of both research communities are combined to address the challenges of a single research discipline, they serve as a catalyst for innovative research. The work reported in the chapters of this book captures the spirit of that innovation.


Contents
Introduction
About the Editors
Acknowledgements
Architectures
1 Extra-dimensional Island-Style FPGAs Herman Schmit
1.1 Architecture
1.2 Experimental Evaluation
1.3 Time Multiplexing and Forward-compatibility
1.4 Conclusions
References
2 A Tightly Coupled VLIW/Reconfigurable Matrix and its Modulo Scheduling Technique
2.1 Introduction
2.2 ADRES Architecture
2.2.1 Architecture Description
2.2.2 Improved Performance with the VLIW Processor
2.2.3 Simplified Programming Model and Reduced Communication Cost
2.2.4 Resource Sharing
2.3 Modulo Scheduling
2.3.1 Problem Illustrated
2.3.2 Modulo Routing Resource Graph
2.3.3 Modulo Scheduling Algorithm
2.4 Experimental Results
2.5 Conclusions and Future Work
References
3 Stream-based XPP Architectures in Adaptive System-on-Chip Integration
3.1 Introduction
3.2 Stream-based XPP Architecture
3.2.1 Array Concept and Datapath Structure
3.2.2 Stream Processing and Selfsynchronization
3.2.3 Configuration Handling
3.3 Adaptive XPP-based System-on-Chip
3.4 XPP64A: First-Time-Right-Silicon
3.5 Application Evaluation: Examples
3.6 Conclusions
References
4 Core-Based Architecture for Data Transfer Control in SoC Design
4.1 Introduction
4.2 Digital Systems with Very Time Consuming Data Exchange Requirements. Design Alternatives
4.3 System on a Reprogrammable Chip Design Methodology
4.4 SoRC Core-Based Architecture
4.4.1 Communication Bus IP Cores
4.4.2 Data Transfer Bus IP Cores
4.4.3 Main Processor Bus IP Cores
4.5 Verification and Analysis User Interface
4.6 Results and Conclusions
References
5 Customizable and Reduced Hardware Motion Estimation Processors
5.1 Introduction
5.2 Base FSBM Architecture
5.3 Architectures for Limited Resources Devices
5.3.1 Decimation at the Pixel Level
5.3.2 Reduction of the Precision of the Pixel Values
5.4 Implementation and Experimental Results
5.5 Conclusion
References
Methodologies and Tools
6 Enabling Run-time Task Relocation on Reconfigurable Systems
6.1 Hardware/Software Multitasking on a Reconfigurable Computing Platform
6.2 Uniform Communication Scheme
6.3 Unified Design of Hardware and Software with OCAPI-xl
6.4 Heterogeneous Context Switch Issues
6.5 Relocatable Video Decoder
6.5.1 The T-ReCS Gecko Demonstrator
6.5.2 The Video Decoder
6.5.3 Results
6.6 Conclusions
References
7 A Unified Codesign Environment
7.1 Related Work
7.2 System Architecture
7.2.1 Task Model
7.2.2 Task Manager Program
7.3 Codesign Environment
7.4 Implementation in the UltraSONIC Platform
7.5 A Case Study of FFT Algorithm
7.6 Conclusions
References
8 Mapping Applications to a Coarse Grain Reconfigurable System
8.1 Introduction
8.2 The Target Architecture: MONTIUM
8.3 A Four-Phase Decomposition
8.4 Translating C to a CDFG
8.5 Clustering
8.6 Scheduling
8.7 Allocation
8.8 Conclusion
8.9 Related work
References
9 Compilation and Temporal Partitioning for a Coarse-grain Reconfigurable Architecture
9.1 Introduction
9.2 The XPP Architecture and the Configure-Execute Paradigm
9.3 Compilation
9.4 Experimental Results
9.5 Related Work
9.6 Conclusions
References
10 Run-time Defragmentation for Dynamically Reconfigurable Hardware
Introduction
Dynamic Relocation
Rearranging Routing Resources
Conclusion
References
11 Virtual Hardware Byte Code as a Design Platform for Reconfigurable Embedded Systems
11.1 Introduction
11.1.1 State of the Art
11.1.2 Our Approach
11.2 The Virtual Hardware Byte Code
11.3 The Byte Code Compiler
11.4 The Virtual Hardware Machine
11.5 Results
11.6 Conclusions and Future Work
References
12 A Low Energy Data Management for Multi-Context Reconfigurable Architectures
12.1 Introduction
12.2 Architecture and Framework Overview
12.3 Problem Overview
12.4 Low Energy RC-RAM Management
12.5 Low Energy FB Management
12.6 Low Energy CM Management
12.7 Experimental Results
12.8 Conclusions
References
13 Dynamic and Partial Reconfiguration in FPGA SoCs: Requirements Tools and a Case Study
13.1 Introduction
13.2 Requirements for FPGA SoC DRSs
13.3 Tools for DRS
13.4 A DRS Case Study: Design and Experimental Results
13.5 Conclusions
References
Applications
14 Design Flow for a Reconfigurable Processor Implementation of a Turbo-decoder
14.1 Introduction
14.2 Related Work
14.3 Design Flow for the Reconfigurable Processor
14.4 Design Tools for the Reconfigurable Processor
14.5 Case Study: Turbo Decoding
14.6 Conclusions
References
15 IPsec-Protected Transport of HDTV over IP
15.1 Introduction
15.2 GRIP System Architecture
15.3 GRIP Hardware
15.3.1 Basic platform
15.3.2 X1/X2 IPsec Accelerator Cores
15.4 Integrating GRIP with the Operating System
15.5 Example Application: Encrypted Transport of HDTV over IP
15.5.1 Background
15.5.2 Design and Implementation
15.6 Related Work
15.7 Results
15.7.1 System Performance
15.7.2 Evaluating Hardware Implementations
15.8 Conclusions and Future Work
References
16 Fast, Large-scale String Match for a 10 Gbps FPGA-based NIDS
16.1 Introduction
16.2 Architecture of Pattern Matching Subsystem
16.2.1 Pipelined Comparator
16.2.2 Pipelined Encoder
16.2.3 Packet Data Fan-out
16.2.4 VHDL Generator
16.3 Evaluation Results
16.3.1 Performance
16.3.2 Cost: Area and Latency
16.4 Comparison with Previous Work
16.5 Conclusions and Future Work
References
17 Architecture and FPGA Implementation of a Digit-serial RSA Processor Alessandro Cilardo, Antonino Mazzeo, Luigi Romano, Giacinto Paolo Saggese
17.1 Algorithm Used for the RSA Processor
17.2 Architecture of the RSA Processor
17.3 FPGA Implementation and Performance Analysis
17.4 Related Work
17.5 Conclusions
References
18 Division in GF(p) for Application in Elliptic Curve Cryptosystems on Field Programmable Logic
18.1 Introduction
18.2 Elliptic Curve Cryptography over GF(p)
18.3 Modular Inversion
18.4 Modular Division
18.5 Basic Division Architecture
18.6 Proposed Carry-Select Division Architecture
18.7 Results
18.8 Conclusions
References
19 A New Arithmetic Unit in GF(2^m) for Reconfigurable Hardware Implementation
19.1 Introduction
19.2 Mathematical Background
19.2.1 GF(2^m) Field Arithmetic for ECC
19.2.2 GF(2^m) Field Arithmetic for ECC
19.3 A New Dependence Graph for Both Division and Multiplication in GF(2^m)
19.3.1 Dependence Graph for Division in GF(2^m)
19.3.2 DG for MSB-first Multiplication in GF(2^m)
19.3.3 A New DG for Both Division and Multiplication in GF(2^m)
19.4 A New AU for Both Division and Multiplication in GF(2^m)
19.5 Results and Conclusions
References
20 Performance Analysis of SHACAL-1 Encryption Hardware Architectures Maire McLoone, J.V. McCanny
20.1 Introduction
20.2 A Description of the SHACAL-1 Algorithm
20.2.1 SHACAL-1 Decryption
20.3 SHACAL-1 Hardware Architectures
20.3.1 Iterative SHACAL-1 Architectures
20.3.2 Fully and Sub-Pipelined SHACAL-1 Architectures
20.4 Performance Evaluation
20.5 Conclusions
References
21 Security Aspects of FPGAs in Cryptographic Applications
21.1 Introduction and Motivation
21.2 Shortcomings of FPGAs for Cryptographic Applications
21.2.1 Why does Someone Wants to Attack FPGAs?
21.2.2 Description of the Black Box Attack
21.2.3 Cloning of SRAM FPGAs
21.2.4 Description of the Readback Attack
21.2.5 Reverse-Engineering of the Bitstreams
21.2.6 Description of Side Channel Attacks
21.2.7 Description of Physical Attacks
21.3 Prevention of Attacks
21.3.1 How to Prevent Black Box Attacks
21.3.2 How to Prevent Cloning of SRAM FPGAs
21.3.3 How to Prevent Readback Attacks
21.3.4 How to Prevent Side Channel Attack
21.3.5 How to Prevent Physical Attacks
21.4 Conclusions
References
22 Bioinspired Stimulus Encoder for Cortical Visual Neuroprostheses
22.1 Introduction
22.2 Model Architecture
22.2.1 Retina Early Layers
22.2.2 Neuromorphic Pulse Coding
22.3 FPL Implementation
22.3.1 The Retina Early Layers
22.3.2 Neuromorphic Pulse Coding
22.4 Experimental Results
22.5 Conclusions
References
23 A Smith-Waterman Systolic Cell
23.1 Introduction
23.2 The Smith-Waterman Algorithm
23.3 FPGA Implementation
23.4 Results
23.5 Conclusion
References
24 The Effects of Polynomial Degrees
24.1 Background
24.2 The Hierarchical Segmentation Method
24.3 The Effects of Polynomial Degrees
24.4 Evaluation and Results
24.5 Conclusion
References

11.3 The Byte Code Compiler (pp. 137-138)

The Byte Code Compiler is a central part of the VHBC approach, because it provides the means to compile working hardware designs, coded as a VHDL description, into a portable and efficient VHBC representation, thus removing the need to redesign working hardware projects. The tool flow within the VHDL compiler can be divided into three main stages: hardware synthesis, net list to byte code conversion, and byte code optimization and scheduling.


In the first stage the VHDL description is compiled into a net list of standard components, and standard logic optimization is applied to it, resulting in an optimized net list. The design of the compiler chain can be streamlined through the use of off-the-shelf hardware synthesis tools. Current implementations of the VHDL compiler use, for example, the FPGA Express tool from Synopsys. These tools produce the anticipated code using a fairly standardized component library, in the case of FPGA Express the SimPrim library from Xilinx. The resulting output of the first stage is converted to structural VHDL and passed on to the second stage. Most standard industry VHDL compilers with support for FPGA design readily provide the functionality needed for this step and can therefore be used.

In the second stage the components of the net list are substituted by VHBC fragments to form a VHBC instruction stream. Before the components are mapped to a VHBC representation, however, the net list is analyzed and optimized for VHBC. The optimization is necessary because commercial compilers targeting FPGAs usually output designs that contain large numbers of buffers to enhance signal integrity, which would otherwise be impaired by the routing of the signals. Furthermore, compilers tend to employ logic representations based on NAND or NOR gates, which are more efficient when cast into silicon.

However, the resulting logic structure is more complex and has more logic levels. The code fragments substituted for the logic components are based on predefined, general VHBC implementations of those components and are adjusted to the data flow found in the structural description from the first stage: registers are allocated and the instructions are sequenced according to the inherent data dependencies.
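The second stage can be illustrated with a minimal Python sketch. It is not the VHBC compiler itself: the gate tuples, the instruction mnemonics (`and`, `or`, `not`), the `FRAGMENTS` table and the register names are all hypothetical, since the excerpt does not list the VHBC instruction set. The sketch assumes two-input gates and buffers that drive only internal nets. It strips buffers, orders the remaining gates by data dependency, allocates one register per net, and replaces each gate by a predefined instruction fragment.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical net list entry: (gate kind, output net, tuple of input nets).

def strip_buffers(gates):
    """Drop BUF gates and rewire their consumers to the buffered source."""
    alias = {out: ins[0] for kind, out, ins in gates if kind == "BUF"}

    def resolve(net):
        while net in alias:  # follow chains of buffers
            net = alias[net]
        return net

    return [(kind, out, tuple(resolve(n) for n in ins))
            for kind, out, ins in gates if kind != "BUF"]

# Hypothetical fragments: each gate kind expands to one or more instructions
# of the form (opcode, destination register, source registers...).
FRAGMENTS = {
    "AND":  lambda d, a, b: [("and", d, a, b)],
    "OR":   lambda d, a, b: [("or", d, a, b)],
    "NOT":  lambda d, a: [("not", d, a)],
    "NAND": lambda d, a, b: [("and", d, a, b), ("not", d, d)],
    "NOR":  lambda d, a, b: [("or", d, a, b), ("not", d, d)],
}

def to_bytecode(gates):
    """Turn a net list into a dependency-ordered instruction stream."""
    gates = strip_buffers(gates)
    producer = {out: (kind, ins) for kind, out, ins in gates}
    # A gate depends on every input net that is itself produced by a gate.
    deps = {out: [n for n in ins if n in producer] for _, out, ins in gates}

    registers = {}

    def reg(net):
        # Allocate a fresh register the first time a net is seen.
        return registers.setdefault(net, f"r{len(registers)}")

    code = []
    for out in TopologicalSorter(deps).static_order():
        kind, ins = producer[out]
        sources = [reg(n) for n in ins]
        code += FRAGMENTS[kind](reg(out), *sources)
    return code

netlist = [
    ("NAND", "n1", ("a", "b")),
    ("BUF", "n2", ("n1",)),
    ("NOT", "y", ("n2",)),
]
bytecode = to_bytecode(netlist)
```

The fragments here are deliberately general and produce redundant instructions (for example a NAND followed by a NOT), which is the kind of slack a later optimization pass can remove.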

In the third stage the byte code sequence is optimized and scheduled into blocks of independent instructions. First, the data flow graph of the entire design is constructed, which is possible because there are no control flow instructions such as jumps. The code fragments introduced in the second stage are very general, so the resulting code leaves considerable room for code optimization techniques. One such technique is dead code elimination, which removes unnecessary instructions. The code is further optimized by applying predefined code substitution rules along the data paths, such as XOR extraction or double-negation removal, to reduce the number of instructions and compact the code.
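Two of the optimizations named above, double-negation removal and dead code elimination, can be sketched on straight-line code as follows. This is an illustrative Python sketch under stated assumptions, not the compiler's actual passes: the instruction format `(opcode, destination, sources...)` is hypothetical, and the code is assumed to be in single-assignment form (every destination is written exactly once), which keeps both passes simple. XOR extraction is not shown.

```python
def forward_double_negation(code):
    """Redirect uses of not(not(x)) to x.

    The redundant 'not' instructions are left in place; they become dead
    once nothing reads them and are removed by eliminate_dead_code.
    """
    not_source = {}  # destination of a 'not' -> the value it negates
    alias = {}       # destination of a double negation -> original value
    rewritten = []
    for op, dst, *srcs in code:
        srcs = [alias.get(s, s) for s in srcs]
        if op == "not":
            inner = srcs[0]
            if inner in not_source:
                alias[dst] = not_source[inner]  # dst == not(not(x)) == x
            not_source[dst] = inner
        rewritten.append((op, dst, *srcs))
    return rewritten

def eliminate_dead_code(code, outputs):
    """Keep only instructions whose result reaches one of the outputs."""
    live = set(outputs)
    kept = []
    for op, dst, *srcs in reversed(code):
        if dst in live:
            kept.append((op, dst, *srcs))
            live.update(srcs)
    kept.reverse()
    return kept

program = [
    ("not", "t1", "a"),
    ("not", "t2", "t1"),       # t2 == a
    ("and", "t3", "t2", "b"),
    ("or",  "t4", "a", "b"),   # never used
]
optimized = eliminate_dead_code(forward_double_negation(program), ["t3"])
```

Because the code has no jumps, a single backward sweep is enough to find every live instruction; with control flow, liveness would need an iterative analysis.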

The optimized code is then scheduled using a list-based scheduling scheme [14]. The objective of the scheduling is to group the instructions into code blocks such that the number of code blocks is minimal and the instructions are evenly distributed among the code blocks. Furthermore, the idle time of data, i.e. the number of clock cycles between the calculation of a datum and its use in another operation, should be minimal. The scheduled code is then converted to the VHBC image format and the compiler flow concludes.
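A simple greedy list scheduler gives a feel for how instructions can be grouped into blocks of mutually independent instructions. The sketch below is an assumption-laden Python illustration, not the scheme from reference [14]: the instruction format `(opcode, destination, sources...)` and the `width` limit on block size are hypothetical, and ready instructions are picked in program order, whereas a production scheduler would typically use a priority function and would also try to balance block sizes and shorten data lifetimes.

```python
def list_schedule(code, width):
    """Pack instructions into blocks of at most `width` independent instructions.

    An instruction becomes ready once every instruction producing one of
    its sources has been placed in an earlier block.
    """
    producer = {ins[1]: i for i, ins in enumerate(code)}
    deps = [{producer[s] for s in ins[2:] if s in producer} for ins in code]

    done = set()
    remaining = list(range(len(code)))
    blocks = []
    while remaining:
        ready = [i for i in remaining if deps[i] <= done]
        if not ready:
            raise ValueError("cyclic dependency between instructions")
        chosen = ready[:width]  # program order serves as the priority here
        blocks.append([code[i] for i in chosen])
        done.update(chosen)
        remaining = [i for i in remaining if i not in done]
    return blocks

program = [
    ("and", "t1", "a", "b"),
    ("or",  "t2", "c", "d"),
    ("xor", "t3", "t1", "t2"),
    ("not", "t4", "a"),
]
blocks = list_schedule(program, width=2)
```

With `width=2` the four instructions fall into two blocks of two, so the `not` instruction fills the slot next to the `xor` instead of forcing a third block.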

Publication date (per publisher) 5.12.2005
Additional info XVIII, 314 p.
Place of publication New York
Language English
Subject areas Mathematics / Computer Science, Computer Science, Software Development
Computer Science, Further Topics, CAD Programs
Technology, Electrical Engineering / Power Engineering
Keywords Computer • Configuration • Embedded System • field programmable gate arrays • FPGA • Hardware • Logic • Performance • Programmable Logic • reconfigurable computing • RSA • System on chip (SoC) • Tools
ISBN-10 1-4020-3128-9 / 1402031289
ISBN-13 978-1-4020-3128-1 / 9781402031281
PDF (watermark)
Size: 4.0 MB

DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suitable for technical books with columns, tables and figures. A PDF can be displayed on almost all devices, but is only of limited use on small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: Online reading
In addition to downloading it, you can also read this eBook online in a web browser.

Buying eBooks from abroad
For tax law reasons we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
