Embedded DSP Processor Design (eBook)
808 pages
Elsevier Science (publisher)
978-0-08-056987-1 (ISBN)
Coverage includes the design of internal-external data types, application-specific instruction sets, and microarchitectures (including datapath and control-path designs), as well as memory subsystems. Integration and verification of a DSP-ASIP processor are discussed and reinforced with extensive examples.
FOR INSTRUCTORS: To obtain access to the solutions manual for this title, simply register on our textbook website (textbooks.elsevier.com) and request access to the Computer Science or Electronics and Electrical Engineering subject area. Once approved (usually within one business day), you will be able to access all of the instructor-only materials through the "Instructor Manual" link on this book's full web page.
* Instruction set design for application-specific processors based on fast application profiling
* Microarchitecture design methodology
* Microarchitecture design details based on real examples
* Extendable architecture design protocols
* Design for efficient memory subsystems (minimizing on-chip memory and cost)
* Real example designs based on extensive industrial experience
This book provides design methods for Digital Signal Processors and Application Specific Instruction set Processors, based on the author's extensive, industrial design experience. Top-down and bottom-up design methodologies are presented, providing valuable guidance for both students and practicing design engineers.
Front Cover 1
Embedded DSP Processor Design 4
Copyright Page 5
Table of Contents 8
Preface 20
List of Trademarks and Product Names 26
Chapter 1. Introduction 28
1.1 How to Read the Book 28
1.2 DSP Theory for Hardware Designers 32
1.2.1 Review of DSP Theory and Fundamentals 32
1.2.2 ADC and Finite-length Modeling 33
1.2.3 Digital Filters 35
1.2.4 Transform 37
1.2.5 Adaptive Filter and Signal Enhancement 39
1.2.6 Random Process and Autocorrelation 41
1.3 Theory, Applications, and Implementations 42
1.4 DSP Applications 44
1.4.1 Real-Time Concept 44
1.4.2 Communication Systems 44
1.4.3 Multimedia Signal Processing Systems 46
1.4.4 Review on Applications 50
1.5 DSP Implementations 51
1.5.1 DSP Implementation on GPP 52
1.5.2 DSP Implementation on GP DSP Processors 52
1.5.3 DSP Implementation on ASIP 53
1.5.4 DSP Implementation on ASIC 53
1.5.5 Trade-off and Decision of Implementations 55
1.6 Review of Processors and Systems 56
1.6.1 DSP Processor Architecture 56
1.6.2 DSP Firmware 57
1.6.3 Embedded System Overview 59
1.6.4 DSP in an Embedded System 61
1.6.5 Fundamentals of Embedded Computing 62
1.7 Design Flow 63
1.7.1 Hardware Design Flow in General 63
1.7.2 ASIP Hardware Design Flow 65
1.7.3 ASIP Design Automation 67
1.8 Conclusions 70
Exercises 71
References 72
Chapter 2. Numerical Representation and Finite-Length DSP 74
2.1 Fixed-Point Numerical Representation 74
2.1.1 An Intuitive Example 75
2.1.2 Fixed-Point Numerical Representation 77
2.1.3 Fixed-Point Binary Representation 78
2.1.4 Integer Binary Representation 79
2.1.5 Fractional Binary Representation 80
2.1.6 Fixed-Point Operands 81
2.1.7 Integer or Fractional 82
2.1.8 Other Binary Data Formats 90
2.2 Data Quality Measure 92
2.2.1 Noise, Distortion, Dynamic Range, and Precision 92
2.2.2 Quantitative Concept of Dynamic Range and Precision 95
2.3 Floating-Point Numerical Representation 96
2.4 Block Floating-Point 100
2.5 DSP Based on Finite Precision 103
2.5.1 The Way of Quantization—Rounding and Truncation 103
2.5.2 Overflow Saturation and Guards 105
2.5.3 Requirements on Guards 108
2.5.4 Execution Order 109
2.6 Examples of Corner Cases 109
2.7 Conclusions 110
Exercises 111
References 112
Chapter 3. DSP Architectures 114
3.1 DSP Subsystem Architecture 114
3.2 Processor Architecture 115
3.2.1 Inside a DSP Subsystem 116
3.2.2 DSP (Memory Bus) Architecture 118
3.2.3 Functional Description at Top Architecture Level 122
3.2.4 DSP Architecture Design 124
3.3 Inside a DSP Core 128
3.3.1 The Datapath and Register Bus 128
3.3.2 MAC 128
3.3.3 ALU 130
3.3.4 Register File 131
3.3.5 Control Path 132
3.3.6 Address Generator (AGU) 135
3.4 The Difference between GPP and ASIP DSP 136
3.4.1 The Difference between Designing a GPP and ASIP DSP 136
3.4.2 Comparing DSP Processors to Other Processors 137
3.4.3 CISC or RISC 140
3.5 Advanced DSP Architecture 143
3.5.1 DSP with Extreme Specification 143
3.5.2 ILP DSP Processors 147
3.5.3 Dual MAC and SIMD 149
3.5.4 VLIW and Superscalar 155
3.5.5 On-Chip Multicore DSP 172
3.6 Conclusions 180
Exercises 181
References 184
Chapter 4. DSP ASIP Design Flow 186
4.1 Design and Use of ASIP 186
4.1.1 What Is ASIP? 186
4.1.2 DSP ASIP Design Flow 187
4.2 Understanding Applications through Profiling 189
4.3 Architecture Selection 190
4.3.1 General Methodology 190
4.3.2 Architectures 195
4.3.3 Quantitative Approach 199
4.4 Designing Instruction Sets 200
4.5 Designing the Toolchain 201
4.6 Microarchitecture Design 205
4.7 Firmware Design 206
4.7.1 Real-time Firmware 207
4.7.2 Firmware with Finite Precision 208
4.7.3 Firmware Design Flow for One Application 208
4.7.4 Firmware Design Flow for Multiapplications 210
4.8 Conclusions 211
Exercises 211
References 212
Chapter 5. A Simple DSP Core—The Junior Processor 214
5.1 Junior—A Simple DSP Processor 214
5.2 Instruction Set and Operations 215
5.2.1 Load / Store Instructions 215
5.2.2 Addressing for Data Memory Access 217
5.2.3 Instructions for Basic Arithmetic Operations 217
5.2.4 Logic and Shift Operations 218
5.2.5 Program Flow Control Instructions 219
5.3 Assembly Coding 221
5.4 Assembly Benchmarking 224
5.4.1 Benchmarking of Block Transfer 226
5.4.2 Benchmarking of Single-Sample FIR 226
5.4.3 Benchmarking of Frame FIR 228
5.4.4 Benchmarking of Single-Sample Biquad IIR 231
5.4.5 Benchmarking of 16-bit Division 232
5.4.6 Benchmarking of Vector Maximum Tracking 233
5.4.7 Benchmarking of 8×8 DCT 234
5.4.8 Benchmarking of 256-point FFT 237
5.4.9 Benchmarking of Windowing 238
5.5 Discussion of Junior DSP 239
5.6 Conclusions 241
Exercises 242
References 242
Chapter 6. Code Profiling for ASIP Design 244
6.1 Source Code Profiling 244
6.1.1 What Is Source Code Profiling? 245
6.1.2 Why Profiling? 247
6.1.3 What to Profile 248
6.1.4 How to Profile 251
6.1.5 The Language to Profile 252
6.2 Static Profiling 253
6.2.1 Dynamic and Static Profiling 253
6.2.2 Static Profiling 253
6.2.3 Fine-grained Static Profiling 254
6.2.4 Coarse-grained Static Profiling 256
6.3 Dynamic Profiling 258
6.3.1 Instrumentation for Coarse-grained Profiling 258
6.3.2 Instrumentation for Fine-grained Profiling 258
6.3.3 Implement Instrumentation 259
6.4 Use of Reference Assembly Codes 261
6.4.1 Expose Hidden Costs 261
6.4.2 Understanding Assembly Codes 262
6.5 Quality Evaluation of Results 263
6.5.1 Evaluating Results of Source Code Profiling 263
6.5.2 Using Profiling Results 263
6.6 Conclusions 264
Exercises 264
References 264
Chapter 7. Assembly Instruction Set Design 266
7.1 Methodology 266
7.1.1 Opportunities and Constraints 266
7.1.2 Classification of General Instructions 271
7.1.3 Design of General RISC Subset Instructions 272
7.1.4 Specify CISC Instructions 275
7.1.5 For Undergraduates: From Junior to Senior 276
7.2 Designing RISC Subset Instructions 277
7.2.1 Data Access Instructions 277
7.2.2 Basic Arithmetic Instructions 283
7.2.3 Unsigned ALU Instructions 291
7.2.4 Program Flow Control Instructions 292
7.3 CISC Subset Instructions 298
7.3.1 MAC and Multiplication Instructions 298
7.3.2 Double-Precision Arithmetic Instructions 301
7.3.3 Other CISC Instructions 304
7.4 Accelerated Extensions 304
7.4.1 Challenges 304
7.4.2 Methodology 305
7.5 Instructions for Instruction Level Parallel (ILP) Architecture 307
7.5.1 Superscalar 307
7.5.2 VLIW Instructions 307
7.5.3 SIMD Instructions 309
7.6 Memory and Register Addressing 313
7.6.1 Register Addressing 314
7.6.2 Data Memory Addressing 317
7.6.3 Hardware Accelerated Memory Addressing 322
7.7 Coding 328
7.7.1 Assembly Encoding 328
7.7.2 Machine Code Coding 331
7.7.3 Examples 333
7.8 Conclusions 336
Exercises 337
References 339
Chapter 8. Software Development Toolchain 342
8.1 What Is Toolchain and IDE? 342
8.1.1 ASIP User’s View on IDE 343
8.1.2 ASIP Designer’s View on IDE 344
8.2 Code Analysis 345
8.2.1 Lexical Analysis 346
8.2.2 Syntax Analysis 346
8.2.3 Semantic Analysis 350
8.3 Profiler and WCET Analyzer 351
8.4 Compiler Overview 353
8.4.1 Intermediate Code Generation 353
8.4.2 Code Optimization 355
8.4.3 Code Generation 359
8.4.4 Error Handler 361
8.4.5 Compiler Generator and Verification of a Generated Compiler 362
8.5 Assembler 362
8.6 Linker 364
8.7 Simulator and Debugger Basics 366
8.7.1 Instruction Set Simulator (ISS) 368
8.7.2 Processor Simulator 376
8.7.3 Architecture Simulator 377
8.8 Debugger and GUI 377
8.8.1 Debugger 377
8.8.2 SW Debugging 378
8.8.3 GUI 379
8.9 Evaluation of Programming Tools 380
8.10 Conclusions 381
Exercises 381
References 382
Chapter 9. Evaluation of an Instruction Set 384
9.1 Benchmarking 384
9.1.1 Benchmarking DSP Kernel Algorithms 387
9.1.2 Some Benchmarking Examples 392
9.2 Instruction Use Profiling 392
9.3 Coverage Analysis 393
9.4 Conclusions 393
References 394
Chapter 10. Design of DSP Microarchitecture 396
10.1 Introduction to Microarchitecture 396
10.1.1 Microarchitecture versus Architecture 396
10.1.2 Microarchitecture Design 397
10.2 Microarchitecture-level Components 397
10.2.1 Basic Logic Components 398
10.2.2 Arithmetic Components 400
10.3 Hardware Design Fundamentals 401
10.3.1 Function Partitioning 401
10.3.2 Function Allocation 402
10.3.3 HW Multiplexing 403
10.3.4 Scheduling of Hardware Execution 406
10.3.5 Modeling and Simulation 408
10.4 Functional Specification at Microarchitecture Level 408
10.4.1 Intermodule Block Diagram 408
10.4.2 Microarchitecture Schematic 409
10.4.3 Module Functional Flowchart 409
10.4.4 Finite State Machine 414
10.4.5 Truth Table for Coding and Decoding 416
10.5 ASIP Microarchitecture Design Flow 417
10.5.1 Exposing Microoperations 418
10.5.2 Allocation and Partitioning of Microoperations 418
10.5.3 Pipeline Scheduling Microoperations 420
10.5.4 HW Multiplexing of Microoperations 420
10.5.5 Microoperations Integration 421
10.6 Conclusions 423
Exercises 423
References 424
Chapter 11. Design of Register File and Register Bus 426
11.1 Datapath 426
11.2 Design of Register Files 427
11.2.1 General Register File 427
11.2.2 Design of a Simple Register File 428
11.2.3 Pipeline around Register File 430
11.2.4 Special Registers in a General Register File 431
11.3 Design of Advanced Register Files 433
11.3.1 Register File for Cluster Datapath 433
11.3.2 Ultra Large Register File 435
11.4 Conclusions 437
Exercises 437
References 438
Chapter 12. ALU HW Implementation 440
12.1 Arithmetic and Logic Unit (ALU) 440
12.2 Design of Arithmetic Unit (AU) 442
12.2.1 Implementation Methodology 442
12.2.2 Select Kernel Components 443
12.2.3 Implementing Simple AU Instructions 445
12.2.4 Implementing Special AU Instructions 450
12.3 Shift and Rotation 453
12.3.1 Design a Shifter Using a Shifter Primitive 454
12.3.2 Design a Shifter Using Truth Tables 457
12.3.3 Logic Operation and Data Manipulation 457
12.4 ALU Integration 460
12.4.1 Preprocessing and Postprocessing 460
12.4.2 ALU Integration 460
12.5 Conclusions 461
Exercises 462
References 465
Chapter 13. MAC Hardware Implementation 466
13.1 Introduction 466
13.1.1 Review of Convolution 466
13.1.2 MAC Fundamentals 467
13.2 MAC Implementation 469
13.2.1 MAC Instructions 469
13.2.2 Implementing Multiplications 469
13.2.3 Implementing MAC Instructions 473
13.2.4 Implementing Double-Precision Instructions 476
13.2.5 Accessing ACR Context 478
13.2.6 Flag Operations and Other Postoperations 482
13.3 A MAC Design Case 483
13.4 MAC Integrations 492
13.4.1 Physical Critical-Path 492
13.4.2 Pipeline in a MAC 493
13.5 Dual MAC, Multiple MAC, and VLIW 495
13.6 Conclusions 497
Exercises 498
References 501
Chapter 14. Control Path Design 502
14.1 Control Paths 502
14.2 Control Path Organization 503
14.2.1 Pipeline Consideration 505
14.2.2 Interrupt Management 510
14.3 Control Path Hardware Design 513
14.3.1 Top-level Structure 513
14.3.2 Design of Program Memory and Peripherals 515
14.3.3 Loading Code 516
14.3.4 Instruction Flow Controller 518
14.3.5 Loop Controller 521
14.3.6 PC Stack 523
14.3.7 Senior PC FSM Example 526
14.4 Instruction Decoder 529
14.4.1 Control Signal Decoding 530
14.4.2 Decoding Order 532
14.4.3 Decoding for Exception, Interrupt, Jump, and Conditional Execution 532
14.4.4 Issues of Multicycle Execution 533
14.4.5 VLIW Machine Decoding 535
14.4.6 Decoding for Superscalar 536
14.5 Conclusions 537
Exercises 537
References 539
Chapter 15. Design of Memory Subsystems 540
15.1 Memory and Peripherals 540
15.1.1 Memory Modules 540
15.1.2 Memory Peripheral Circuits 544
15.2 Design of Memory Addressing Circuitry 551
15.2.1 General Addressing Circuit 551
15.2.2 Modulo Addressing Circuit 554
15.3 Buses 558
15.4 Memory Hierarchy 559
15.4.1 Problems 559
15.4.2 Memory Hierarchy of DSP Processors 560
15.5 DMA 562
15.5.1 DMA Concepts 562
15.5.2 Configuring a Program for a DMA Task 566
15.5.3 A SoC View 570
15.6 Conclusions 570
Exercises 570
References 572
Chapter 16. DSP Core Peripherals 574
16.1 Peripherals 574
16.2 Design a Peripheral Module 576
16.2.1 Design of a Common Interface in Peripheral Modules 577
16.2.2 Protocol Design of Peripheral Modules 581
16.3 Interrupt Handler 582
16.3.1 Interrupt Basics 582
16.3.2 Interrupt Sources 582
16.3.3 Interrupt Requests 584
16.3.4 Interrupt Handling Process 585
16.3.5 A Case Study 588
16.4 Timers 594
16.5 Direct Memory Access (DMA) 597
16.5.1 DMA Basics 597
16.5.2 Design a Simple DMA 600
16.5.3 Advanced DMA Controller 608
16.5.4 DMA Benchmarking 616
16.6 Serial Ports 616
16.6.1 Bit Synchronization 616
16.6.2 Packet Synchronization 619
16.6.3 Arbitration 620
16.6.4 Control of a Serial Port 621
16.7 Parallel Ports 621
16.8 Conclusions 621
Exercises 622
References 623
Chapter 17. Design for DSP Functional Acceleration 624
17.1 Functional Acceleration 624
17.1.1 Loosely Connected Accelerator 625
17.1.2 Tightly Connected Accelerator 626
17.2 Accelerator Specification 628
17.2.1 Principle 628
17.2.2 An Accelerator with One Single Instruction 628
17.2.3 An Accelerator with Multiple Instructions 629
17.2.4 An Accelerator as a Slave Processor 630
17.3 Scalable Processor and Accelerator Interface 631
17.3.1 Configurability and Extendibility 631
17.3.2 Extendible Hardware Interface 635
17.3.3 Extendible Programmer Tools 638
17.4 Accelerator Design Flow 643
17.5 Conclusions 643
Exercises 644
References 645
Chapter 18. Real-time Fixed-point DSP Firmware 646
18.1 Firmware (FW) 646
18.2 Application Modeling under HW Constraints 647
18.2.1 Understanding Applications 647
18.2.2 Understanding Hardware 651
18.2.3 Algorithm Selection 653
18.2.4 Language Selection 660
18.2.5 Real-time Firmware Implementation 662
18.2.6 Firmware for Fixed-point Data 665
18.3 Assembly Implementation 673
18.3.1 General Flow and C-Compiling 673
18.3.2 Plan and Specify for Assembly Coding 674
18.3.3 Fixed-point Assembly Kernels 675
18.3.4 Low Cycle Cost Assembly Coding 676
18.3.5 Storage Efficient Assembly Kernels 679
18.3.6 Function Libraries 683
18.3.7 Optimize Control Codes 685
18.4 Assembly-level Integration and Release 686
18.5 Conclusions 688
References 688
Chapter 19. ASIP Integration and Verification 690
19.1 Integration 690
19.1.1 HW Integration of an ASIP Core 692
19.1.2 Integration of a DSP Subsystem and a DSP Processor 695
19.1.3 HW Integration of a SoC 702
19.1.4 Integration of SoC Simulator 712
19.2 Functional Verification 713
19.2.1 The Basics 713
19.2.2 Verification Process 716
19.2.3 Verification Techniques 718
19.2.4 Speed-up Verification 724
19.2.5 Simulation or Emulation 726
19.2.6 Verification of an ASIP 727
19.2.7 Writing Testbench 727
19.3 Conclusions 728
Exercises 730
References 730
Chapter 20. Parallel Streaming Signal Processing 732
20.1 Streaming DSP 732
20.1.1 Streaming Signals 732
20.1.2 Parallel Streaming DSP Processors 732
20.2 Parallel Architecture, Divide and Conquer 734
20.2.1 Review of Parallel Architectures 734
20.2.2 Divide and Conquer 737
20.3 Expose Control Complexities 739
20.3.1 General Control Handling 739
20.3.2 Exposing Challenges 740
20.3.3 SIMT Architecture for Low-level Parallel Applications 743
20.3.4 Design of Multicore DSP Subsystems 748
20.4 Streaming Data Manipulations 753
20.4.1 Data Complexity of Streaming DSP 753
20.4.2 Data Complexity: Case 1—Video 753
20.4.3 Data Complexity: Case 2—Radio Baseband 759
20.5 NoC for Parallel Memory Access 762
20.5.1 Design Methods 762
20.5.2 Analyses of Parallel Memory Access for NoC Design 763
20.6 Parallel Memory Architecture 766
20.6.1 Requirements for Parallel Algorithms 766
20.6.2 Cache 767
20.6.3 Ultra-large Register File 770
20.7 P3RMA for Streaming DSP Processors 771
20.7.1 Parallel Vector (Scratchpad) Memories 772
20.7.2 The Memory Subsystem Hardware 774
20.7.3 Parallel Programming by Hand 775
20.7.4 Programming Toolchain for P3RMA 781
20.8 Conclusions 784
References 785
Glossary 788
Appendix A. Senior Assembly Instruction Set Manual 796
Index 798
Publication date (per publisher) | 9 July 2008 |
---|---|
Language | English |
Subject area | Non-fiction / Guides |
Mathematics / Computer Science ► Computer Science ► Theory / Studies | |
Natural Sciences ► Physics / Astronomy ► Electrodynamics | |
Engineering ► Electrical Engineering / Power Engineering | |
Engineering ► Communications Engineering | |
ISBN-10 | 0-08-056987-0 / 0080569870 |
ISBN-13 | 978-0-08-056987-1 / 9780080569871 |
Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook from misuse. The eBook is authorized to your personal Adobe ID at download time. You can then read the eBook only on devices that are also registered to your Adobe ID.
File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables, and figures. A PDF can be displayed on almost any device, but is of only limited use on small displays (smartphone, eReader).
System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a …
eReader: This eBook can be read on (almost) all eBook readers. It is not, however, compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a …
Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.