Introduction to Parallel Programming
Cambridge University Press (Publisher)
978-1-009-06953-3 (ISBN)
In modern computer science there is no truly sequential computing system, and most advanced programming is parallel programming. This is particularly evident in modern application domains such as scientific computation, data science and machine intelligence. This lucid introductory textbook will be invaluable to students of computer science and technology, acting as a self-contained primer to parallel programming. It takes the reader from introduction to expertise, addressing a broad gamut of issues: it covers different parallel programming styles, describes parallel architecture, introduces parallel programming frameworks and techniques, presents algorithmic and analysis techniques, and discusses parallel design and performance issues. With its broad coverage, the book can be used in a wide range of courses and can also serve as a ready reckoner for professionals in the field.
Dr Subodh Kumar is Professor in the Department of Computer Science and Engineering at the Indian Institute of Technology, Delhi, an institution he has been associated with since 2007. During this time, he has headed the institute's High Performance Computing group and taught several courses on computer graphics, data structures and algorithms, design practices in computer science, and parallel programming. Previously, he was an Assistant Professor at Johns Hopkins University. His research interests include rendering algorithms, virtual reality, geometry processing, human-machine interfaces, visualization, large-scale parallel computation and HPC.
List of Figures
Introduction; Concurrency and Parallelism; Why Study Parallel Programming; What is in this Book
1. An Introduction to Parallel Computer Architecture; 1.1 Parallel Organization; SISD: Single Instruction, Single Data; SIMD: Single Instruction, Multiple Data; MIMD: Multiple Instruction, Multiple Data; MISD: Multiple Instruction, Single Data; 1.2 System Architecture; 1.3 CPU Architecture; 1.4 Memory and Cache; 1.5 GPU Architecture; 1.6 Interconnect Architecture; Routing; Links; Types and Quality of Networks; Torus Network; Hypercube Network; Cross-Bar Network; Shuffle-Exchange Network; Clos Network; Tree Network; Network Comparison; 1.7 Summary
2. Parallel Programming Models; 2.1 Distributed-Memory Programming Model; 2.2 Shared-Memory Programming Model; 2.3 Task Graph Model; 2.4 Variants of Task Parallelism; 2.5 Summary
3. Parallel Performance Analysis; 3.1 Simple Parallel Model; 3.2 Bulk-Synchronous Parallel Model; BSP Computation Time; BSP Example; 3.3 PRAM Model; PRAM Computation Time; PRAM Example; 3.4 Parallel Performance Evaluation; Latency and Throughput; Speed-up; Cost; Efficiency; Scalability; Iso-efficiency; 3.5 Parallel Work; Brent's Work-Time Scheduling Principle; 3.6 Amdahl's Law; 3.7 Gustafson's Law; 3.8 Karp–Flatt Metric; 3.9 Summary
4. Synchronization and Communication Primitives; 4.1 Threads and Processes; 4.2 Race Condition and Consistency of State; Sequential Consistency; Causal Consistency; FIFO and Processor Consistency; Weak Consistency; Linearizability; 4.3 Synchronization; Synchronization Condition; Protocol Control; Progress; Synchronization Hazards; 4.4 Mutual Exclusion; Lock; Peterson's Algorithm; Bakery Algorithm; Compare and Swap; Transactional Memory; Barrier and Consensus; 4.5 Communication; Point-to-Point Communication; RPC; Collective Communication; 4.6 Summary
5. Parallel Program Design; 5.1 Design Steps; Granularity; Communication; Synchronization; Load Balance; 5.2 Task Decomposition; Domain Decomposition; Functional Decomposition; Task Graph Metrics; 5.3 Task Execution; Preliminary Task Mapping; Task Scheduling Framework; Centralized Push Scheduling Strategy; Distributed Push Scheduling; Pull Scheduling; 5.4 Input/Output; 5.5 Debugging and Profiling; 5.6 Summary
6. Middleware: The Practice of Parallel Programming; 6.1 OpenMP; Preliminaries; OpenMP Thread Creation; OpenMP Memory Model; OpenMP Reduction; OpenMP Synchronization; Sharing a Loop's Work; Other Work-Sharing Pragmas; SIMD Pragma; Tasks; 6.2 MPI; MPI Send and Receive; Message-Passing Synchronization; MPI Data Types; MPI Collective Communication; MPI Barrier; MPI Reduction; One-Sided Communication; MPI File IO; MPI Groups and Communicators; MPI Dynamic Parallelism; MPI Process Topology; 6.3 Chapel; Partitioned Global Address Space; Chapel Tasks; Chapel Variable Scope; 6.4 Map-Reduce; Parallel Implementation; Hadoop; 6.5 GPU Programming; OpenMP GPU Off-Load; Data and Function on Device; Thread Blocks in OpenMP; CUDA; CUDA Programming Model; CPU–GPU Memory Transfer; Concurrent Kernels; CUDA Synchronization; CUDA Shared Memory; CUDA Parallel Memory Access; False Sharing; 6.6 Summary
7. Parallel Algorithms and Techniques; 7.1 Divide and Conquer: Prefix-Sum; Parallel Prefix-Sum: Method 1; Parallel Prefix-Sum: Method 2; Parallel Prefix-Sum: Method 3; 7.2 Divide and Conquer: Merge Two Sorted Lists; Parallel Merge: Method 1; Parallel Merge: Method 2; Parallel Merge: Method 3; Parallel Merge: Method 4; 7.3 Accelerated Cascading: Find Minima; 7.4 Recursive Doubling: List Ranking; 7.5 Recursive Doubling: Euler Tour; 7.6 Recursive Doubling: Connected Components; 7.7 Pipelining: Merge-Sort; Basic Merge-Sort; Pipelined Merges; 4-Cover Property Analysis; Merge Operation per Tick; 7.8 Application of Prefix-Sum: Radix-Sort; 7.9 Exploiting Parallelism: Quick-Sort; 7.10 Fixing Processor Count: Sample-Sort; 7.11 Exploiting Pa
Publication date | 23.02.2023
Additional information | Worked examples or exercises
Place of publication | Cambridge
Language | English
Dimensions | 185 x 241 mm
Weight | 500 g
Subject areas | Mathematics / Computer Science ► Computer Science ► Theory / Study
| Mathematics / Computer Science ► Mathematics ► Applied Mathematics
ISBN-10 | 1-009-06953-5 / 1009069535
ISBN-13 | 978-1-009-06953-3 / 9781009069533
Condition | New