High Performance Computing in Clouds
Springer International Publishing (publisher)
978-3-031-29771-7 (ISBN)
This book provides a thorough explanation of the path to using cloud computing technologies to run High-Performance Computing (HPC) applications. Besides presenting the motivation behind moving HPC applications to the cloud, it covers both essential and advanced issues on this topic, such as deploying HPC applications and infrastructures, designing cloud-friendly HPC applications, and optimizing a provisioned cloud infrastructure to run this family of applications. Additionally, this book describes the best practices to keep HPC applications running in the cloud by employing fault-tolerance techniques and avoiding resource wastage.
To give practical meaning to the topics covered, the book presents case studies in which HPC applications used in relevant scientific areas, such as Bioinformatics and the Oil and Gas industry, were moved to the cloud. Moreover, it also discusses how to train deep learning models in the cloud, elucidating the key components and aspects necessary to train these models via the different types of services offered by cloud providers.
Despite the vast bibliography on cloud computing and HPC, to the best of our knowledge no existing manuscript has comprehensively covered these topics and discussed the steps, methods, and strategies to execute HPC applications in clouds. Therefore, we believe this title is useful for IT professionals, as well as for students and researchers interested in cutting-edge technologies, concepts, and insights on the use of cloud technologies to run HPC applications.
Edson Borin: Prof. Edson Borin is an associate professor at the Institute of Computing at the University of Campinas (Unicamp) and has been working there since 2010. Prior to joining Unicamp, he was a researcher at Intel Labs in California, where he developed dynamic compilation techniques to improve next-generation HW/SW co-designed microprocessors. He also used the microcode compression algorithms he had developed in his PhD thesis to enhance the manufacturing process of Intel microprocessors, earning four divisional recognition awards. At Unicamp, Prof. Borin applies his expertise in modern computer architecture and compilers to optimize the performance and cost of scientific and engineering computing. He leads the Discovery laboratory, which is supported by government agencies such as Fapesp, CNPq, and Capes, international technology companies like Intel, AMD, Samsung, Motorola, and Cadence/Tensilica, and major Brazilian corporations such as Petrobras. Several of his research works have been particularly geared towards optimizing the execution of seismic-processing and deep-learning applications on cloud infrastructure. In addition to his research contributions, Prof. Borin has authored eight patents, a technical book on assembly programming, and over 100 papers in international conferences and journals. He has supervised over 22 doctoral and master's students, many of whom have received recognition for their exceptional theses, dissertations, and papers.
Lúcia Maria A. Drummond: Prof. Lucia Drummond obtained her D.Sc. in Systems Engineering and Computer Science from the Federal University of Rio de Janeiro, Brazil, in 1994, where she was part of the group that developed the first Brazilian parallel computer. She has been in the Department of Computer Science of the Fluminense Federal University (UFF) since 1989, where she is now Full Professor. She currently teaches in undergraduate and graduate programs, advising a number of master's and doctoral students. She is a Level 1 Researcher at CNPq (a Brazilian research agency), with more than 100 publications in journals and in the proceedings of national and international conferences. Her research interests are parallel and distributed computing, including theory and applications. She has been invited to give talks at Université Paris-Sud, École des Mines, Université d'Avignon et des Pays du Vaucluse, and Université Sorbonne, France, where she has also co-advised Ph.D. students.

Jean-Luc Gaudiot: Prof. Jean-Luc Gaudiot received the Diplôme d'Ingénieur from the École Supérieure d'Ingénieurs en Electronique et Electrotechnique, Paris, France, in 1976 and the M.S. and Ph.D. degrees in Computer Science from the University of California, Los Angeles in 1977 and 1982, respectively. He is currently Distinguished Professor in the Electrical Engineering and Computer Science Department at the University of California, Irvine, where he was department Chair from 2003 to 2009. Prior to joining UCI in January 2002, he was a Professor of Electrical Engineering at the University of Southern California from 1982, where he served as Director of the Computer Engineering Division for three years. He has also designed distributed microprocessor systems at Teledyne Controls, Santa Monica, California (1979-1980) and performed research in innovative architectures at the TRW Technology Research Center, El Segundo, California (1980-1982).
He frequently acts as a consultant to companies that design high-performance computer architectures and has served as an expert witness in patent infringement and product liability cases. His research interests include programmability of parallel systems, hardware computer security, and the design of autonomous driving systems. He has published nearly 300 journal and conference papers. His research has been sponsored by NSF, DoE, and DARPA, as well as a number of industrial organizations. From 2006 to 2009, he was the first Editor-in-Chief of the IE…

Contents:
Chapter 1. Why move HPC applications to the Cloud?
Part I. Foundations
Chapter 2. What is Cloud Computing?
Chapter 3. What do HPC applications look like?
Part II. Running HPC Applications in Cloud
Chapter 4. Deploying and Configuring Infrastructure
Chapter 5. Executing Traditional HPC Application Code in Cloud with Containerized Job Schedulers
Chapter 6. Designing Cloud-friendly HPC Applications
Chapter 7. Exploiting Hardware Accelerators in Clouds
Part III. Cost and Performance Optimizations
Chapter 8. Optimizing Infrastructure for MPI Applications
Chapter 9. Harnessing Low-Cost Virtual Machines on the Spot
Chapter 10. Ensuring Application Continuity with Fault Tolerance Techniques
Chapter 11. Avoiding Resource Wastage
Part IV. Application Study Cases
Chapter 12. Biological Sequence Comparison on Cloud-based GPU Environment
Chapter 13. Oil & Gas Reservoir Simulation in the Cloud
Chapter 14. Cost-effective Deep Learning on the Cloud
Appendix A. Deploying an HPC Cluster on AWS
Appendix B. Configuring a Cloud-deployed HPC Cluster
Publication date | 07.07.2024 |
---|---|
Additional info | XV, 334 p. |
Place of publication | Cham |
Language | English |
Dimensions | 155 x 235 mm |
Subject area | Mathematics / Computer Science ► Computer Science ► Databases |
Mathematics / Computer Science ► Computer Science ► Networks | |
Computer Science ► Theory / Studies ► Artificial Intelligence / Robotics | |
Keywords | Big Data • Bioinformatics • Cloud Computing • Cloud execution cost optimization • Cloud-friendly Applications • Deep learning • Efficient cloud computing • Elastic Applications • High-performance cloud computing • High Performance Computing • MPI Applications • oil and gas industry • Parallel and Distributed Computing • Preemptible Virtual Machines • Scientific Computing |
ISBN-10 | 3-031-29771-7 / 3031297717 |
ISBN-13 | 978-3-031-29771-7 / 9783031297717 |
Condition | New |