Moving Hadoop to the Cloud

Harnessing Cloud Features and Flexibility for Hadoop Clusters

Bill Havanki (Author)

Book | Softcover

338 pages

2017
O'Reilly Media (Publisher)
978-1-4919-5963-3 (ISBN)

Until recently, Hadoop deployments existed on hardware owned and run by organizations. Now, of course, you can acquire the computing resources and network connectivity to run Hadoop clusters in the cloud. But there’s a lot more to deploying Hadoop to the public cloud than simply renting machines.

This hands-on guide shows developers and systems administrators familiar with Hadoop how to install, use, and manage cloud-born clusters efficiently. You’ll learn how to architect clusters that work with cloud-provider features—not just to avoid pitfalls, but also to take full advantage of these services. You’ll also compare the Amazon, Google, and Microsoft clouds, and learn how to set up clusters in each of them.

Learn how Hadoop clusters run in the cloud, the problems they can help you solve, and their potential drawbacks
Examine the common concepts of cloud providers, including compute capabilities, networking and security, and storage
Build a functional Hadoop cluster on cloud infrastructure, and learn what the major providers require
Explore use cases for high availability, relational data with Hive, and complex analytics with Spark
Get patterns and practices for running cloud clusters, from designing for price and security to dealing with maintenance

Bill Havanki is a software engineer working for Cloudera, where he has contributed to Hadoop components as well as systems for deploying Hadoop clusters into public cloud services. Prior to joining Cloudera, he worked for 15 years developing software for government contracts, focusing mostly on analytic frameworks and on authentication and authorization systems. He earned his B.S. in Electrical Engineering from Rutgers University and his M.S. in Computer Engineering from North Carolina State University. A New Jersey native, he currently lives near Annapolis, Maryland, with his family.

Chapter 1 Why Hadoop in the Cloud?
Chapter 2 Overview and Comparison of Cloud Providers
Chapter 3 Instances
Chapter 4 Networking and Security
Chapter 5 Storage
Chapter 6 Setting Up in AWS
Chapter 7 Setting Up in GCP
Chapter 8 Setting Up in Azure
Chapter 9 Standing Up a Cluster
Chapter 10 High Availability
Chapter 11 Relational Data with Apache Hive
Chapter 12 Complex Analytics with Spark
Chapter 13 Pricing and Performance
Chapter 14 Network Topologies
Chapter 15 Patterns for Managing Clusters
Chapter 16 Backup and Restoration

Publication date	21 July 2017
Place of publication	Sebastopol
Language	English
Dimensions	186 x 231 mm
Weight	590 g
Binding	Paperback
Subject area	Mathematics / Computer Science ► Computer Science ► Databases
	Mathematics / Computer Science ► Computer Science ► Networks
	Computer Science ► Software Development ► SOA / Web Services
Keywords	Amazon Web Services • AWS • Cloud and Cluster Computing • Hadoop Clusters • Hive • Spark
ISBN-10	1-4919-5963-0 / 1491959630
ISBN-13	978-1-4919-5963-3 / 9781491959633
Condition	New