Hadoop Operations - Eric Sammer

Hadoop Operations

Eric Sammer (Author)

Book | Softcover
298 pages
2012
O'Reilly Media, Inc, USA (Publisher)
978-1-4493-2705-7 (ISBN)
44.85 incl. VAT
If you’ve been asked to maintain large and complex Hadoop clusters, this book is a must. Demand for operations-specific material has skyrocketed now that Hadoop is becoming the de facto standard for truly large-scale data processing in the data center. Eric Sammer, Principal Solution Architect at Cloudera, shows you the particulars of running Hadoop in production, from planning, installing, and configuring the system to providing ongoing maintenance.

Rather than run through all possible scenarios, this pragmatic operations guide calls out what works, as demonstrated in critical deployments.
  • Get a high-level overview of HDFS and MapReduce: why they exist and how they work
  • Plan a Hadoop deployment, from hardware and OS selection to network requirements
  • Learn setup and configuration details with a list of critical properties
  • Manage resources by sharing a cluster across multiple groups
  • Get a runbook of the most common cluster maintenance tasks
  • Monitor Hadoop clusters—and learn troubleshooting with the help of real-world war stories
  • Use basic tools and techniques to handle backup and catastrophic failure

If you've been tasked with maintaining large and complex Hadoop clusters, or are about to be, this book is a must. You'll learn the particulars of Hadoop operations, from planning, installing, and configuring the system to providing ongoing maintenance. Hadoop is being adopted by more and more Fortune 500 companies, and the demand for operations-specific material has skyrocketed. This book, written by Eric Sammer, Principal Solution Architect at Cloudera, is the definitive operations guide for administrators. Developers who want to improve MapReduce jobs by learning how Hadoop works in large production environments will also benefit. Application administrators responsible for the health and operation of large distributed applications or systems will find this guide extremely useful.

Eric Sammer is currently a Principal Solution Architect at Cloudera where he helps customers plan, deploy, develop for, and use Hadoop and the related projects at scale. His background is in the development and operations of distributed, highly concurrent, data ingest and processing systems. He's been involved in the open source community and has contributed to a large number of projects over the last decade.

Chapter 1 Introduction
Chapter 2 HDFS
  Goals and Motivation
  Design
  Daemons
  Reading and Writing Data
  Managing Filesystem Metadata
  Namenode High Availability
  Namenode Federation
  Access and Integration
Chapter 3 MapReduce
  The Stages of MapReduce
  Introducing Hadoop MapReduce
  YARN
Chapter 4 Planning a Hadoop Cluster
  Picking a Distribution and Version of Hadoop
  Hardware Selection
  Operating System Selection and Preparation
  Kernel Tuning
  Disk Configuration
  Network Design
Chapter 5 Installation and Configuration
  Installing Hadoop
  Configuration: An Overview
  Environment Variables and Shell Scripts
  Logging Configuration
  HDFS
  Namenode High Availability
  Namenode Federation
  MapReduce
  Rack Topology
  Security
Chapter 6 Identity, Authentication, and Authorization
  Identity
  Kerberos and Hadoop
  Authorization
  Tying It Together
Chapter 7 Resource Management
  What Is Resource Management?
  HDFS Quotas
  MapReduce Schedulers
Chapter 8 Cluster Maintenance
  Managing Hadoop Processes
  HDFS Maintenance Tasks
  MapReduce Maintenance Tasks
Chapter 9 Troubleshooting
  Differential Diagnosis Applied to Systems
  Common Failures and Problems
  “Is the Computer Plugged In?”
  Treatment and Care
  War Stories
Chapter 10 Monitoring
  An Overview
  Hadoop Metrics
  Health Monitoring
Chapter 11 Backup and Recovery
  Data Backup
  Namenode Metadata
Appendix Deprecated Configuration Properties
Colophon

Publication date (per publisher) 13 November 2012
Additional info illustrations
Place of publication Sebastopol
Language English
Dimensions 178 x 233 mm
Weight 467 g
Subject area Mathematics / Computer Science > Computer Science > Databases
Mathematics / Computer Science > Computer Science > Networks
Mathematics / Computer Science > Computer Science > Web / Internet
ISBN-10 1-4493-2705-2 / 1449327052
ISBN-13 978-1-4493-2705-7 / 9781449327057
Condition New