Practical Hadoop Migration (eBook)

How to Integrate Your RDBMS with the Hadoop Ecosystem and Re-Architect Relational Applications to NoSQL
eBook Download: PDF
2016 | 1st ed.
XXIV, 305 pages
Apress (publisher)
978-1-4842-1287-5 (ISBN)

Practical Hadoop Migration - Bhushan Lakhe
39.99 incl. VAT
  • Download available immediately

Re-architect relational applications to NoSQL, integrate relational database management systems with the Hadoop ecosystem, and transform and migrate relational data to and from Hadoop components. This book covers the best-practice design approaches to re-architecting your relational applications and transforming your relational data to optimize concurrency, security, denormalization, and performance.

Bhushan Lakhe, winner of IBM's 2012 Gerstner Award for his implementation of big data and data warehouse initiatives and author of Practical Hadoop Security, walks you through the entire transition process. First, he lays out the criteria for deciding what blend of re-architecting, migration, and integration between RDBMS and HDFS best meets your transition objectives. Then he demonstrates how to design your transition model.

Lakhe proceeds to cover the selection criteria for ETL tools, the implementation steps for migration with SQOOP- and Flume-based data transfers, and transition optimization techniques for tuning partitions, scheduling aggregations, and redesigning ETL. Finally, he assesses the pros and cons of data lakes and Lambda architecture as integrative solutions and illustrates their implementation with real-world case studies.

By default, Hadoop/NoSQL solutions do not offer certain features of relational technology, such as role-based access control, locking for concurrent updates, and various tools for measuring and enhancing performance. Practical Hadoop Migration shows how to use open-source tools to emulate such relational functionality in Hadoop ecosystem components.


What You'll Learn

  • The requirements and design methodologies of relational data and NoSQL models
  • How to decide whether you should migrate your relational applications to big data technologies or integrate them
  • How to transition your relational applications to Hadoop/NoSQL platforms in terms of logical design and physical implementation
  • RDBMS-to-HDFS integration, data transformation, and optimization techniques
  • The situations in which Lambda architecture and data lake solutions should be considered
  • How to select and implement Hadoop-based components and applications to speed transition, optimize integrated performance, and emulate relational functionalities

Who This Book Is For

The primary readership for Practical Hadoop Migration is database developers, database administrators, enterprise architects, Hadoop/NoSQL developers, and IT leaders. Its secondary readership is project and program managers and advanced students of database and management information systems.


Bhushan Lakhe is Senior Vice President of Information and Data Architecture at Ipsos, a global market research company headquartered in Paris. He has more than 25 years of experience in software development life cycle management, enterprise architecture design and framework implementation, service management, data warehousing, and Hadoop ecosystem (HDFS, HBase, Hive, Pig, SQOOP, MongoDB) implementation. He has worked successively at Tata Consultancy Services, Fujitsu-ICIM, ICL, IBM, and Unisys Corporation, and as a database architecture consultant to such clients as Leo Burnett, ABN AMRO Bank, Abbott Laboratories, Motorola, JPMorgan Chase, and British Petroleum. He received IBM's 2012 Gerstner Award for his implementation of major big data and data warehouse projects. Lakhe is a Cloudera Certified Administrator for Apache Hadoop CDH4 and a Microsoft Certified Technology Specialist, SQL Server Implementation and Maintenance. He is the author of Practical Hadoop Security. He is active in the Chicago Hadoop community and speaks at technical meetups and industry conferences. Lakhe graduated from the Birla Institute of Technology and Science, Pilani.

Chapter 1: RDBMS Meets Hadoop: Integrating, Re-Architecting, and Transitioning
Part I: Relational Database Management Systems: A Review of Design Principles, Models, and Best Practices
Chapter 2: Understanding RDBMS Design Principles
Chapter 3: Using SSADM for Relational Design
Chapter 4: RDBMS Design and Implementation Tools
Part II: Hadoop: A Review of the Hadoop Ecosystem, NoSQL Design Principles and Best Practices
Chapter 5: The Hadoop Ecosystem
Chapter 6: Re-Architecting for NoSQL Design Principles, Models, and Best Practices
Part III: Integrating Relational Database Management Systems with the Hadoop Distributed File System
Chapter 7: Data Lake Integration Design Principles
Chapter 8: Implementing SQOOP and Flume-based Data Transfers
Part IV: Transitioning from Relational to NoSQL Design Models
Chapter 9: Lambda Architecture for Real-time Hadoop Applications
Chapter 10: Implementing and Optimizing the Transition
Part V: Case Study for Designing and Implementing a Hadoop-based Solution
Chapter 11: Case Study: Implementing Lambda Architecture

Publication date (per publisher) 10 August 2016
Additional info XXIV, 305 p. 99 illus., 61 illus. in color.
Place of publication Berkeley
Language English
Subject area Mathematics / Computer Science > Computer Science > Databases
Keywords Concurrency • Data Lake • data structures • denormalization • Enterprise Data Warehouse • ETL • Flume • Hadoop • HDFS • Lambda architecture • NoSQL • RDBMS • Sqoop • SSADM
ISBN-10 1-4842-1287-8 / 1484212878
ISBN-13 978-1-4842-1287-5 / 9781484212875
PDF (watermark)
Size: 12.6 MB

DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suitable for reference books with columns, tables, and figures. A PDF can be displayed on almost all devices, but is only of limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Buying eBooks from abroad
For tax law reasons, we can sell eBooks only within Germany and Switzerland. Unfortunately, we cannot fulfill eBook orders from other countries.
