Beginning Apache Pig - Balaswamy Vaddeman

Beginning Apache Pig (eBook)

Big Data Processing Made Easy
eBook Download: PDF
2016 | First Edition
XXIII, 274 pages
Apress (publisher)
978-1-4842-2337-6 (ISBN)
36.99 incl. VAT
  • Download available immediately
Learn to use Apache Pig to develop lightweight big data applications easily and quickly. This book shows you many optimization techniques and covers every context where Pig is used in big data analytics. Beginning Apache Pig shows you how Pig is easy to learn and requires relatively little time to develop big data applications.
The book is divided into four parts: the complete features of Apache Pig; integration with other tools; how to solve complex business problems; and optimization of tools.
You'll discover topics such as MapReduce and why it cannot meet every business need; the features of Pig Latin such as data types for each load, store, joins, groups, and ordering; how Pig workflows can be created; submitting Pig jobs using Hue; and working with Oozie. You'll also see how to extend the framework by writing UDFs and custom load, store, and filter functions. Finally you'll cover different optimization techniques such as gathering statistics about a Pig script, joining strategies, parallelism, and the role of data formats in good performance.

What You Will Learn
• Use all the features of Apache Pig
• Integrate Apache Pig with other tools
• Extend Apache Pig
• Optimize Pig Latin code
• Solve different use cases for Pig Latin

Who This Book Is For
All levels of IT professionals: architects, big data enthusiasts, engineers, developers, and big data administrators


Balaswamy Vaddeman: thinker, blogger, and serious, self-motivated big data evangelist with 9 years of experience in IT and 4 years in the big data space. My big data experience covers multiple areas: delivery of analytical applications, product development, consulting, training, book reviews, hackathons, and mentoring and helping people on forums. I have proved myself delivering analytical applications in the retail, banking, and finance domains across three aspects (development, administration, and architecture) of Hadoop-related technologies. At a startup company, I developed a Hadoop-based product that was used to deliver analytical applications without writing code.
In 2013 I won the Hadoop Hackathon event for Hyderabad conducted by Cloudwick Technologies. As a top contributor at stackoverflow.com, I have helped many people with big data on websites such as stackoverflow.com and quora.com. With so much passion for big data, I went on to become an independent trainer and consultant, training hundreds of people and setting up big data teams in a couple of companies.



Chapter 1 - Introduction
Chapter 2 - Data types
Chapter 3 - Grunt
Chapter 4 - Introduction to Pig Latin
Chapter 5 - Joins and Functions
Chapter 6 - Pig Latin using Oozie
Chapter 7 - Introduction to HCatalog
Chapter 8 - Submitting Pig jobs using Hue
Chapter 9 - Role of Pig in Apache Falcon
Chapter 10 - Macros
Chapter 11 - User defined Functions
Chapter 12 - Writing your own eval and Filter Functions
Chapter 13 - Writing your own Load and Store Functions
Chapter 14 - Know Your Pig Latin scripts
Chapter 15 - Data formats
Chapter 16 - Optimization
Chapter 17 - Other Hadoop tools
Appendix A - Builtin Functions
Appendix B - Apache Pig in Apache Ambari
Appendix C - HBaseStorage and OrcStorage options

Publication date (per publisher) 10 December 2016
Additional info XXIII, 274 p. 69 illus., 35 illus. in color.
Place of publication Berkeley
Language English
Subject area Computer Science / Databases / Data Warehouse / Data Mining
Keywords Apache Falcon • Apache Pig • Grunt • Hadoop • HCatalog • HCatalogue • Hue • Macros • Pig Jobs • Pig Latin
ISBN-10 1-4842-2337-3 / 1484223373
ISBN-13 978-1-4842-2337-6 / 9781484223376
PDF (watermark)
Size: 5.1 MB

DRM: Digital watermark
This eBook contains a digital watermark and is therefore personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suited to technical books with columns, tables, and figures. A PDF can be displayed on almost all devices, but is only of limited use on small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Buying eBooks from abroad
For tax law reasons we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
