Practical Implementation of a Data Lake - Nayanjyoti Paul

Practical Implementation of a Data Lake (eBook)

Translating Customer Expectations into Tangible Technical Goals

Nayanjyoti Paul (Author)

eBook Download: PDF
2023 | First Edition
XX, 202 pages
Apress (Publisher)
978-1-4842-9735-3 (ISBN)
26.99 incl. VAT
  • Download available immediately

This book explains how to implement a data lake strategy, covering the technical and business challenges architects commonly face. It also illustrates how and why client requirements should drive architectural decisions.

Drawing upon a specific case from his own experience, author Nayanjyoti Paul begins with the consideration from which all subsequent decisions should flow: what does your customer need? He also describes the importance of identifying key stakeholders and the key points to focus on when starting a new project. Next, he takes you through the business and technical requirement-gathering process and shows how to translate customer expectations into tangible technical goals. From there, you'll gain insight into the security model that will allow you to establish security and legal guardrails, as well as different aspects of security from the end user's perspective. You'll learn which organizational roles need to be onboarded into the data lake, their responsibilities, the services they need access to, and how the hierarchy of escalations should work. Subsequent chapters explore how to divide your data lake into zones, organize data for security and access, manage data sensitivity, and apply data obfuscation techniques. Audit and logging capabilities in the data lake are also covered before a deep dive into designing data lakes to handle multiple kinds of file formats and access patterns. The book concludes by focusing on production operationalization and solutions for implementing a production setup.

After completing this book, you will understand how to implement a data lake and which best practices to employ while doing so, and you will be armed with practical tips for solving business problems.

What You Will Learn

  • Understand the challenges associated with implementing a data lake
  • Explore the architectural patterns and processes used to design a new data lake
  • Design and implement data lake capabilities
  • Associate business requirements with technical deliverables to drive success

Who This Book Is For

Data Scientists and Architects, Machine Learning Engineers, and Software Engineers.



Nayanjyoti Paul is an Associate Director and Chief Azure Architect for GenAI and LLM CoE at Accenture. He is the product owner and creator of a patented asset. Presently, he leads multiple projects as a lead architect around generative AI, large language models, data analytics, and machine learning. Nayan is a certified Master Technology Architect, certified Data Scientist, and certified Databricks Champion with additional AWS and Azure certifications. He is a speaker at conferences such as Strata Conference, DataWorks Summit, and AWS re:Invent. He also delivers guest lectures at universities.


Publication date (per publisher): 3 October 2023
Additional info: XX, 202 p., 58 illus.
Language: English
Subject areas: Mathematics / Computer Science > Computer Science > Databases
Mathematics / Computer Science > Computer Science > Programming Languages / Tools
Computer Science > Theory / Studies > Algorithms
Computer Science > Theory / Studies > Artificial Intelligence / Robotics
Keywords: AWS • Cloud Computing • data engineering • Data Lake • Data Mesh • Data Quality • Data strategy • security
ISBN-10 1-4842-9735-0 / 1484297350
ISBN-13 978-1-4842-9735-3 / 9781484297353
Format: PDF (watermark)
Size: 6.5 MB

DRM: Digital watermark
This eBook contains a digital watermark and is therefore personalized to you. If the eBook is improperly passed on to third parties, it can be traced back to its source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables, and figures. A PDF can be displayed on almost any device but is of only limited use on small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Unfortunately, we cannot fulfill eBook orders from other countries.
