Azure Databricks Cookbook

Accelerate cloud data analytics, data governance, Microsoft Fabric, and OpenAI with Azure Databricks

M. Lynne Alanfield, Jeremy Peach, Phani Raj, Vinod Jaiswal (Authors)

Book | Softcover

2024 | 2nd Revised edition
Packt Publishing Limited (Publisher)
978-1-80512-382-8 (ISBN)

Simplify data analytics in Azure for building end-to-end big data solutions, leveraging AI, and working with large datasets in a highly available cloud environment

Key Features

Boost Azure Databricks by building and optimizing Compute Clusters for Effective Solutions
Elevate Data Solutions by leveraging OpenAI and Microsoft Fabric Integration in Azure Databricks
Secure Azure Operations with High Availability, Disaster Recovery, and Unity Catalog Governance

Book Description

The second edition of Azure Databricks Cookbook offers hands-on recipes for ingesting and governing data, building modern data warehouses, and creating innovative AI solutions using Azure Databricks.
Starting with creating an Azure Databricks instance, you'll explore clusters and ingest data from various sources such as files, databases, and streaming platforms like Apache Kafka and Azure Event Hubs. You'll learn to load data into the Azure Databricks Lakehouse and cover end-to-end data pipelines, utilizing Delta tables and Azure Synapse Analytics to build a modern data warehouse. Enhance your knowledge with OpenAI and Microsoft Fabric integration for superior data solutions. Visualize insights and create dashboards with Databricks SQL, then deploy and productionize data pipelines using CI/CD for Azure Databricks notebooks. The book guides you through setting up Unity Catalog and configuring the metastore, catalogs, databases, and tables. It emphasizes ensuring operations continuity with high availability and disaster recovery planning. Finally, you'll explore modernizing workloads with AI and cost-efficient administration. By the end of this book, you will have unlocked transformational insights from data, mastered predictive modeling techniques, and understood development operations best practices to optimize your data solutions.

What you will learn

Build a modern data warehouse with Delta Tables and Azure Synapse Analytics
Create real-time dashboards in Databricks SQL
Implement data governance with Unity Catalog
Build an end-to-end data processing pipeline for near real-time data analytics
Integrate Azure DevOps for version control, deploying, and productionizing solutions with CI/CD pipelines
Ensure operations continuity with High Availability and Disaster Recovery planning
Enhance Azure Databricks with OpenAI and Microsoft Fabric integration for cutting-edge data solutions

Who this book is for

This recipe-based book is for data engineers, data scientists, big data professionals, and machine learning engineers who want to perform data analytics on their massive data sets. Prior experience with Apache Spark and Microsoft Azure is necessary to get the most out of this book.

M. Lynne Alanfield is an experienced and dynamic Principal Cloud Solution Architect. She works with Microsoft's most strategic global customers to discover, migrate, and plan business transformation through Microsoft's Azure cloud. She is an award-winning subject matter expert in Azure Data Cloud and Artificial Intelligence and a recognized Global Business Intelligence Lead within Microsoft. Melissa is renowned for delivering cutting-edge services for large, global enterprise customers. With a PhD in Cybersecurity and master's and bachelor's degrees in MIS, she has an impressive academic background that she supplements with strong technical writing skills and best-selling author status in non-technical genres.
Jeremy Peach is a data scientist with twenty years of experience turning data into insights. As a Senior Cloud Solution Architect at Microsoft, he specializes in Azure Databricks and helps large enterprises around the world solve their toughest analytical challenges using Azure and Apache Spark. He has written several articles on harnessing the power of the cloud to create data science solutions and scaling up analysis of big data.
Phani Raj is an experienced data architect and product manager with 15 years of experience working with customers on building data platforms both on-premises and in the cloud. He has designed and implemented large-scale big data solutions for customers in different verticals. His passion for continuous learning and adapting to the dynamic nature of technology underscores his role as a trusted advisor in the realm of data architecture, data science, and product management.
Vinod Jaiswal, an experienced data engineer, excels in transforming raw data into valuable insights. With over 8 years in Databricks, he designs and implements data pipelines, optimizes workflows, and crafts scalable solutions for intricate data challenges. Collaborating seamlessly with diverse teams, Vinod empowers them with tools and expertise to leverage data effectively. His dedication to staying updated on the latest data engineering trends ensures cutting-edge, robust solutions. Apart from technical prowess, Vinod is a proficient educator. Through presentations and mentoring, he shares his expertise, enabling others to harness the power of data within the Databricks ecosystem.

Table of Contents

Creating an Azure Databricks Workspace
Reading and Writing Data from and to Various Azure Services and File Formats
Reading and Loading Data in the Azure Databricks Lakehouse
Understanding Spark Query Execution
Exploring Delta Lake in Azure Databricks
Working with Streaming Data
Integration with Azure Key Vault, App Configuration, and Log Analytics
Implementing Near-Real-Time Analytics and Building a Modern Data Warehouse
Azure Databricks SQL Analytics
DevOps Integrations and Implementing CI/CD for Azure Databricks
Governing Your Data Estate with Unity Catalog - Setup
Governing Your Data Estate with Unity Catalog - Exploration and Management
Understanding Security and Monitoring in Azure Databricks
Ensuring Operations Continuity with High Availability and Disaster Recovery Planning
Microsoft Fabric & Databricks
Prompt Engineering

Publication date	14 October 2023
Place of publication	Birmingham
Language	English
Dimensions	191 x 235 mm
Subject area	Computer Science ► Databases ► Data Warehouse / Data Mining
	Mathematics / Computer Science ► Computer Science ► Networks
	Computer Science ► Software Development ► User Interfaces (HCI)
	Mathematics / Computer Science ► Computer Science ► Theory / Studies
ISBN-10	1-80512-382-3 / 1805123823
ISBN-13	978-1-80512-382-8 / 9781805123828
Condition	New