Databricks Certified Associate Developer for Apache Spark Using Python
Packt Publishing Limited (Publisher)
978-1-80461-978-0 (ISBN)
Key Features
Understand the fundamentals of Apache Spark to design robust and fast Spark applications
Explore various data manipulation components for each phase of your data engineering project
Prepare for the certification exam with sample questions and mock exams
Purchase of the print or Kindle book includes a free PDF eBook
Book Description
Spark has become a de facto standard for big data processing. Migrating data processing to Spark saves resources, streamlines your business focus, and modernizes workloads, creating new business opportunities through Spark’s advanced capabilities. Written by a senior solutions architect at Databricks, with experience in leading data science and data engineering teams in Fortune 500s as well as startups, this book is your exhaustive guide to achieving the Databricks Certified Associate Developer for Apache Spark certification on your first attempt.
You’ll explore the core components of Apache Spark, its architecture, and its optimization, while familiarizing yourself with the Spark DataFrame API and its components needed for data manipulation. You’ll also find out what Spark streaming is and why it’s important for modern data stacks, before learning about machine learning in Spark and its different use cases. What’s more, you’ll discover sample questions at the end of each section along with two mock exams to help you prepare for the certification exam.
By the end of this book, you’ll know what to expect in the exam and gain enough understanding of Spark and its tools to pass the exam. You’ll also be able to apply this knowledge in a real-world setting and take your skillset to the next level.
What you will learn
Create and manipulate SQL queries in Apache Spark
Build complex Spark functions using Spark's user-defined functions (UDFs)
Architect big data apps with Spark fundamentals for optimal design
Apply techniques to manipulate and optimize big data applications
Develop real-time or near-real-time applications using Spark Streaming
Work with Apache Spark for machine learning applications
Who this book is for
This book is for data professionals such as data engineers, data analysts, BI developers, and data scientists looking for a comprehensive resource to achieve Databricks Certified Associate Developer certification, as well as for individuals who want to venture into the world of big data and data engineering. Although working knowledge of Python is required, no prior knowledge of Spark is necessary. Additionally, experience with PySpark will be beneficial.
Saba Shah is a Data and AI Architect and Evangelist with a wide technical breadth and deep understanding of big data and machine learning technologies. She has experience leading data science and data engineering teams in Fortune 500s as well as startups. She started her career as a software engineer but soon transitioned to big data. She is currently a solutions architect at Databricks and works with enterprises building their data strategy and helping them create a vision for the future with machine learning and predictive analytics. Saba graduated with a degree in Computer Science and later earned an MS degree in Advanced Web Technologies. She is passionate about all things data and cricket. She currently resides in RTP, NC.
Table of Contents
Overview of Certification Guide and Exam
Understanding Apache Spark and Its Applications
Spark Architecture & Transformations
Spark DataFrames and Their Operations
Advanced Operations in Spark
SQL Queries in Spark
Structured Streaming in Spark
Machine Learning with Spark ML
Mock Test
Publication date | 11 January 2024 |
---|---|
Foreword | Rod Waltermann |
Place of publication | Birmingham |
Language | English |
Dimensions | 191 x 235 mm |
Subject area | Computer Science ► Databases ► Data Warehouse / Data Mining |
 | Mathematics / Computer Science ► Computer Science ► Theory / Studies |
 | Computer Science ► Other Topics ► Certification |
ISBN-10 | 1-80461-978-7 / 1804619787 |
ISBN-13 | 978-1-80461-978-0 / 9781804619780 |
Condition | New |