Data Science Solutions with Python - Tshepo Chris Nokeri

Data Science Solutions with Python (eBook)

Fast and Scalable Models Using Keras, PySpark MLlib, H2O, XGBoost, and Scikit-Learn

Tshepo Chris Nokeri (Autor)

eBook Download: PDF

2021 | 1st ed.
XVI, 119 Seiten
Apress (Verlag)
978-1-4842-7762-1 (ISBN)

The book covers an in-memory, distributed cluster computing framework known as PySpark, machine learning framework platforms known as scikit-learn, PySpark MLlib, H2O, and XGBoost, and a deep learning (DL) framework known as Keras.

The book starts off presenting supervised and unsupervised ML and DL models, and then it examines big data frameworks along with ML and DL frameworks. Author Tshepo Chris Nokeri considers a parametric model known as the Generalized Linear Model and a survival regression model known as the Cox Proportional Hazards model along with Accelerated Failure Time (AFT). Also presented is a binary classification model (logistic regression) and an ensemble model (Gradient Boosted Trees). The book introduces DL and an artificial neural network known as the Multilayer Perceptron (MLP) classifier. A way of performing cluster analysis using the K-Means model is covered. Dimension reduction techniques such as Principal Components Analysis and Linear Discriminant Analysis are explored. And automated machine learning is unpacked.

This book is for intermediate-level data scientists and machine learning engineers who want to learn how to apply key big data frameworks and ML and DL frameworks. You will need prior knowledge of the basics of statistics, Python programming, probability theories, and predictive analytics.

What You Will Learn

Understand widespread supervised and unsupervised learning, including key dimension reduction techniques
Know the big data analytics layers such as data visualization, advanced statistics, predictive analytics, machine learning, and deep learning
Integrate big data frameworks with a hybrid of machine learning frameworks and deep learning frameworks
Design, build, test, and validate skilled machine models and deep learning models
Optimize model performance using data transformation, regularization, outlier remedying, hyperparameter optimization, and data split ratio alteration

Who This Book Is For

Data scientists and machine learning engineers with basic knowledge and understanding of Python programming, probability theories, and predictive analytics

Tshepo Chris Nokeri harnesses advanced analytics and artificial intelligence to foster innovation and optimize business performance. In his functional work, he has delivered complex solutions to companies in the mining, petroleum, and manufacturing industries. He initially completed a bachelor's degree in information management. Afterward, he graduated with an Honours degree in business science at the University of the Witwatersrand on a TATA Prestigious Scholarship and a Wits Postgraduate Merit Award. They unanimously awarded him the Oxford University Press Prize.

Apply supervised and unsupervised learning to solve practical and real-world big data problems. This book teaches you how to engineer features, optimize hyperparameters, train and test models, develop pipelines, and automate the machine learning (ML) process. The book covers an in-memory, distributed cluster computing framework known as PySpark, machine learning framework platforms known as scikit-learn, PySpark MLlib, H2O, and XGBoost, and a deep learning (DL) framework known as Keras. The book starts off presenting supervised and unsupervised ML and DL models, and then it examines big data frameworks along with ML and DL frameworks. Author Tshepo Chris Nokeri considers a parametric model known as the Generalized Linear Model and a survival regression model known as the Cox Proportional Hazards model along with Accelerated Failure Time (AFT). Also presented is a binary classification model (logistic regression) and an ensemble model (Gradient Boosted Trees). The book introduces DL and an artificial neural network known as the Multilayer Perceptron (MLP) classifier. A way of performing cluster analysis using the K-Means model is covered. Dimension reduction techniques such as Principal Components Analysis and Linear Discriminant Analysis are explored. And automated machine learning is unpacked.This book is for intermediate-level data scientists and machine learning engineers who want to learn how to apply key big data frameworks and ML and DL frameworks. You will need prior knowledge of the basics of statistics, Python programming, probability theories, and predictive analytics. What You Will LearnUnderstand widespread supervised and unsupervised learning, including key dimension reduction techniquesKnow the big data analytics layers such as data visualization, advanced statistics, predictive analytics, machine learning, and deep learningIntegrate big data frameworks with a hybrid of machine learning frameworks and deep learning frameworksDesign, build, test, and validate skilled machine models and deep learning modelsOptimize model performance using data transformation, regularization, outlier remedying, hyperparameter optimization, and data split ratio alteration Who This Book Is ForData scientists and machine learning engineers with basic knowledge and understanding of Python programming, probability theories, and predictive analytics

Erscheint lt. Verlag	25.10.2021
Zusatzinfo	XVI, 119 p. 35 illus.
Sprache	englisch
Themenwelt	Mathematik / Informatik ► Informatik ► Datenbanken
	Mathematik / Informatik ► Informatik ► Programmiersprachen / -werkzeuge
	Informatik ► Theorie / Studium ► Künstliche Intelligenz / Robotik
	Mathematik / Informatik ► Mathematik ► Statistik
	Mathematik / Informatik ► Mathematik ► Wahrscheinlichkeit / Kombinatorik
Schlagworte	Big Data Analytics • Deep learning • H2O • Keras • machine learning • MLib • PySpark • Python • Python Frameworks • scikit-learn • xgboost
ISBN-10	1-4842-7762-7 / 1484277627
ISBN-13	978-1-4842-7762-1 / 9781484277621

Informationen gemäß Produktsicherheitsverordnung (GPSR)
Haben Sie eine Frage zum Produkt?

PDF (Wasserzeichen)
Größe: 4,1 MB

DRM: Digitales Wasserzeichen
Dieses eBook enthält ein digitales Wasserzeichen und ist damit für Sie personalisiert. Bei einer missbräuchlichen Weitergabe des eBooks an Dritte ist eine Rückverfolgung an die Quelle möglich.

Dateiformat: PDF (Portable Document Format)
Mit einem festen Seitenlayout eignet sich die PDF besonders für Fachbücher mit Spalten, Tabellen und Abbildungen. Eine PDF kann auf fast allen Geräten angezeigt werden, ist aber für kleine Displays (Smartphone, eReader) nur eingeschränkt geeignet.

Systemvoraussetzungen:
PC/Mac: Mit einem PC oder Mac können Sie dieses eBook lesen. Sie benötigen dafür einen PDF-Viewer - z.B. den Adobe Reader oder Adobe Digital Editions.
eReader: Dieses eBook kann mit (fast) allen eBook-Readern gelesen werden. Mit dem amazon-Kindle ist es aber nicht kompatibel.
Smartphone/Tablet: Egal ob Apple oder Android, dieses eBook können Sie lesen. Sie benötigen dafür einen PDF-Viewer - z.B. die kostenlose Adobe Digital Editions-App.

Buying eBooks from abroad
For tax law reasons we can sell eBooks just within Germany and Switzerland. Regrettably we cannot fulfill eBook-orders from other countries.

Print-Ausgabe

Buch | Softcover

35,30 €