Machine Learning Using R - Karthik Ramasubramanian, Abhishek Singh

Machine Learning Using R (eBook)

With Time Series and Industry-Based Use Cases in R

Karthik Ramasubramanian, Abhishek Singh (Authors)

eBook Download: PDF

2018 | 2nd ed.
XXIV, 700 pages
Apress (Publisher)
978-1-4842-4215-5 (ISBN)

As in the first edition, the authors maintain a fine balance between the theory and application of machine learning through various real-world use cases, giving you a comprehensive collection of machine learning topics. New chapters in this edition cover time series models and deep learning.

What You'll Learn

Understand machine learning algorithms using R
Master the process of building machine-learning models
Cover the theoretical foundations of machine-learning algorithms
See industry-focused real-world use cases
Tackle time series modeling in R
Apply deep learning using Keras and TensorFlow in R

Who This Book is For

Data scientists, data science professionals, and researchers in academia who want to understand the nuances of machine-learning approaches/algorithms in practice using R.

Karthik Ramasubramanian has over seven years' experience leading data science and business analytics in retail, FMCG, e-commerce, information technology and hospitality for multi-national companies and unicorn startups. He is a researcher and problem solver with diverse experience across the data science life cycle, from data problem discovery to creating data science PoCs and products for various industry use cases. In his leadership roles, he has been instrumental in solving many ROI-driven business problems through data science solutions. He has mentored and trained hundreds of professionals and students around the world through various online platforms and university engagement programs in data science.

He has designed, developed and spearheaded many A/B experiment frameworks for improving product features, conceptualized funnel analysis for understanding user interactions and identifying the friction points within a product, and designed statistically robust metrics. On the predictive side, he has developed intelligent chatbots based on deep learning models that understand human-like interactions, as well as customer segmentation models, recommendation systems and many natural language processing models.

His current areas of interest include ROI-driven data product development, advanced machine learning algorithms, data product frameworks, Internet of Things (IoT), scalable data platforms, and model deployment frameworks.

Karthik completed his M.Sc. (Theoretical Computer Science) at PSG College of Technology, Coimbatore (affiliated with Anna University, Chennai), where he pioneered the application of machine learning, data mining and fuzzy logic in his research work on computer and network security.

Abhishek Singh is on a mission to profess the de facto language of this millennium: numbers. He is on a journey to bring machines closer to humans, for a better and more beautiful world around us, by generating opportunities with artificial intelligence and machine learning. He leads a team of data science professionals who are solving pressing problems in food security, cyber security, natural disasters, healthcare and many more areas, all with the help of data and technology. Abhishek is in the process of bringing smart IoT devices to smaller cities in India so that people can leverage technology to improve their lives.

He has worked with colleagues from many parts of the USA, Europe and Asia, and strives to work with more people from various backgrounds. In a span of six years at big corporates, he has stress tested the assets of US banks, solved insurance pricing models, and made the telecom experience easier for customers, and is now creating data science opportunities with his team of young minds.

He actively participates in analytics-related thought leadership, writing, public speaking, meet-ups and training in data science. He is a staunch supporter of the responsible and fair use of AI to remove biases for a better society.

Abhishek completed his MBA at IIM Bangalore, his B.Tech. (Mathematics and Computing) at IIT Guwahati, and a PG Diploma (Cyber Law) at NALSAR University, Hyderabad.

Examine the latest technological advancements in building a scalable machine-learning model with big data using R. This second edition shows you how to work with a machine-learning algorithm and use it to build an ML model from raw data. You will see how to use R programming with TensorFlow, thus avoiding the effort of learning Python if you are only comfortable with R.

Chapter 1: Introduction to Machine Learning
Chapter Goal: This chapter walks through the what, why, where and how questions commonly asked by beginners in machine learning. The answers set the momentum and direction for the chapters that follow.
No. of pages: 25
Sub-Topics:
1. What does a Machine really learn?
2. Why is Machine Learning so popular?
3. Where do we use Machine Learning?
4. How is Machine Learning changing our way of life?
5. Machine Learning Tools and Software
6. Machine Learning using R

Chapter 2: Data Exploration and Preparation
Chapter Goal: The basis for building a good machine learning model is a clear understanding of the data and thorough preparation of it. This chapter explains ways to explore the data and how to deal with the inconsistencies present in it.
No. of pages: 50
Sub-Topics:
1. Various Data Formats
2. Summary Statistics
3. Missing Values
4. Data Imputation
5. Transforming Unstructured Data

Chapter 3: Sampling and Resampling Techniques
Chapter Goal: In many real-world datasets, the biggest challenge is the sheer volume of the data. This volume makes computational limitations more evident when building machine learning models. To reduce the need for computational power without compromising the efficacy of the model, this chapter explains some sampling techniques for selecting a smaller dataset from the bigger dataset. We will also explore the idea of resampling, which can improve the accuracy of many machine learning models.
No. of pages: 50
Sub-Topics:
1. Simple Random Sampling
2. Systematic Sampling
3. Stratified Sampling
4. Cluster Sampling
5. Bootstrap Sampling

Chapter 4: Visualization of Data
Chapter Goal: Visualization is a powerful tool for seeing things in our data that might not be evident from manual exploration. This chapter explains some commonly used plots and diagrams for drawing visual insights from our data.
No. of pages: 50
Sub-Topics:
1. Scatterplot, Histogram and Box Plot
2. Heat Maps and Waterfall Charts
3. Dendrogram for Clustering
4. Bubble Chart and Word Cloud
5. Sankey Diagrams
6. Time Series Graphs
7. Cohort Diagram

Chapter 5: Feature Engineering
Chapter Goal: One more challenge in real-world datasets is the number of features they contain. There might be hundreds of features in a dataset, but not all of them are useful for building our model. To select the features that explain our dataset better than the others, and hence give a more accurate result, we have certain well-proven techniques derived from statistics. Feature engineering has now become an unavoidable step in our machine learning model building process.
No. of pages: 40
Sub-Topics:
1. Feature Ranking
2. Variable Subset Selection
3. Dimensionality Reduction

Chapter 6: Machine Learning Models: Theory and Practice
Chapter Goal: This chapter is the core of this book. Once we have a fair understanding of our data and have performed the feature engineering, it is time to build some really powerful machine learning models. This chapter lists all the ML algorithms under one header. A clear demarcation is drawn to explain how these ML algorithms differ from each other and which algorithm suits a given use case.
No. of pages: 150
Sub-Topics:
1. Linear, Logistic and Polynomial Regression Models
2. Decision Tree
3. Clustering Algorithms
4. Text Mining Approaches
5. Neural Networks
6. Support Vector Machine
7. Association Rule Mining
8. Deep Learning
9. Online Machine Learning Algorithm

Chapter 7: Machine Learning Model Evaluation
Chapter Goal: Our job doesn't end with building a machine learning model; it extends to evaluating the model's efficacy. A model is considered the best only when it crosses the benchmark accuracy and performs better than the existing models. The significance of evaluating the model increases even more when we want a common ground for comparing many different models coming out of a research and experimental project.
No. of pages: 45
Sub-Topics:
1. k-fold Cross Validation
2. Bootstrap Sampling
3. ROC Curve
4. Accuracy, Precision and Recall
5. Sensitivity and Specificity

Chapter 8: Model Performance Improvement
Chapter Goal: Once we have performed our evaluations, it's time to think about how to further improve the model accuracy. Experience shows that, in many cases, we get a significant improvement in accuracy over our base models when we apply methods like boosting and ensemble models. This chapter discusses these methods in detail.
No. of pages: 60
Sub-Topics:
1. Parameter Tuning
2. Ensemble-based ML Model
3. Bagging Technique
4. Boosting Methods

Chapter 9: Time Series Modelling
Chapter Goal: So far, we have explored the entire ML process flow in good depth, along with numerous algorithms and approaches. However, in order to place this book at a unique fusion of contemporary and legacy techniques from statistics, machine learning and computer science, this chapter touches upon a powerful statistical modeling technique called time series. It has applications in demand-supply planning, stock-market prediction, weather forecasting and many other places where one can establish the dependency of a variable on time. Time series models identify the trend, seasonality and random component in the variable of interest, thus capturing the pattern emerging from past data to inform decisions for the future.
No. of pages: 40
Sub-Topics:
1. White noise, autoregressive (AR) models, moving average (MA) models, ARMA models
2. Stationarity, differencing, detrending, seasonality
3. Dickey-Fuller test for stationarity
4. Autocorrelation function (ACF) and partial autocorrelation function (PACF)
5. Box-Jenkins methodology for selecting an ARIMA model

Chapter 10: Scalable Machine Learning and Related Technology
Chapter Goal: In this chapter, we discuss some of the contemporary technologies and architectures used for building scalable machine learning models. The chapter emphasizes how machine learning algorithms are undergoing the changes required to accommodate the new Big Data age, and how new domains like data science are gaining popularity using just the classic ML algorithms.
No. of pages: 80
Sub-Topics:
1. Introduction to MapReduce Architecture
2. Understanding basics of Apache Hadoop, Hive and Pig
3. Integrating Apache Hadoop and R
4. Parallel Processing using R
5. Machine Learning using Apache Spark and its tools

Chapter 11: Introduction to Deep Learning Models using Keras and TensorFlow
Chapter Goal: Certain problems that were thought to be highly complex and computationally infeasible to solve with either sophisticated heuristics or traditional machine learning algorithms are now becoming solvable using deep learning (DL) algorithms. Although DL as a subject has its roots in the neural network models of machine learning, its architecture tries to mimic the way the human brain works. Tasks that we humans do quite effortlessly, like driving a car, processing speech and differentiating apples from oranges, require an enormous amount of cognitive ability that we never realize. DL algorithms are now getting better at performing such tasks more efficiently than humans.
No. of pages: 50
Sub-Topics:
1. Using Keras and TensorFlow with R
2. Overview of RNN, CNN and LSTM networks
3. Question Answering using Memory Network
4. Text and Image Processing using Keras

Publication date (per publisher)	12.12.2018
Additional info	XXIV, 700 p. 233 illus., 24 illus. in color.
Place of publication	Berkeley
Language	English
Subject areas	Mathematics / Computer Science ► Computer Science ► Programming Languages / Tools
	Mathematics / Computer Science ► Computer Science ► Software Development
	Computer Science ► Theory / Studies ► Artificial Intelligence / Robotics
Keywords	Data Exploration • Data Visualization • feature engineering • machine learning • Machine Learning Models • R Programming • Sampling Techniques • scalable machine learning • source code
ISBN-10	1-4842-4215-7 / 1484242157
ISBN-13	978-1-4842-4215-5 / 9781484242155

PDF (Watermark)
Size: 18.3 MB

DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suited to technical books with columns, tables and figures. A PDF can be displayed on almost any device, but is only of limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Print edition

Book | Softcover

69,54 €