Tree-Based Methods for Statistical Learning in R
Chapman & Hall/CRC (Publisher)
978-0-367-53246-8 (ISBN)
Tree-Based Methods for Statistical Learning in R provides a thorough introduction to both individual decision tree algorithms (Part I) and ensembles thereof (Part II). Part I brings several tree algorithms into focus, both conventional and contemporary. A strong foundation in how individual decision trees work will help readers understand, at a deeper level, the tree-based ensembles that lie at the cutting edge of modern statistical and machine learning methodology.
The book follows up most ideas and mathematical concepts with code-based examples in the R statistical language, with an emphasis on using as few external packages as possible. For example, readers will write their own random forest and gradient tree boosting functions using simple for loops and basic tree-fitting software (like rpart and party/partykit). The core chapters also end with a detailed section on relevant software in both R and other open source alternatives (e.g., Python, Spark, and Julia), along with example usage on real data sets. While the book mostly uses R, it is meant to be equally accessible and useful to non-R programmers.
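As a rough illustration of the for-loop approach described above, here is a minimal sketch of bagged regression trees built on rpart. It is not taken from the book: the simulated data, the variable names, and the tuning values are all assumptions made for illustration. It also implements plain bagging rather than a full random forest, since rpart does not sample candidate features at each split.

```r
library(rpart)

set.seed(101)  # for reproducibility

# Simulated regression data (hypothetical, for illustration only)
n <- 500
dat <- data.frame(x1 = runif(n), x2 = runif(n))
dat$y <- sin(2 * pi * dat$x1) + dat$x2 + rnorm(n, sd = 0.3)
train <- dat[1:350, ]
test <- dat[351:500, ]

B <- 100  # number of bootstrap trees (an arbitrary choice)
preds <- matrix(NA_real_, nrow = nrow(test), ncol = B)

for (b in seq_len(B)) {
  # Draw a bootstrap sample of the training rows
  idx <- sample(nrow(train), replace = TRUE)
  # Grow a deep, unpruned tree on the bootstrap sample
  tree <- rpart(y ~ ., data = train[idx, ],
                control = rpart.control(cp = 0, minsplit = 10, xval = 0))
  # Store this tree's predictions for the test set
  preds[, b] <- predict(tree, newdata = test)
}

# The bagged prediction is the average over all trees
bagged <- rowMeans(preds)

# Compare against a single default rpart tree
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
single <- rpart(y ~ ., data = train)
rmse_single <- rmse(test$y, predict(single, newdata = test))
rmse_bagged <- rmse(test$y, bagged)
```

Comparing `rmse_single` with `rmse_bagged` gives a quick sense of how averaging many deep trees changes test error on this particular simulated data set.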
Readers of this book will gain a solid foundation in (and appreciation for) tree-based methods and how they can be used to solve the practical problems and challenges data scientists often face in applied work.
Features:
Thorough coverage, from the ground up, of tree-based methods (e.g., CART, conditional inference trees, bagging, boosting, and random forests).
A companion website containing additional supplementary material and the code to reproduce every example and figure in the book.
A companion R package, called treemisc, which contains several data sets and functions used throughout the book (e.g., there’s an implementation of gradient tree boosting with LAD loss that shows how to perform the line search step by updating the terminal node estimates of a fitted rpart tree).
Interesting examples that are of practical use; for example, how to construct partial dependence plots from a fitted model in Spark MLlib (using only Spark operations), or how to post-process tree ensembles via the LASSO to reduce the number of trees while maintaining, or even improving, performance.
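To give a flavor of the LAD boosting idea mentioned in the feature list, the sketch below fits each rpart tree to the sign of the current residuals and then performs the line search by overwriting each terminal node's estimate with the median residual in that node. This is an assumption-laden illustration, not the treemisc implementation: the function names `lad_boost` and `predict_lad`, the default settings, and the simulated data are all hypothetical.

```r
library(rpart)

# Gradient tree boosting with least absolute deviation (LAD) loss.
# Hypothetical helper, written for illustration only.
lad_boost <- function(X, y, ntree = 100, shrinkage = 0.1, maxdepth = 2) {
  init <- median(y)               # the constant that minimizes LAD loss
  f <- rep(init, length(y))       # current fitted values
  trees <- vector("list", ntree)
  for (m in seq_len(ntree)) {
    # The negative gradient of LAD loss is the sign of the residual
    df <- data.frame(X, pseudo_resid = sign(y - f))
    tree <- rpart(pseudo_resid ~ ., data = df,
                  control = rpart.control(maxdepth = maxdepth, cp = 0, xval = 0))
    # Line search: replace each terminal node estimate with the median
    # of the current residuals falling in that node. tree$where holds,
    # for each training row, the row of tree$frame for its leaf.
    node_med <- tapply(y - f, tree$where, median)
    tree$frame$yval[as.integer(names(node_med))] <- as.numeric(node_med)
    # Take a shrunken step using the updated node estimates
    f <- f + shrinkage * tree$frame$yval[tree$where]
    trees[[m]] <- tree
  }
  list(init = init, trees = trees, shrinkage = shrinkage)
}

# Sum the initial constant and the shrunken tree predictions
predict_lad <- function(fit, newdata) {
  pred <- rep(fit$init, nrow(newdata))
  for (tree in fit$trees) {
    pred <- pred + fit$shrinkage * predict(tree, newdata = newdata)
  }
  pred
}

# Simulated data with heavy-tailed noise (hypothetical example)
set.seed(102)
n <- 300
X <- data.frame(x1 = runif(n), x2 = runif(n))
y <- 3 * X$x1 + sin(2 * pi * X$x2) + 0.3 * rt(n, df = 3)

fit <- lad_boost(X, y)
mae_const <- mean(abs(y - median(y)))           # baseline: constant fit
mae_boost <- mean(abs(y - predict_lad(fit, X))) # training error after boosting
```

Because predictions for an rpart regression tree are read from `frame$yval`, overwriting the leaf values is enough to make `predict()` return the line-search estimates.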
Brandon M. Greenwell is a data scientist at 84.51° where he works on a diverse team to enable, empower, and enculturate statistical and machine learning best practices, where applicable, to help others solve real business problems. He received a B.S. in Statistics and an M.S. in Applied Statistics from Wright State University, and a Ph.D. in Applied Mathematics from the Air Force Institute of Technology. He is currently part of the Adjunct Graduate Faculty at Wright State University, an Adjunct Instructor at the University of Cincinnati, the lead developer and maintainer of several R packages available on CRAN (and off CRAN), and co-author of “Hands-On Machine Learning with R.”
1 Introduction
2 Binary recursive partitioning with CART
3 Conditional inference trees
4 The hitchhiker’s GUIDE to modern decision trees
5 Ensemble algorithms
6 Peeking inside the “black box”: post-hoc interpretability
7 Random forests
8 Gradient boosting machines
Publication date | 13 June 2022 |
---|---|
Series | Chapman & Hall/CRC Data Science Series |
Additional info | 5 tables, black and white; 115 line drawings, black and white; 1 halftone, black and white; 116 illustrations, black and white |
Language | English |
Dimensions | 156 x 234 mm |
Weight | 740 g |
Subject area | Mathematics / Computer Science ► Mathematics ► Statistics |
 | Engineering ► Electrical Engineering / Energy Technology |
 | Engineering ► Environmental Engineering / Biotechnology |
ISBN-10 | 0-367-53246-8 / 0367532468 |
ISBN-13 | 978-0-367-53246-8 / 9780367532468 |
Condition | New |