Modern Data Science with R

Benjamin S. Baumer, Daniel T. Kaplan, Nicholas J. Horton (Authors)

Book | Hardcover

632 pages

2021 | 2nd edition
Chapman & Hall/CRC (Publisher)
978-0-367-19149-8 (ISBN)

This textbook is designed for an undergraduate course in data science that emphasizes topics in both statistics and computer science.

From a review of the first edition: "Modern Data Science with R… is rich with examples and is guided by a strong narrative voice. What’s more, it presents an organizing framework that makes a convincing argument that data science is a course distinct from applied statistics" (The American Statistician).

Modern Data Science with R is a comprehensive data science textbook for undergraduates that incorporates statistical and computational thinking to solve real-world data problems. Rather than focus exclusively on case studies or programming syntax, this book illustrates how statistical programming in the state-of-the-art R/RStudio computing environment can be leveraged to extract meaningful information from a variety of data in the service of addressing compelling questions.

The second edition is updated to reflect the growing influence of the tidyverse set of packages. All code in the book has been revised and styled to be more readable and easier to understand. New functionality from packages like sf, purrr, tidymodels, and tidytext is now integrated into the text. All chapters have been revised, and several have been split, re-organized, or re-imagined to meet the shifting landscape of best practice.

Benjamin S. Baumer is an associate professor in the Statistical & Data Sciences program at Smith College. He has been a practicing data scientist since 2004, when he became the first full-time statistical analyst for the New York Mets. Ben is a co-author of The Sabermetric Revolution and Analyzing Baseball Data with R. He received the 2019 Waller Education Award and the 2016 Significant Contributor Award from the Society for American Baseball Research.

Daniel T. Kaplan is the DeWitt Wallace emeritus professor of mathematics and computer science at Macalester College. He is the author of several textbooks on statistical modeling and statistical computing. Danny received the 2006 Macalester Excellence in Teaching award and the 2017 CAUSE Lifetime Achievement Award.

Nicholas J. Horton is Beitzel Professor of Technology and Society (Statistics and Data Science) at Amherst College. He is a Fellow of the ASA and the AAAS, co-chair of the National Academies Committee on Applied and Theoretical Statistics, recipient of a number of national teaching awards, author of a series of books on statistical computing, and actively involved in data science curriculum efforts to help students "think with data".

Part I: Introduction to Data Science
1. Prologue: Why data science?
2. Data visualization
3. A grammar for graphics
4. Data wrangling on one table
5. Data wrangling on multiple tables
6. Tidy data
7. Iteration
8. Data science ethics

Part II: Statistics and Modeling
9. Statistical foundations
10. Predictive modeling
11. Supervised learning
12. Unsupervised learning
13. Simulation

Part III: Topics in Data Science
14. Dynamic and customized data graphics
15. Database querying using SQL
16. Database administration
17. Working with spatial data
18. Geospatial computations
19. Text as data
20. Network science

Part IV: Appendices

Publication date	16 March 2021
Series	Chapman & Hall/CRC Texts in Statistical Science
Language	English
Dimensions	178 x 254 mm
Weight	1300 g
Subject area	Mathematics / Computer Science ► Mathematics ► Computer Programs / Computer Algebra
Subject area	Engineering ► Electrical Engineering / Power Engineering
ISBN-10	0-367-19149-0 / 0367191490
ISBN-13	978-0-367-19149-8 / 9780367191498
Condition	New