Data Science

A First Introduction with Python

Tiffany Timbers, Trevor Campbell, Melissa Lee, Joel Ostblom, Lindsey Heagy (Authors)

Book | Softcover

432 pages

2024
Chapman & Hall/CRC (Publisher)
978-1-032-57223-9 (ISBN)

Data Science: A First Introduction with Python focuses on using the Python programming language in Jupyter notebooks to perform data manipulation and cleaning, create effective visualizations, and extract insights from data using classification, regression, clustering, and inference. It emphasizes workflows that are clear, reproducible, and shareable, and includes coverage of the basics of version control. Based on educational research and active learning principles, the book uses a modern approach to Python and includes accompanying autograded Jupyter worksheets for interactive, self-directed learning. The text will leave readers well-prepared for data science projects. It is designed for learners from all disciplines with minimal prior knowledge of mathematics and programming. The authors have honed the material through years of experience teaching thousands of undergraduates at the University of British Columbia.

Key Features:

Includes autograded worksheets for interactive, self-directed learning.
Introduces readers to modern data analysis and workflow tools such as Jupyter notebooks and GitHub, and covers cutting-edge data analysis and manipulation Python libraries such as pandas, scikit-learn, and altair.
Is designed for a broad audience of learners from all backgrounds and disciplines.

Tiffany Timbers is an Associate Professor of Teaching in the Department of Statistics and Co-Director for the Master of Data Science program (Vancouver Option) at the University of British Columbia. In these roles, she teaches and develops curriculum around the responsible application of data science to solve real-world problems. One of her favourite courses to teach is a graduate course on collaborative software development, which focuses on how to create R and Python packages using modern tools and workflows.
Trevor Campbell is an Associate Professor in the Department of Statistics at the University of British Columbia. His research focuses on automated, scalable Bayesian inference algorithms, Bayesian nonparametrics, streaming data, and Bayesian theory. He was previously a postdoctoral associate in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and Institute for Data, Systems, and Society (IDSS) at MIT and a Ph.D. candidate in the Laboratory for Information and Decision Systems (LIDS) at MIT.
Melissa Lee is an Assistant Professor of Teaching in the Department of Statistics at the University of British Columbia. She teaches and develops curriculum for undergraduate statistics and data science courses. Her work focuses on student-centered approaches to teaching, developing and assessing open educational resources, and promoting equity, diversity, and inclusion initiatives.
Joel Ostblom is an Assistant Professor of Teaching in the Statistics Department at the University of British Columbia. He teaches and develops data science courses at the graduate and undergraduate levels, with a focus on data visualization, data science ethics, and machine learning. Joel cares deeply about spreading data literacy and excitement about programmatic data analysis, which is reflected in his contributions to open source projects and openly accessible data science learning resources.
Lindsey Heagy is an Assistant Professor in the Department of Earth, Ocean and Atmospheric Sciences and Director of the Geophysical Inversion Facility at UBC. Her research combines computational methods in numerical simulations, inversions, and machine learning to characterize the subsurface using geophysical data. Primary applications of interest include mineral exploration, carbon sequestration, groundwater, and environmental studies.

Preface
Foreword
Acknowledgments
1. Python and Pandas
2. Reading in data locally and from the web
3. Cleaning and wrangling data
4. Effective data visualization
5. Classification I: training & predicting
6. Classification II: evaluation & tuning
7. Regression I: K-nearest neighbors
8. Regression II: linear regression
9. Clustering
10. Statistical inference
11. Combining code and text with Jupyter
12. Collaboration with version control
13. Setting up your computer
Bibliography
Index

Publication date	26 July 2024
Series	Chapman & Hall/CRC Data Science Series
Additional info	9 Tables, black and white; 144 Line drawings, color; 3 Line drawings, black and white; 80 Halftones, color; 224 Illustrations, color; 3 Illustrations, black and white
Language	English
Dimensions	178 x 254 mm
Weight	834 g
Subject areas	Mathematics / Computer Science ► Computer Science ► Databases
	Mathematics / Computer Science ► Computer Science ► Theory / Study
	Engineering ► Environmental Engineering / Biotechnology
ISBN-10	1-032-57223-X / 103257223X
ISBN-13	978-1-032-57223-9 / 9781032572239
Condition	New