Data Science at Scale with Python and Dask
2019
Manning Publications (publisher)
978-1-61729-560-7 (ISBN)
Large datasets tend to be distributed, non-uniform, and prone to change. Dask simplifies the process of ingesting, filtering, and transforming data, reducing or eliminating the need for a heavyweight framework like Spark.
Data Science at Scale with Python and Dask teaches readers how to build distributed data projects that can handle huge amounts of data. The book introduces Dask DataFrames and teaches helpful code patterns to streamline the reader’s analysis.
Key Features
Working with large structured datasets
Writing DataFrames
Cleaning and visualizing DataFrames
Machine learning with Dask-ML
Working with Bags and Arrays
Written for data engineers and scientists with experience using Python. Knowledge of the PyData stack (Pandas, NumPy, and Scikit-learn) will be helpful. No experience with low-level parallelism is required.
About the technology
Dask is a self-contained, easily extensible library designed to query, stream, filter, and consolidate huge datasets.
Jesse Daniel has five years of experience writing applications in Python, including three years working with the PyData stack (Pandas, NumPy, SciPy, Scikit-learn). Jesse joined the faculty of the University of Denver in 2016 as an adjunct professor of business information and analytics, where he currently teaches a Python for Data Science course.
Publication date | 8 November 2018 |
---|---|
Place of publication | New York |
Language | English |
Dimensions | 186 x 235 mm |
Weight | 540 g |
Subject area | Mathematics / Computer Science ► Computer Science ► Databases |
ISBN-10 | 1-61729-560-4 / 1617295604 |
ISBN-13 | 978-1-61729-560-7 / 9781617295607 |
Condition | New |